TRIUMFAtlasTier3

From ATLAS-TRIUMF

Tier-3 @ TRIUMF

For our local group's computing needs, we have a Tier-3 cluster that currently consists of:

  • 2 x 20 TB RAID-6 storage servers (2 and 4 cores, respectively)
  • 2 dual quad-core machines
  • 4 dual hexa-core machines, each with 16 TB of RAID-6 data disk

The storage server is configured with xrootd to serve the files and also acts as the head node for the PBS batch system.

The Tier-3 uses the ATLASLocalRootBase and is part of the TRIUMF ATLAS NIS cluster.

Storage Server

To log into the older storage nodes, use: ssh atlas-tier3-ds02.triumf.ca (or ds03)

There are two visible disks of ~8 TB each, /data/ds02_1 and /data/ds02_2, both configured as RAID-6. Please store your data there. So far there are no restrictions, so please use the available disk space responsibly.

The new machines atlas-tier3-c[10-13].triumf.ca also contain considerable disk space, aggregated into a /global filesystem using GlusterFS. This volume is available on ds02, ds03 and c[8-13]. Please put your data in /global/username. Currently no quotas are enforced on this volume. If the 64 TB starts to fill up, quotas can easily be enforced.

Note that the /global volume collects data from the /srv/data disks on all the c[10-13] machines. This means that a file /global/foo/bar is physically located at /srv/data/foo/bar on one of the machines. You are strongly encouraged to write to /global and not directly to /srv/data on c[10-13]. Writing to /global load-balances automatically at write time, so that files are evenly distributed among the machines. Otherwise GlusterFS has to rebalance periodically, which is time-consuming and perplexing to a user who finds that a file has moved from one machine to another.
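As a small illustration of the path mapping described above, the following sketch converts a /global path into the path the same file would have on a machine's local disk. The function name global_to_brick_path is made up for this example, and the function cannot tell you which of the c[10-13] machines actually holds the file.

```shell
#!/bin/sh
# Hypothetical helper: map a /global path to its location under /srv/data.
# Assumes the layout described in the text: /global/<x> is stored as
# /srv/data/<x> on exactly one of the c[10-13] machines.
global_to_brick_path() {
    case "$1" in
        /global/*) printf '/srv/data/%s\n' "${1#/global/}" ;;
        *) echo "not a /global path: $1" >&2; return 1 ;;
    esac
}
```

The result is only meant for locating a file, for instance by running ls on that path on each of c[10-13]. Writing should still go through /global.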

To install a client that mounts /global, you need a 64-bit SL5 machine. Then install glusterfs-core-3.2.2-1.x86_64.rpm and glusterfs-fuse-3.2.2-1.x86_64.rpm from the GlusterFS download pages. Running the following command as root creates the /global mount: mkdir /global && mount -t glusterfs atlas-tier3-c10.triumf.ca:/global /global
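Before mounting, it can be handy to check whether /global is already there. The sketch below is only an illustration: the function names and the optional mounts-table argument are assumptions made for this example, and the mount command is printed rather than executed so that nothing runs as root by accident. It uses mkdir -p instead of the plain mkdir quoted in the text so that an existing directory is not an error.

```shell
#!/bin/sh
# Succeed if the given directory appears as a mount point in a mounts table.
# The table defaults to /proc/mounts; the second argument is a hypothetical
# override that makes the check easy to try out on a sample file.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Print (do not run) the mount command from the text if /global is missing.
suggest_global_mount() {
    if is_mounted /global "$1"; then
        echo "/global is already mounted"
    else
        echo "mkdir -p /global && mount -t glusterfs atlas-tier3-c10.triumf.ca:/global /global"
    fi
}
```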

Batch System

The Tier-3 uses the PBS batch system. For now, jobs are submitted from atlas-tier3-ds02 with qsub and monitored with qstat. For qstat, the option -r lists all running jobs, -a all jobs, -q all queues, and -f jobid gives the full details of the corresponding job. Here is a nice reference for user commands etc. Currently there are three queues (short, medium and long) with the following configuration:

Queue    Memory  CPU Time  Walltime  Node  Run  Que  Lm  State
long     --      48:00:00  72:00:00  --    0    0    --  E R
medium   --      06:00:00  12:00:00  --    0    0    --  E R
short    --      00:15:00  00:30:00  --    0    0    --  E R
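To make the queue limits concrete, here is a minimal sketch that picks the smallest queue whose walltime limit covers an estimated run time in minutes. The function name pick_queue is made up for this example. It only looks at the walltime column (30 minutes, 12 hours, 72 hours) and ignores the CPU time limits, which a job must respect as well.

```shell
#!/bin/sh
# Hypothetical helper: choose a queue from an estimated walltime in minutes.
# Limits are taken from the Walltime column of the queue table:
#   short 00:30:00 = 30 min, medium 12:00:00 = 720 min, long 72:00:00 = 4320 min
pick_queue() {
    minutes=$1
    if [ "$minutes" -le 30 ]; then
        echo short
    elif [ "$minutes" -le 720 ]; then
        echo medium
    elif [ "$minutes" -le 4320 ]; then
        echo long
    else
        echo "no queue allows ${minutes} min of walltime" >&2
        return 1
    fi
}
```

The result can be passed to qsub with its -q option, for example qsub -q "$(pick_queue 90)" myjob.sh, where myjob.sh stands for your own job script.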

The 16 cores (8 job slots per machine) of the two new dual quad-core machines are currently included in the cluster, and their status can be checked with pbsnodes:

atlas-tier3-c8.triumf.ca

    state = free
    np = 8
    ntype = cluster
    status = opsys=linux,uname=Linux atlas-tier3-c8.triumf.ca 2.6.9-78.0.13.ELsmp #1 SMP Wed Jan 14 13:05:22 CST 2009 x86_64,
    sessions=? 0,nsessions=? 0,nusers=0,idletime=2500258,totmem=5945064kb,availmem=5838544kb,physmem=16430832kb,ncpus=8,
    loadave=0.00,netload=39541887,state=free,jobs=? 0,rectime=1240527393

atlas-tier3-c9.triumf.ca

    state = free
    np = 8
    ntype = cluster
    status = opsys=linux,uname=Linux atlas-tier3-c9.triumf.ca 2.6.9-78.0.13.ELsmp #1 SMP Wed Jan 14 13:05:22 CST 2009 x86_64,
    sessions=? 0,nsessions=? 0,nusers=0,idletime=142580,totmem=5945064kb,availmem=5820212kb,physmem=16430832kb,ncpus=8,
    loadave=0.00,netload=66485121,state=free,jobs=? 0,rectime=1240527354
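For a quick overview, output in the format shown above can be summarized with a few lines of awk. The sketch below adds up np for every node whose state is exactly "free". The function name count_free_slots is made up for this example, and it assumes that the state line comes before the np line for each node, as in the listing above. The number is an upper bound, since a node reported as free may already be running some jobs.

```shell
#!/bin/sh
# Hypothetical helper: read pbsnodes-style text on stdin and print the sum
# of np over all nodes whose state line is exactly "state = free".
count_free_slots() {
    awk '
        $1 == "state" && $2 == "=" { free = ($3 == "free") }
        $1 == "np"    && $2 == "=" { if (free) total += $3 }
        END { print total + 0 }
    '
}
```

On the head node this would be used as pbsnodes | count_free_slots.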

This is an example of job submission that I recently tested for parallel processing of a cosmic-ray sample using ten nodes.

Example

The submission script submit_job.sh takes three inputs: the list of input files, the first section and the last section. It calls the second script, run_job_cosmic.sh, which contains the actual submission of the individual PBS jobs.

For example, I run: ./submit_job.sh cosmic_list.txt 1 10

submit_job.sh

run_job_cosmic.sh
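The contents of the two scripts are not reproduced here. Purely as an illustration of the one-job-per-section idea, and not as the actual submit_job.sh, a submission loop could look like the sketch below. The function name, the job names, the FILE_LIST and SECTION variables, and the QSUB and QUEUE overrides are all assumptions made for this example. By default it only prints the qsub commands, so it can be tried without submitting anything.

```shell
#!/bin/sh
# Hypothetical stand-in for a per-section submission loop.
# Usage: submit_sections <file_list> <first_section> <last_section>
# QSUB defaults to a dry run that prints each command; set QSUB=qsub to
# really submit. QUEUE defaults to medium.
submit_sections() {
    list=$1; first=$2; last=$3
    [ -r "$list" ] || { echo "cannot read $list" >&2; return 1; }
    [ "$first" -le "$last" ] || { echo "first section must not exceed last" >&2; return 1; }
    section=$first
    while [ "$section" -le "$last" ]; do
        # -N names the job, -q selects the queue, and -v passes variables
        # into the environment of the job script.
        ${QSUB:-echo qsub} -N "cosmic_$section" -q "${QUEUE:-medium}" \
            -v "FILE_LIST=$list,SECTION=$section" run_job_cosmic.sh
        section=$((section + 1))
    done
}
```

Calling submit_sections cosmic_list.txt 1 10 would then print ten qsub commands, one per section.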

In the example, the data is read from the storage server disk and the output is written to the local /tmp on the worker nodes to reduce I/O. At the end of the job, the output is copied to the storage server. The /tmp disk has 100 GB, so please keep your temporary job output below ~10 GB and don't forget to clean up at the end of your job.
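The write-locally, copy-back, clean-up pattern described above can be sketched as follows. This is a generic illustration, not the code in run_job_cosmic.sh. The function name run_with_scratch and the SCRATCH_BASE override are assumptions made for this example. The scratch directory is removed whether or not the command succeeds, but not if the job is killed by a signal.

```shell
#!/bin/sh
# Hypothetical helper: run a command in a private scratch directory under
# /tmp (or $SCRATCH_BASE), copy its output to a destination directory on
# success, and remove the scratch directory afterwards.
# Usage: run_with_scratch <dest_dir> <command> [args...]
run_with_scratch() {
    dest=$1; shift
    scratch=$(mktemp -d "${SCRATCH_BASE:-/tmp}/job.XXXXXX") || return 1
    # Run the command in a subshell with the scratch directory as its
    # working directory.
    ( cd "$scratch" && "$@" )
    status=$?
    if [ "$status" -eq 0 ]; then
        # Copy everything the command wrote, then report the copy status.
        mkdir -p "$dest" && cp -r "$scratch"/. "$dest"/
        status=$?
    fi
    rm -rf "$scratch"   # clean up even when the command or the copy failed
    return "$status"
}
```

Inside a job script this could be called as run_with_scratch /data/ds02_1/username/output ./my_analysis.sh, where both the output directory and my_analysis.sh are placeholders for your own.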

If you are interested in running a VNC client, here are instructions on how to do that.
