Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, Georgia Institute of Technology

CCMST Weekly News, July 15, 2011

July 15, 2011 8:19 pm EDT

1. Statistics
2. Tip of the Week

STATISTICS

GGATE

Uptime: 27 days
/home directory usage: 33% (7.4 TB available)
/backups directory usage: 77%

Utilization for the period from 06/14/2011 to 07/14/2011

Note: Full statistics for ggate are available online at: http://ggate.chemistry.gatech.edu:8080 (the link is accessible only from the Gatech Chemistry network)

FGATE

Uptime: 40 days
/home directory usage: 60% (2.4 TB available)
/backups directory usage: 100%

LSF usage for Week 27 (7/4-7/10) (times are in minutes)
Group      Jobs   Total CPU     %   Avg CPU   Avg Wait   Avg Trnr.
Bredas      297      103393    5%       348        236        731
Hernandez     3       20786    1%      6929       4331      11262
Sherrill    104      481654   25%      4631       1224       6520
Other         1      107900    6%    107900          0      36010
Total       405      713733   37%      1762        520       2383

Note: percentages refer to the total CPU time available for the period.

Most productive user of the week: loriab (413873 CPU minutes).

EGATE

Uptime: 130 days
/theoryfs/common directory usage: 41% (396 GB available)
/theoryfs/ccmst directory usage: 95% (49 GB available)

LSF usage for Week 27 (7/4-7/10) (times are in minutes)
Group      Jobs   Total CPU     %   Avg CPU   Avg Wait   Avg Trnr.
Hernandez    98      682839   45%      6968       5825      13071
Sherrill    178      782414   52%      4396        789       5373
Total       276     1465252   97%      5309       2577       8106

Note: percentages refer to the total CPU time available for the period.

Most productive user of the week: galen (537960 CPU minutes).

TIP OF THE WEEK

By Massimo

Job Arrays on Moab/Torque

Quite often in our work we are presented with the task of submitting a large number of similar jobs, where the same type of calculation has to be repeated for a large set of different input data. There are several ways of approaching this task, ranging from the simple (but time-consuming and error-prone) solution of manually copying and editing input files and submission scripts, to the more sophisticated approach of having a script control the generation and submission of the jobs (a minimal driver script of this kind is sketched below).
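
For reference, the script-driven approach could look something like the following sketch. The file names (case_1.inp through case_10.inp) and the template submission script template.pbs, which is assumed to contain the placeholder string INPUTFILE, are hypothetical and would need to be adapted to your own setup:

#!/bin/bash
# Hypothetical driver script: generate one submission script per input file
# from a template, then submit it. Assumes case_1.inp ... case_10.inp exist
# and that template.pbs contains the placeholder string INPUTFILE.
for i in $(seq 1 10); do
    sed "s/INPUTFILE/case_${i}/g" template.pbs > job_${i}.pbs
    qsub job_${i}.pbs
done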

One other possibility is to take advantage of the job array feature of Moab/Torque, and have the scheduler itself generate the array of jobs for you. Using this feature, it is possible to have a single submission script generating an arbitrary number of similar jobs. A job array is generated using the -t option of the qsub command. For instance:

qsub -t 1-10 myjob.pbs

This will generate an array of 10 jobs based on the submission script myjob.pbs. Each job of the array is identified by the environment variable PBS_ARRAYID, which, in the above example, takes values in the range 1-10. The value of this variable can be used inside myjob.pbs to select the appropriate input for each job. For instance, suppose one has to run the same program on 10 different input files: case_1.inp, case_2.inp, ..., case_10.inp. This could be achieved with the following myjob.pbs script:

#!/bin/bash
#PBS -N job_array
#PBS -l nodes=1:ppn=8
#PBS -l pmem=1000mb
#PBS -l walltime=24:00:00
#PBS -t 1-10

# Run from the directory the job was submitted from.
cd $PBS_O_WORKDIR

myprogram < case_${PBS_ARRAYID}.inp > case_${PBS_ARRAYID}.out

This will generate 10 jobs, each requesting 8 processors, 1 GB of memory per process, and one day of wall time, and each processing one of the input files above.
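
Since the -t option is given as a #PBS directive inside the script, the whole array can be submitted with a plain qsub call. The commands below are a brief sketch of submitting and monitoring the array (the job ID 12345 is made up for illustration):

# Submit the array; the scheduler returns a single array job ID such as 12345[].
qsub myjob.pbs

# On Torque, qstat -t expands the array and lists each sub-job (12345[1], 12345[2], ...).
qstat -t

# Delete the whole array if needed; the quotes keep the shell from expanding the brackets.
qdel "12345[]"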

In a job array, it is possible to control how many jobs of the array are executed at the same time (note that this will not override the limits on the number of jobs and processors per user imposed by the scheduler):

qsub -t 21-100%5 myjob.pbs

The above command will generate 80 jobs (with IDs in the range 21 through 100), and at most 5 jobs of the array will be running at the same time. To learn more about job arrays, please consult the manual for the qsub command.
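
As a further (hypothetical) variation on the script above, PBS_ARRAYID does not have to appear directly in the input file names: it can also be used as an index into a list of arbitrary file names. The sketch below assumes a plain-text file inputs.txt with one input file name per line (ending in .inp), and reuses the placeholder program name myprogram from the example above:

#!/bin/bash
#PBS -N job_array_list
#PBS -l nodes=1:ppn=8
#PBS -l pmem=1000mb
#PBS -l walltime=24:00:00
#PBS -t 1-10

cd $PBS_O_WORKDIR

# Pick the N-th line of inputs.txt, where N is this sub-job's array index.
INPUT=$(sed -n "${PBS_ARRAYID}p" inputs.txt)

# Run the calculation, writing the output next to the input (.inp -> .out).
myprogram < "${INPUT}" > "${INPUT%.inp}.out"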

Do you have usage tips that you want to share with the other CCMST users? Please send them to Massimo (massimo.malagoli@chemistry.gatech.edu) for inclusion in the Tip of the Week section.