Policies for Use

CCMST computer resources are for authorized research and educational purposes only. Use of CCMST computer facilities is subject to the campus computer and network usage, data access, and World Wide Web policies developed by OIT. Any use in violation of these policies may lead to termination of your CCMST computer accounts.

All published work that includes calculations performed on CCMST computers should include the following statement in its acknowledgements: "Computations were supported by the Center for Computational Molecular Science and Technology at the Georgia Institute of Technology and partially funded through a Shared University Research (SUR) grant from IBM and the Georgia Institute of Technology." A paper or electronic copy of the work should also be provided to the CCMST to help us satisfy our reporting requirements.

Computing Resources

The computing needs of CCMST members and external users are served by the CCMST IBM RS/6000 SP, referred to as the CCMST SP for brevity. The CCMST SP consists of 18 "WinterHawk II" nodes connected by an SP switch, which provides high-bandwidth communication between the nodes. Each node has four 375-MHz POWER3-II processors, for a total of 72 processors. The machine has a total of 47 GB of RAM and 756 GB of temporary disk storage. The control workstation, an IBM RS/6000 model F50, hosts an additional 108 GB of long-term disk storage.

Installed Application Software

Software Title  Version  Description
IDL             5.4      Powerful data analysis and visualization package
Jaguar          4.1.059  Very robust ab initio quantum chemistry package for treating large systems
Molcas          5.0      Ab initio quantum chemistry package useful for treating molecular systems with degeneracies
MPQC            2.1.3    Next generation massively parallel ab initio quantum chemistry package
NAMD            2.5b1    Parallel classical molecular dynamics package
Psi             3.2      Ab initio quantum chemistry package under active development by a number of groups, including CCMST
Q-Chem          2.0      Robust production level ab initio quantum chemistry package

For more specific information on installed software, refer to the Software section of our website.

How to Access the CCMST IBM SP System

All access to the SP occurs through the control workstation (CW), cgate.chemistry.gatech.edu, only. The CW is the central server for the system. It maintains user databases, hosts users' home directories and software installations, and distributes users' jobs among the nodes of the SP. To access the CW you must have a valid account, which may be obtained by filling out an account application.

After you have established an account, the next step is to log in to the control workstation. Remote login to cgate.chemistry.gatech.edu is possible only with an SSH-enabled client (see below for more information on SSH). When you log into the system for the first time, you should become familiar with the directory structure of the control workstation. Most user home directories are located under /cgate/common; for example, the home directory of user yourusername would be /cgate/common/yourusername. This is your permanent storage area: files in that area will never be removed. However, each user may store only up to 100 megabytes of data in their home directory. Larger files may be stored temporarily in /cgate/scratch/yourusername. There is no quota on the amount one user can store in that area, but old files will be cleaned up automatically.
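To see how close a home directory is to the 100 MB limit, the standard du utility can be used. The following is only a sketch: the function names usage_kb, over_quota, and report_usage are made up for illustration and are not CCMST-provided tools, and it assumes that one megabyte is counted as 1024 KB, which may differ from how the system itself measures usage.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not installed on cgate.
QUOTA_KB=$((100 * 1024))   # the 100 MB home directory limit, in kilobytes (assumed 1 MB = 1024 KB)

# usage_kb DIR: print the disk space used under DIR, in kilobytes.
usage_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# over_quota KB: succeed if KB is larger than the quota.
over_quota() {
    [ "$1" -gt "$QUOTA_KB" ]
}

# report_usage DIR: print a one-line summary for DIR.
report_usage() {
    used=$(usage_kb "$1")
    if over_quota "$used"; then
        echo "$1: $used KB used, over the 100 MB limit"
    else
        echo "$1: $used KB used, within the 100 MB limit"
    fi
}
```

With these definitions loaded, report_usage /cgate/common/yourusername would print the summary for your home directory.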

The directory structure on the nodes of the SP, where the actual jobs are run, is almost identical. /cgate/common/yourusername and /cgate/scratch/yourusername are accessible from every node. In addition, each node has its own high-bandwidth storage for temporary files. To use that area, your jobs can read and write to /scratch/yourusername. The amount of available space in that directory varies from node to node but is at least 15 GB.

A possible strategy for using these storage areas is to use the node's /scratch/yourusername for heavily accessed and/or large temporary files, /cgate/scratch/yourusername for temporary or data files that you want to transfer later to your local machine for further analysis, and /cgate/common/yourusername for permanent application data, job input and output files, etc.
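The strategy described above can be sketched as a small shell function that copies an input file to node-local scratch, runs the program there, and then copies the results to the shared scratch area. This is an illustration under stated assumptions, not a CCMST-provided script: the function name stage_and_run and the per-job subdirectory job.$$ are hypothetical, and the sketch assumes the program writes only plain files into its working directory.

```shell
#!/bin/sh
# stage_and_run PERMANENT_DIR NODE_SCRATCH_DIR SHARED_SCRATCH_DIR INPUT COMMAND...
# Hypothetical helper: runs COMMAND in a private directory under
# NODE_SCRATCH_DIR with INPUT copied from PERMANENT_DIR, then copies every
# file in that directory (input included) to SHARED_SCRATCH_DIR.
stage_and_run() {
    permanent=$1; node_scratch=$2; shared_scratch=$3; input=$4
    shift 4
    work="$node_scratch/job.$$"          # private work area for this job
    mkdir -p "$work" "$shared_scratch" || return 1
    cp "$permanent/$input" "$work/" || return 1
    # If the command fails, the work area is left in place for inspection.
    ( cd "$work" && "$@" ) || return 1
    cp "$work"/* "$shared_scratch/" || return 1
    rm -rf "$work"                       # free the node-local space
}
```

For the sample program used later in this document, a call might look like stage_and_run /cgate/common/yourusername/chem /scratch/yourusername /cgate/scratch/yourusername file.in my.exe file.in file.out.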

To transfer files between your machine and the SP control workstation you have two choices:

There are a number of choices available for editing text files. The most common UNIX editor is vi. If you are completely unfamiliar with vi, we recommend you read this online vi tutorial. For more experienced users, we recommend this vi reference manual. Another popular choice for text file editing is emacs, which is rather more powerful than vi. If you want to learn how to use emacs, refer to this tutorial.

How to Use the Graphical Capabilities of the Control Workstation

Some useful utilities and software packages produce visual data or require a graphical user interface for their normal execution. If you have a Unix workstation with a graphical window manager running, then it is very easy to use such programs. When you log into the control workstation using a secure shell client, you will be able to execute graphical commands and see their output on your screen.

The situation is more difficult for Windows and MacOS users. You might have to install special X server software to be able to use graphical programs on cgate. We provide the last free version of the MicroImages MI/X server for Windows, TNTlite MIXServer v5.6. MicroImages also makes an MI/X server for MacOS, which is freely available here (both PowerPC and 68k series versions are available). If you want a full-featured X Windows Server, one can be purchased. Here is a listing of available commercial X Windows Servers:

Compiling and Installing Software

If there is a software package that you need for your work, you may have it installed for you, or you may get permission to install it yourself. Only licensed software products or freeware obtained from established sources may be installed on the CCMST SP. You may obtain permission to install the software package, or request assistance with installing it, by contacting the CCMST system administrator.

The CCMST SP also has a rich set of development tools to enable users to develop their own computer codes. These include IBM C for AIX compiler version 5, IBM Visual Age C++ compiler version 5, IBM XL Fortran compiler version 7.1, Perl 5.5, GNU make version 3.79.1, GNU autoconf version 2.13, GNU bison version 1.28, GNU flex version 2.5.4, IBM Engineering and Scientific Subroutines Library (ESSL) version 3.2, IBM Parallel ESSL version 2.2, IBM Parallel Operating Environment 3.1 with Message Passing Library (MPI), the libpthreads library, and more. Documentation for the IBM development tools can be found on the control workstation by logging into cgate, making sure an X server is running on your machine (see the section on how to access the machine), and executing the command netscape http://control/. Alternatively, one may visit the IBM AIX documentation library. Manual pages for GNU tools can be read on cgate via the man command. Additional information on software development tools may be found in the Software section.

Running Jobs

Introduction
All production jobs on the SP must be executed through the IBM queueing software called LoadLeveler (current version 2.2). This program keeps track of all the jobs that have been submitted, prioritizes them, and distributes them among the SP nodes when the requested resources (memory, disk, processors) become available. It can handle both serial (single-processor) and parallel jobs.

Here we give a brief tutorial on how to start using LoadLeveler. Refer to the user guide to LoadLeveler for more information.

How to submit a job to LoadLeveler
There are two ways to submit a job to LoadLeveler. You may prepare a command file using your favorite text editor and feed it to LoadLeveler with the llsubmit command. Or, if you have an X server running on your machine, you may use xloadl, an easy-to-use graphical user interface to LoadLeveler. Even if you plan to use xloadl exclusively, we strongly recommend you learn the syntax of LoadLeveler command files first.

Constructing LoadLeveler Command Files

Each job needs a LoadLeveler command file; we will assume for the sake of convention that these are named with a .cmd suffix, although this is not required. A typical LoadLeveler command file for running an executable my.exe in a single-processor mode might look as follows:


#!/bin/csh
# @ job_type = serial
# @ initialdir = /cgate/common/username/chem
# @ notify_user = your@email.address
# @ account_no = useraccount
# @ input = /dev/null
# @ output = $(jobid).stdout
# @ error = $(jobid).err
# @ class = cpu
# @ notification = complete
# @ checkpoint = no
# @ restart = no
# @ requirements = (Arch == "power3") && (OpSys == "AIX43")
# @ queue

my.exe file.in file.out
    
This sample job runs the command my.exe file.in file.out in the directory /cgate/common/username/chem (you need to ensure that the command my.exe is in your path). After completion of the job, an email message is sent to your@email.address. This job is charged to account useraccount. As configured, this job will read standard input from /dev/null (which means it is a batch job and does not require any user input), and send standard error and standard output to $(jobid).err and $(jobid).stdout, respectively, where $(jobid) is the ID number assigned to the job by LoadLeveler when the job is submitted. Alternatively, the user may wish to name these files according to the job being run (for example, file.stdout and file.err). These files are usually only of interest if the job crashes, since they might contain debugging information (although some programs will send their usual output to stdout). Typically, these files should be cleaned up after the results of the job have been checked.

Note the class = cpu setting. LoadLeveler is configured with several different classes (think of them as queues), and this is where the class is selected. A complete list of valid classes is given below.

Class name   CPU Time Limit  Memory Limit  Comments
interactive  30 minutes      1 GB          For very short jobs
quick        24 hours        1 GB          For medium-length CPU-intensive jobs
cpu          14 days         1 GB          Most jobs fall into this category
long         90 days         2 GB          For super-long jobs only. Only 2 jobs of this class can run at the same time.
io           14 days         1 GB          Should be used for I/O-intensive medium to long jobs only
special      365 days        1 GB          By default, no one is allowed to use this class. If you need to run jobs of this length, please contact the CCMST system administrator. Only 1 such job can be run at any time.

The CPU time limit is the limit on the total CPU time of the job. For parallel jobs, this means the sum of the tasks' CPU times. Hence, while 90 days is a lot of CPU time, a job with 12 tasks will consume 90 days of CPU time in 7.5 days of wall-clock time, assuming 100% execution efficiency. Plan carefully which class your jobs should belong to.
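The arithmetic above is easy to repeat for other task counts. The helper below is a sketch; the name wall_clock_days is made up for illustration, and it assumes, as in the text, that every task is busy 100% of the time.

```shell
#!/bin/sh
# wall_clock_days CPU_LIMIT_DAYS TASKS
# Hypothetical helper: prints the number of wall-clock days after which a
# parallel job exhausts its class CPU time limit, assuming 100% efficiency.
wall_clock_days() {
    awk -v limit="$1" -v tasks="$2" 'BEGIN { printf "%.1f\n", limit / tasks }'
}

wall_clock_days 90 12   # class long with 12 tasks: prints 7.5
wall_clock_days 14 4    # class cpu with 4 tasks: prints 3.5
```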

Class cpu is intended for longer computations (up to 14 days of CPU time, requiring up to 1 GB RAM). Other classes include quick (up to 24 hours, up to 1 GB RAM) and interactive (up to 30 minutes, up to 1 GB RAM). Any job going beyond the specified limits is killed automatically.
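As a rough illustration of how the CPU time limits map onto class names, the sketch below picks the shortest class whose limit covers an estimated CPU time. The function name suggest_class is hypothetical. The sketch looks at CPU time only: it ignores the memory limits and the io class, and class special still requires permission from the system administrator.

```shell
#!/bin/sh
# suggest_class CPU_MINUTES
# Hypothetical helper: prints the shortest class whose CPU time limit
# covers the given estimate (total CPU minutes summed over all tasks).
suggest_class() {
    minutes=$1
    if   [ "$minutes" -le 30 ];     then echo interactive   # 30 minutes
    elif [ "$minutes" -le 1440 ];   then echo quick         # 24 hours
    elif [ "$minutes" -le 20160 ];  then echo cpu           # 14 days
    elif [ "$minutes" -le 129600 ]; then echo long          # 90 days
    else                                 echo special       # needs administrator approval
    fi
}

suggest_class 10      # prints interactive
suggest_class 3000    # prints cpu
```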

To gain more confidence with LoadLeveler, let us submit a simple job. Unix provides a command hostname, which prints the canonical name of the machine on which it is run to stdout. Let's set up a job which will execute the hostname command on one of the nodes. The command file will look like this:


#!/bin/csh
# @ job_type = serial
# @ account_no = useraccount
# @ initialdir = /cgate/common/username
# @ notify_user = username@control
# @ input = /dev/null
# @ output = hostname.stdout
# @ error = hostname.err
# @ class = interactive
# @ notification = complete
# @ checkpoint = no
# @ restart = no
# @ requirements = (Arch == "power3") && (OpSys == "AIX43")
# @ queue

/bin/hostname

Just cut and paste this into some file (say, hostname.cmd), replace useraccount with the account name you were assigned when you registered with CCMST, and username with your actual user name. That simple. Note that we specified class = interactive, since this job will take no time at all to run.

Once the .cmd file is prepared, it's a simple matter to submit it to LoadLeveler. On cgate, simply type

    llsubmit hostname.cmd
    
where the LoadLeveler command file is named hostname.cmd. This will verify the job's username, account name, and class, and feed the job to LoadLeveler, which will place it in the appropriate queue.

Perhaps the easiest and most practical way to create LoadLeveler command files is to use one of the generic templates provided in /home/loadl/templates. There you will find templates for serial and parallel jobs. Then use one of them to create a command file template specific for your purposes, and just use that template in your work.
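If you prefer to generate command files from a script rather than edit a template by hand, something along the following lines may help. It is a sketch only: the function name make_serial_cmd is made up for illustration and is unrelated to the templates in /home/loadl/templates. The keywords and values are copied from the serial examples in this section.

```shell
#!/bin/sh
# make_serial_cmd USERNAME ACCOUNT CLASS COMMAND...
# Hypothetical helper: writes a serial LoadLeveler command file to standard
# output, using the same keywords as the examples in this section.
make_serial_cmd() {
    user=$1; account=$2; class=$3
    shift 3
    # \$(jobid) is escaped so that the shell leaves it for LoadLeveler.
    cat <<EOF
#!/bin/csh
# @ job_type = serial
# @ account_no = $account
# @ initialdir = /cgate/common/$user
# @ notify_user = $user@control
# @ input = /dev/null
# @ output = \$(jobid).stdout
# @ error = \$(jobid).err
# @ class = $class
# @ notification = complete
# @ checkpoint = no
# @ restart = no
# @ requirements = (Arch == "power3") && (OpSys == "AIX43")
# @ queue

$*
EOF
}
```

For example, make_serial_cmd yourusername useraccount interactive /bin/hostname > hostname.cmd would produce a file equivalent to the hostname example, ready for llsubmit hostname.cmd.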

Once a job or jobs have been submitted, their status may be monitored with the llq command, which lists all jobs in the queues. Since this is a fairly long list, a particular user's jobs can be queried using llq -u username.

Sometimes you may want to cancel a job that has been submitted, or even started running. To cancel a job jobid, issue llcancel jobid. To cancel all of your jobs, use llcancel -u username.

Note on jobs that require a lot of memory

Many computations, especially highly accurate quantum computations, require a lot of memory. To avoid compute nodes running out of memory, you should specify the memory requirements of the job in the command file if they exceed 256 MB. This is done via the ConsumableMemory resource:

# @ resources = ConsumableMemory(1024)
In the example shown LoadLeveler will reserve 1024 MB of memory for this job. For parallel jobs the memory requirements are specified on a per-task basis (see here).
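The 256 MB rule can be expressed as a small helper that emits the resources line only when it is needed. This is a sketch; the function name memory_directive is hypothetical. We assume the line must be placed in the command file before the # @ queue statement, together with the other keywords.

```shell
#!/bin/sh
# memory_directive MEGABYTES
# Hypothetical helper: prints a ConsumableMemory line when the job needs
# more than 256 MB, and prints nothing otherwise.
memory_directive() {
    if [ "$1" -gt 256 ]; then
        echo "# @ resources = ConsumableMemory($1)"
    fi
}

memory_directive 1024   # prints: # @ resources = ConsumableMemory(1024)
memory_directive 128    # prints nothing
```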

Note on parallel jobs

To run in parallel on the SP, the software (obtained from an outside source or your own) has to be written and compiled in a special way. Depending on which parallel model your software uses, you will run it differently:

It is imperative that you specify memory requirements for the parallel program. The amount of ConsumableMemory resource (see here) should specify how much memory each MPI task will require.

Using Graphical User Interface xloadl

There is also a graphical front-end to LoadLeveler submission and querying, called xloadl. If you are logging in from a machine running the X-Window system, you may invoke the graphical program by typing xloadl &.

The main xloadl window is split into 3 parts. The top pane is called "Jobs", and shows each job submitted to LoadLeveler (the content is identical to the output of llq command). The middle pane is called "Machines", and shows each machine that LoadLeveler is aware of. The bottom pane is called "Messages", and shows responses of LoadLeveler to various actions that you tell it to perform. Each pane, except "Messages", has its own pull-down menu.

The "Jobs" pane is the one most commonly used by regular users. It lets you build and submit jobs, check their status, cancel unnecessary jobs, etc. As we go through the most common job-related functions of xloadl in the next few paragraphs, we will assume that the "Jobs" pane is being used.

Let's use xloadl to build a job identical to the one we submitted in the previous subsection. To build a job, pull down the File menu, click Build a Job..., and select the type of job you want to submit. Valid choices here are serial (single-processor job) and parallel (multiprocessor job using the Message Passing Interface (MPI) library). We will assume a single-processor job for now, so choose serial here. This will open another window called "Build a Job". It contains a web-like submission form with a number of entries. Luckily, most of those entries do not need to be filled in. The most important entries are

Other important entries, which you may need for more complex jobs, can be accessed by pressing the "Requirements" and "Limits" buttons located on a pane in the lower right corner. The most commonly used requirements are for memory and disk, specified in megabytes. The most useful limit is the wall clock limit, specified in seconds. It can be used to limit the job length to less than the default value for the class. This may allow the job to be picked for execution sooner, so it makes sense to set this limit if you have a way to estimate the length of the job. For this simple job you do not need to specify any additional requirements or limits.
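Since the wall clock limit is entered in seconds, it can be handy to convert an estimate given in hours and minutes. The helper below is a sketch and the name to_seconds is made up for illustration.

```shell
#!/bin/sh
# to_seconds HOURS MINUTES
# Hypothetical helper: converts a run time estimate to seconds.
to_seconds() {
    echo $(( $1 * 3600 + $2 * 60 ))
}

to_seconds 3 10   # 3 hours 10 minutes: prints 11400
to_seconds 24 0   # one day: prints 86400
```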

To submit the job, press the "Submit" button at the bottom of the "Build a Job" window. In the main LoadLeveler window you should see a message either saying that the job has been submitted and what job ID it was assigned, or describing the reason why the job could not be submitted. If the job was submitted, you will be able to find it in the "Jobs" pane. You can then monitor the state of the job by using the commands in the "Actions" menu.

You also have the option of constructing a LoadLeveler command file from the information entered in the "Build a Job" window by pressing the "Save" button. This is a useful feature that you may use to construct command file templates as you become more experienced with LoadLeveler.

CCMST CPU accounts

Each user account is allowed to use one or more CCMST CPU accounts, which are used to keep track of CPU usage. Each account has a certain quota. Once the quota is exceeded, the account cannot be used until more hours are added to it. To check the status of a CCMST CPU account, use the command llacctinfo. Given a CPU account name, it returns the account's quota and the number of CPU hours that have been charged to the account.

More Information on SSH

It is Georgia Tech policy that remote logins to campus servers are permitted only through secure shell clients. The secure shell protocol uses sophisticated encryption schemes to provide secure communication between machines. If you do not have a secure shell client installed on your machine, you may refer to the OIT page on ssh. From there you may obtain licensed copies of a Windows ssh client and find links to ssh implementations for MacOS, Linux, and other platforms. Solaris users may find OpenSSH binaries on Sunfreeware.com. We should also mention the FreeSSH website, which contains links to freely available implementations of ssh clients for a multitude of platforms.

Information on PBS for dgate

For users of the dgate cluster, we are using the PBS batch queue system. The basic commands are summarized below:

The structure of the command file is a little unusual. PBS reads lines that start with #PBS as directives; these lines are not treated as comments. A double comment character, ##, will comment a directive out. An example job control file is in ~evaleev/PBS/test.cmd. The minimal information is in the sample below:

# Sample PBS script
 
# How much memory to reserve
#PBS -l mem=200mb
 
# How much (CPU) time to request (determines queue).  The below 
# requests 3 hours, 10 minutes
#PBS -l cput=3:10:00
 
# Uncomment (i.e., delete one of the comment chars) this for sherrill 
# group I/O heavy jobs
##PBS -q sio

# How many nodes and processors you want.  For Sherrill group, we had 
# to hack the system and pretend that each node, which has 2 processors,
# has 5 "virtual processors".  Queue sio (above) defaults to ppn=3,
# which will prevent another I/O job from running on the same queue.
# For Sherrill group, double the number of processors you really want,
# and add one if you need I/O.  Note: ppn will default to 3 if you uncomment
# the PBS -q sio line above.  If you use the following line, you will
# override that ppn.  So, usually use the line above for sio, or the
# one below, but not both.
##PBS -l nodes=1:ppn=2

# This tells PBS what directory to go to
cd $PBS_O_WORKDIR

# Then just put the command below.  For example,
runqchem h2o.in h2o.out
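The "virtual processor" rule described in the comments of the sample script (double the number of processors you really want, and add one if you need I/O) can be written as a one-line calculation. This is a sketch for Sherrill group jobs only; the function name sherrill_ppn is hypothetical.

```shell
#!/bin/sh
# sherrill_ppn PROCESSORS NEEDS_IO
# Hypothetical helper: prints the ppn value to request, where NEEDS_IO is
# 1 for an I/O-heavy job and 0 otherwise.
sherrill_ppn() {
    echo $(( 2 * $1 + $2 ))
}

sherrill_ppn 1 0   # one processor, no heavy I/O: prints 2
sherrill_ppn 1 1   # one processor with I/O: prints 3, the sio default
echo "#PBS -l nodes=1:ppn=$(sherrill_ppn 2 1)"   # prints: #PBS -l nodes=1:ppn=5
```

The last value, 5, uses up all of the virtual processors that each node pretends to have.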

Notwithstanding any language to the contrary, nothing contained herein constitutes nor is intended to constitute an offer, inducement, promise, or contract of any kind. The data contained herein is for informational purposes only and is not represented to be error free. Any links to non-Georgia Tech information are provided as a courtesy. They are not intended to nor do they constitute an endorsement by the Georgia Institute of Technology of the linked materials.
IBM, RS/6000, SP are registered trademarks of the IBM Corporation. All other trademarks are property of their respective owners.
Center for Computational Molecular Science and Technology
Georgia Institute of Technology
Last Modified: October 20, 2005