The queueing system on grommet is designed to maximize machine job throughput while keeping the load at a reasonable level. It tries to balance the usage among the different groups on grommet to help ensure that everyone's job gets a chance to run. The system is not perfect. Work is ongoing to improve the "fairness" of the job scheduler.
Details of the queueing system setup and instructions on running jobs in the queue can be found below.
If you have problems getting your script to work, let me know (pmcmahon"at"udel.edu) and we'll figure out what's going wrong.
This document contains sample qsub scripts. I hope to gather sample scripts for each of the major packages we run on grommet (MSI software, gaussian, jaguar, molpro, amber, etc.). If you have a working script to contribute, let me know.
There are also links to additional documentation (beyond the man pages) and descriptions of some of the common (and possibly useful) qsub options.
This page is a work in progress. If there are any glaring omissions on this page, please let me know and I will try to fill in the gaps.
queues on grommet | description of the available queues on grommet and how jobs are scheduled to run |
running jobs with PBS | how to run a job on grommet through the queueing system |
command summary | brief description and usage of several useful PBS commands |
qsub on grommet | summary of the qsub options that will be most useful to you on grommet |
anatomy of a PBS script | details of the structure of a script to use with qsub and what to put in each section |
sample scripts | a collection of sample scripts for running various programs on grommet |
PBS Commands | links to the man pages and a little more |
Using PBS at NAS | PBS was developed at NASA's Numerical Aerospace Simulation facility. This is a link to their site's documentation on using PBS. It contains a lot of detailed information about using PBS. Go poke around if you're interested. |
External Reference Specification | PDF version of the PBS External Reference Specification document. It has lots of info about how the queueing system works and how to use it. A shorter version containing only the parts it recommends for end users is also available. |
Queues and Job Scheduling on Grommet
Currently there are three queues on grommet:
main
Description: the default queue
Command to submit a job: qsub myjob
CPU time limit: none
Maximum # of processors: 2 per job (contact pmcmahon"at"udel.edu for jobs requiring more processors)
Job limits: 2 processors per user (one 2-processor job or two 1-processor jobs); 6 processors per group

fourhour
Description: for small jobs
Command to submit a job: qsub -q fourhour myjob
CPU time limit: 4 hours
Maximum # of processors: 2 per job
Job limits: 1 per user

debug
Description: for short, debugging jobs
Command to submit a job: qsub -q debug myjob
CPU time limit: 1 hour
Maximum # of processors: 1 per job
Job limits: 1 per user

The queueing system attempts to keep the average background load on grommet
at around 12 (1.5 x the number of processors in the system). The total number
of CPUs assigned to jobs running in the main and fourhour queues is limited
to 12. Jobs in the debug queue are not subject to this processor restriction.
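The limits quoted for the main queue can be read as a short series of checks. The sketch below is a hypothetical helper written in POSIX sh; it is not part of PBS and is not the scheduler's actual logic. The function name fits_main_queue and its arguments are made up for illustration, and the numbers are simply the ones given on this page.

```shell
#!/bin/sh
# Hypothetical helper, not part of PBS: checks a processor request against
# the main-queue limits quoted on this page.
MAX_PER_JOB=2      # processors per job
MAX_PER_USER=2     # processors per user
MAX_PER_GROUP=6    # processors per group
MAX_TOTAL=12       # processors across the main and fourhour queues

# usage: fits_main_queue REQUESTED USER_IN_USE GROUP_IN_USE TOTAL_IN_USE
# Returns 0 if the request fits all four limits, 1 otherwise.
fits_main_queue() {
    req=$1 user=$2 group=$3 total=$4
    [ "$req" -le "$MAX_PER_JOB" ] || return 1
    [ $((user + req)) -le "$MAX_PER_USER" ] || return 1
    [ $((group + req)) -le "$MAX_PER_GROUP" ] || return 1
    [ $((total + req)) -le "$MAX_TOTAL" ] || return 1
    return 0
}

# Example: a 2-processor job on an otherwise idle machine.
if fits_main_queue 2 0 0 0; then
    echo "request fits the quoted limits"
fi
```

A job that passes these checks may still wait in the queue; the sketch only restates the published limits.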
Running jobs on grommet with PBS
Submit your job with:
qsub myjob
where myjob is the name of the script you created to run your job.
qsub will return the jobid of your job if it is successfully queued. The jobid will have the format: ##.grommet.chem.udel.edu
The jobid is how you refer to your job in the other PBS commands (qstat, qdel, qhold, qrls).
Use the qstat command to monitor the progress of your job:
qstat -a
qstat -f jobid
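The number at the front of the jobid is the same ## that appears in the jobname.o## output file described with the sample scripts. As a rough sketch in POSIX sh, the two hypothetical helpers below (job_number and output_file are names invented here) pull that number out of a jobid and build the expected output file name. The jobid used is made up so that the example runs without PBS.

```shell
#!/bin/sh
# usage: job_number JOBID
# Prints the leading sequence number of a jobid such as
# 1234.grommet.chem.udel.edu (everything before the first dot).
job_number() {
    printf '%s\n' "${1%%.*}"
}

# usage: output_file JOBNAME JOBID
# Prints the name of the jobname.o## file for that job.
output_file() {
    printf '%s.o%s\n' "$1" "$(job_number "$2")"
}

# On grommet the jobid would come from qsub itself, for example:
#   jobid=$(qsub myjob) && qstat -f "$jobid"
# A made-up jobid is used here instead.
jobid=1234.grommet.chem.udel.edu
echo "job number:  $(job_number "$jobid")"
echo "output file: $(output_file myjob "$jobid")"
```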
optional arguments are enclosed in [ ].
Gamess: This script runs a sequential gamess job from the input file myjob.inp, which is in the directory from which the qsub command was issued. The npri -w command preceding the seqgms command causes the job to be run "weightless" in the background. The job will be checkpointed every 2 hours (-c c) so it can be restarted if the system goes down unexpectedly, and the standard output and error of the script will be returned in the jobname.o## file (-j oe).
#!/bin/tcsh -f
#PBS -c c -j oe
cd $PBS_O_WORKDIR
npri -w seqgms myjob
Gaussian 98: This script will run a 2-processor job (-l ncpus=2), checkpoint the job every 2 hours (-c c) and merge the standard error and standard output of the script into one file, jobname.o## (-j oe). The Gaussian input file is myjob.com in the directory where qsub was called (i.e. where the script is), and the output from g98 (and standard error from g98) is directed to myjob.out. For a single-processor job, leave out the -l ncpus=2; for a job requiring more than 2 processors, change the number.
This script works for csh and tcsh users:
#!/bin/tcsh -f
#PBS -c c -j oe -l ncpus=2
cd $PBS_O_WORKDIR
setg98
npri -w g98 < myjob.com >& myjob.out
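Since the only things that change from one Gaussian job to the next are the job name and the processor count, the script can be generated rather than edited by hand. The following is a minimal sketch in POSIX sh; make_g98_script is a hypothetical helper, not something installed on grommet. It prints the csh/tcsh script shown above and leaves out -l ncpus for a single-processor job.

```shell
#!/bin/sh
# usage: make_g98_script JOBNAME [NCPUS]
# Hypothetical helper: prints a tcsh PBS script for JOBNAME.com.
# -l ncpus=N is added only when more than one processor is requested.
make_g98_script() {
    name=$1
    ncpus=${2:-1}
    resources=""
    if [ "$ncpus" -gt 1 ]; then
        resources=" -l ncpus=$ncpus"
    fi
    # \$ keeps PBS_O_WORKDIR unexpanded so it appears literally in the script.
    cat <<EOF
#!/bin/tcsh -f
#PBS -c c -j oe$resources
cd \$PBS_O_WORKDIR
setg98
npri -w g98 < $name.com >& $name.out
EOF
}

make_g98_script myjob 2
```

The output could be saved to a file (for example make_g98_script myjob 2 > myjob, where the file name is up to you) and then submitted with qsub myjob.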
This script should work for ksh users:
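#!/bin/ksh -f
#PBS -c c -j oe -l ncpus=2
cd $PBS_O_WORKDIR
# Set up the Gaussian 98 environment
g98root=/usr/programs
export g98root
. $g98root/g98/bsd/g98.profile
# Send both standard output and standard error of g98 to myjob.out
npri -w g98 < myjob.com > myjob.out 2>&1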
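In the csh/tcsh script, >& sends both standard output and standard error to myjob.out. In Bourne-family shells such as sh and ksh, the portable way to get the same effect is > file 2>&1. The sketch below shows that form in POSIX sh; fake_job is a made-up stand-in for g98 so the example runs anywhere.

```shell
#!/bin/sh
# Stand-in for g98: writes one line to standard output and one to
# standard error.
fake_job() {
    echo "normal output"
    echo "error output" >&2
}

outfile=${TMPDIR:-/tmp}/redirect_demo.$$

# "> file" sends standard output to the file; "2>&1" then points standard
# error at the same place, so both lines end up in the file.
fake_job > "$outfile" 2>&1

merged=$(cat "$outfile")
rm -f "$outfile"
printf '%s\n' "$merged"
```

The order matters: writing 2>&1 before > file would leave standard error attached to wherever standard output pointed originally.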
This script (for csh and tcsh users) shows the use of the short form of the Gaussian submission command. The short form of the command can also be used in a script formatted for ksh. This form assumes that your input file is called myjob.com and will create a Gaussian output file called myjob.log:
#!/bin/tcsh -f
#PBS -c c -j oe -l ncpus=2
cd $PBS_O_WORKDIR
setg98
npri -w g98 myjob.com
Jaguar: This script runs a jaguar job from the jaguar input file myjob.in, which is in the directory from which the qsub command was issued. The -w option on the jaguar run command causes jaguar to execute the job in the foreground rather than in the background. Schrodinger (makers of Jaguar) claims that the -w option will have the same effect on all jaguar commands (jaguar run, jaguar batch, etc.). Again, the job will be checkpointed every 2 hours (-c c) so it can be restarted if the system goes down unexpectedly, and the standard output and error of the script will be returned in the jobname.o## file (-j oe).
#!/bin/tcsh -f
#PBS -c c -j oe
cd $PBS_O_WORKDIR
npri -w jaguar run -w myjob
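The foreground/background distinction matters in a batch script, presumably because the script should not reach its end while the calculation is still running. The sketch below illustrates the general shell behaviour only; it does not call jaguar, and the work function is a made-up stand-in for a long-running program. A command started in the background returns control at once, and wait is what makes the script block until it finishes.

```shell
#!/bin/sh
log=${TMPDIR:-/tmp}/foreground_demo.$$
: > "$log"

# Stand-in for a long-running program.
work() {
    sleep 1
    echo "work finished" >> "$log"
}

# Started in the background: control returns immediately, before the
# work is done, so the next line is logged first.
work &
echo "script continues" >> "$log"

# Block until the background work completes, which is the behaviour a
# foreground run gives you automatically.
wait

result=$(cat "$log")
rm -f "$log"
printf '%s\n' "$result"
```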
This page is maintained by Patrick McMahon.
Last Updated: 12 July 2008
The URL for this page is: http://www.udel.edu/chem/grommet/help/queues.html