PBS Queues on Grommet


The queueing system on grommet is designed to maximize job throughput while keeping the machine load at a reasonable level. It tries to balance usage among the different groups on grommet so that everyone's jobs get a chance to run. The system is not perfect, and work is ongoing to improve the "fairness" of the job scheduler.

Details of the queueing system setup and instructions on running jobs in the queue can be found below.

If you have problems getting your script to work, let me know (pmcmahon"at"udel.edu) and we'll figure out what's going wrong.

This document contains sample qsub scripts. I hope to gather sample scripts for each of the major packages we run on grommet (MSI software, gaussian, jaguar, molpro, amber, etc.). If you have a working script to contribute, let me know.

There are also links to additional documentation (beyond the man pages) and descriptions of some of the common (and possibly useful) qsub options.

This page is a work in progress. If there are any glaring omissions on this page, please let me know and I will try to fill in the gaps.



queues on grommet: description of the available queues on grommet and how jobs are scheduled to run
running jobs with PBS: how to run a job on grommet through the queueing system
command summary: brief description and usage of several useful PBS commands
qsub on grommet: summary of the qsub options that will be most useful to you on grommet
anatomy of a PBS script: details of the structure of a script to use with qsub and what to put in each section
sample scripts: a collection of sample scripts for running various programs on grommet
PBS Commands: links to the man pages and a little more
Using PBS at NAS: PBS was developed at NASA's Numerical Aerospace Simulation facility. This is a link to their site's documentation on using PBS. It contains a lot of detailed information about using PBS. Go poke around if you're interested.
External Reference Specification: PDF version of the PBS External Reference Specification document. Has lots of info about how the queueing system works and how to use it.

(Click here to get only the parts that it recommends the end users read).

 


Queues and Job Scheduling on Grommet

Currently there are three queues on grommet:

main
Description: the default queue
Command to submit a job: qsub myjob
CPUtime limit: none
Maximum # of processors: 2 per job (contact pmcmahon"at"udel.edu for jobs requiring more processors)
Job Limits: 2 processors per user (1 2P job or 2 1P jobs); 6 processors per group
  • Job scripts in this queue MUST execute programs using the npri -w priority flag (see sample scripts below)
fourhour
Description: For small jobs
Command to submit a job: qsub -q fourhour myjob
CPUtime limit: 4 hours
Maximum # of processors: 2 per job
Job Limits: 1 per user
  • Job scripts in this queue MUST execute programs using the npri -w priority flag (see sample scripts below)
  • Jobs in this queue have priority over jobs in the main queue.
debug
Description: For short, debugging jobs
Command to submit a job: qsub -q debug myjob
CPUtime limit: 1 hour
Maximum # of processors: 1 per job
Job Limits: 1 per user
  • Jobs in this queue run immediately when submitted (within the 1 job per user limit)
  • Job scripts in this queue may execute programs WITHOUT the npri -w priority flag
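
As a rough illustration of the limits listed above, here is a small sketch that suggests a queue for a job. The function pick_queue and its arguments are made up for this example and are not part of PBS; it only restates the CPU time and processor limits given for each queue, and it does not check the per-user or per-group job limits.

```shell
#!/bin/sh
# Hypothetical helper (not a PBS command): suggest a queue from the
# limits listed above.
#   $1 = estimated CPU time in minutes
#   $2 = number of processors
#   $3 = "debug" for a debugging run (optional)
pick_queue() {
    minutes=$1
    ncpus=$2
    purpose=${3:-production}
    if [ "$purpose" = "debug" ] && [ "$minutes" -le 60 ] && [ "$ncpus" -le 1 ]; then
        echo debug          # 1 hour limit, 1 processor per job
    elif [ "$minutes" -le 240 ] && [ "$ncpus" -le 2 ]; then
        echo fourhour       # 4 hour limit, 2 processors per job
    else
        echo main           # no CPU time limit
    fi
}

pick_queue 30 1 debug     # prints: debug
pick_queue 180 2          # prints: fourhour
pick_queue 3000 2         # prints: main
```

Note that the sketch falls back to main for anything that does not fit the other two queues, including jobs that ask for more than 2 processors; those still need to be arranged by email as described above.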


The queueing system attempts to keep the average background load on grommet at around 12 (1.5 x the number of processors in the system). The total number of CPUs assigned to jobs running in the main and fourhour queues is limited to 12. Jobs in the debug queue are not subject to this processor restriction.
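
The arithmetic behind these numbers can be sketched as follows. The processor count of 8 is inferred from "12 = 1.5 x the number of processors" and is not stated directly on this page, and fits_cpu_cap is a made-up function that only illustrates the 12 CPU cap; the real scheduler's logic is not shown here and certainly considers more than this.

```shell
#!/bin/sh
# NPROCS=8 is an inference from the text (12 / 1.5), not a stated figure.
NPROCS=8
TARGET_LOAD=$(( NPROCS * 3 / 2 ))   # 1.5 x 8 = 12
CPU_CAP=12                          # cap on main + fourhour combined

# Hypothetical check: would a job asking for $2 CPUs fit under the cap
# when $1 CPUs are already assigned to main and fourhour jobs?
fits_cpu_cap() {
    in_use=$1
    requested=$2
    [ $(( in_use + requested )) -le "$CPU_CAP" ]
}

echo "target load: $TARGET_LOAD"
if fits_cpu_cap 10 2; then
    echo "10 in use + 2 requested: fits"
fi
if ! fits_cpu_cap 11 2; then
    echo "11 in use + 2 requested: must wait"
fi
```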


Running jobs on grommet with PBS

  1. Use your favorite editor (vi, emacs, nedit, jot) to create a shell script (text file) containing the commands needed to run your job. For many programs these will be the same commands you would use to run it at the command line. For some programs, the commands may be slightly different. Some experimentation may be necessary to work it out.

  2. Submit the job to the default queue using the qsub command:

    qsub myjob

    where myjob is the name of the script you created to run your job.

  3. qsub will return the jobid of your job if it is successfully queued. The jobid will have the format:

    ##.grommet.chem.udel.edu

    where ## is a non-negative integer. You can use either the full jobid or just the initial ## to refer to your job in any PBS command that takes a jobid as input (qstat, qdel, qhold, qrls).

  4. Use the qstat command to monitor the progress of your job.

    qstat -a

    shows a brief listing of all the jobs in the queue.

    qstat -f jobid

    shows more detailed information about the job specified.
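
The four steps above can be sketched as one short shell session. This is only an outline: "myprogram" and "myjob.inp" are placeholders for your own program and input, the jobid 123 is made up, and short_jobid is a helper invented for this example. The qsub and qstat calls are skipped on machines where PBS is not installed.

```shell
#!/bin/sh
# Step 1: create the job script. "myprogram" is a placeholder; use the
# commands for your own program here.
cat > myjob <<'EOF'
#!/bin/tcsh -f
#PBS -j oe
cd $PBS_O_WORKDIR
npri -w myprogram myjob.inp
EOF

# Hypothetical helper: the short jobid is everything before the first dot.
short_jobid() {
    echo "${1%%.*}"
}

# Step 2: submit to the default queue (only where qsub is available).
if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub myjob)
else
    jobid="123.grommet.chem.udel.edu"   # made-up jobid for illustration
fi

# Step 3: either form of the jobid can be given to qstat, qdel, qhold, qrls.
shortid=$(short_jobid "$jobid")
echo "full jobid:  $jobid"
echo "short jobid: $shortid"

# Step 4: monitor the job.
if command -v qstat >/dev/null 2>&1; then
    qstat -a
    qstat -f "$shortid"
fi
```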

Command Summary

Optional arguments are enclosed in [ ].


Useful qsub options

 


Anatomy of a PBS Script

 


Sample Scripts

Gamess: This script runs a sequential gamess job from the input file myjob.inp, which is in the directory from which the qsub command was issued. The npri -w command preceding the seqgms command causes the job to be run "weightless" in the background. The job will be checkpointed every 2 hours (-c c) so it can be restarted if the system goes down unexpectedly, and the standard output and error of the script will be returned in the jobname.o## file (-j oe).

#!/bin/tcsh -f
#PBS -c c -j oe
cd $PBS_O_WORKDIR
npri -w seqgms myjob

Gaussian 98: This script will run a 2 processor job (-l ncpus=2), checkpoint the job every 2 hours (-c c) and merge the standard error and standard output of the script into one file, jobname.o## (-j oe). The Gaussian input file is myjob.com in the directory where qsub was called (i.e. where the script is), and the output from g98 (and standard error from g98) is directed to myjob.out. For a single processor job, leave out the -l ncpus=2, or change the number for a job requiring more than 2 processors.

This script works for csh and tcsh users:

#!/bin/tcsh -f
#PBS -c c -j oe -l ncpus=2
cd $PBS_O_WORKDIR
setg98
npri -w g98 < myjob.com >& myjob.out

This script should work for ksh users:

#!/bin/ksh -f
#PBS -c c -j oe -l ncpus=2
cd $PBS_O_WORKDIR
g98root=/usr/programs
. $g98root/g98/bsd/g98.profile
npri -w g98 < myjob.com > myjob.out 2>&1

This script (for csh and tcsh users) shows the use of the short form of the Gaussian submission command. The short form of the command can also be used in a script formatted for ksh. This form assumes that your input file is called myjob.com and will create a Gaussian output file called myjob.log:

#!/bin/tcsh -f
#PBS -c c -j oe -l ncpus=2
cd $PBS_O_WORKDIR
setg98
npri -w g98 myjob.com

Jaguar: This script runs a jaguar job from the jaguar input file myjob.in, which is in the directory from which the qsub command was issued. The -w option on the jaguar run command causes jaguar to execute the job in the foreground rather than in the background. Schrodinger (the maker of Jaguar) claims that the -w option will have the same effect on all jaguar commands (jaguar run, jaguar batch, etc.). Again, the job will be checkpointed every 2 hours (-c c) so it can be restarted if the system goes down unexpectedly, and the standard output and error of the script will be returned in the jobname.o## file (-j oe).

#!/bin/tcsh -f
#PBS -c c -j oe
cd $PBS_O_WORKDIR
npri -w jaguar run -w myjob
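
All of the sample scripts above start their program with npri -w, as the main and fourhour queues require. A quick check before submitting might look like the sketch below. The function uses_npri and the file name gamess_job are made up for this example; the check only looks for "npri -w" on at least one non-comment line, so it will not catch a script that runs a second program without it.

```shell
#!/bin/sh
# Hypothetical pre-submission check (not a PBS command): succeed only if
# at least one non-comment line of the job script contains "npri -w".
uses_npri() {
    grep -v '^[[:space:]]*#' "$1" | grep -q 'npri[[:space:]][[:space:]]*-w'
}

# Example input: a copy of the Gamess sample script above.
cat > gamess_job <<'EOF'
#!/bin/tcsh -f
#PBS -c c -j oe
cd $PBS_O_WORKDIR
npri -w seqgms myjob
EOF

if uses_npri gamess_job; then
    echo "gamess_job: ok for the main and fourhour queues"
else
    echo "gamess_job: no 'npri -w' found; only suitable for the debug queue"
fi
```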


All About Grommet...

Still to come...

  • Interesting Links for Grommet Users and others


This page is maintained by Patrick McMahon.
Last Updated: 12 July 2008
The URL for this page is: http://www.udel.edu/chem/grommet/help/queues.html