




Job Submission

This page contains information on submitting jobs to the EaStCHEM RCF clusters and compute grid.

The queuing system on the clusters is the Sun Grid Engine (SGE), which aims to give all users fair access to the compute resources. Although the SGE syntax can take a little getting used to, we have put together a number of tools to help you along.

Queues

At the centre of the job submission system are the queues: you will almost always be submitting jobs to a particular queue. The queue names are designed to be mnemonics for the type of job you want to submit. For example, on hare.epcc.ed.ac.uk there is a queue called parallel-short.q, which you would use to run a short (less than 3 hours) parallel calculation. A list of the queues and their purposes is given in the table below. The descriptions can also be accessed on the clusters using the qlist command, e.g.

> qlist

hare.epcc.ed.ac.uk

fat-medium.q        8-processor SMP, 1 week limit
fat-long.q          8-processor SMP, 4 week limit
parallel-short.q    Up to 24 processors, 3 hour limit
parallel-medium.q   Up to 24 processors, 1 week limit
parallel-long.q     Up to 24 processors, 4 week limit
serial-medium.q     Serial jobs, 1 week limit
serial-long.q       Serial jobs, 4 week limit
test-run.q          Serial jobs, 3 hour limit

burke.st-andrews.ac.uk

fat.q               8-processor SMP, no time limit
parallel-short.q    Up to 24 processors, 3 hour limit
parallel-medium.q   Up to 24 processors, 1 week limit
parallel-long.q     Up to 24 processors, no time limit
serial-medium.q     Serial jobs, 1 week limit
serial-long.q       Serial jobs, no time limit

Submitting Jobs

How you submit jobs to the queues depends on the software you want to use.

If you are using one of the standard software packages (Gaussian 03, GAMESS-UK, MOLPRO, CASTEP, CPMD, CRYSTAL, DL_POLY, Amber) then you should be able to use the job submission tools designed specifically for that piece of software. For example, to submit Gaussian 03 jobs you would use the g03sub command. A guide to these tools can be found in the EaStCHEM Software Submission Package (SSP) User Guide.

If you are using your own software, you will probably have to write your own job submission script. However, there are tools that will generate a template for you, so you do not have to start from scratch (see the EaStCHEM SSP User Guide). A brief explanation of submission script syntax can be found below. If you have any specific questions, please post them to the forum or contact your RCO.

Using the EaStCHEM RCF Compute Grid

If you have access, you can also submit jobs to other clusters within the EaStCHEM RCF Compute Grid. Information on setting up your account to use the grid and on managing grid jobs can be found in the guide How To set up your account to access the EaStCHEM Grid. As with local job submission, the procedure for submitting jobs depends on the software you are using.

For one of the standard software packages, the same submission tools that are used for local job submission can be used. The guide to these tools can be found in the EaStCHEM SSP User Guide.

For jobs using your own software, you will have to write your own scripts. Tools are available to generate templates. For more information on writing scripts, please see the guide How to write EaStCHEM Grid job submission scripts; for information on generating templates, see the EaStCHEM SSP User Guide.

To submit jobs to remote resources you will need to know what queues are available on the system you are submitting the job to. You can use the qlist command in the following way:

> qlist host.domain.com 

where host.domain.com is the hostname of the machine whose queues you want to list. You can also get a list of the standard codes available on a remote system (and the locations of their executables) using the clist command:

> clist host.domain.com

Submission Script Syntax

This section gives an example of how to generate a submission script template and then goes through the meaning of the script line-by-line. Imagine we used the following command:

me@local> cpmdsub -norun -np 8 -q parallel-short.q h2-wave.inp

(For more information on the cpmdsub command see the EaStCHEM SSP User Guide.) The job submission script produced (in h2-wave.inp.bash) would look like:

#!/bin/bash

#$ -cwd -V
#$ -q parallel-short.q
#$ -N h2-wave.inp_cpm
#$ -A rcf
#$ -pe mpich 2
#$ -R y
#$ -l h_rt=03:00:00

cat $HOME/.mpich/mpich_hosts.$JOB_ID | cut -f 1 -d . | sort | fmt -w 30
sed s/$/:4/ $HOME/.mpich/mpich_hosts.$JOB_ID > $HOME/.mpich/ndfile.$JOB_ID

mpirun -np 8 -machinefile $HOME/.mpich/ndfile.$JOB_ID /usr/local/CPMD-3.11.1/BIN/cpmd_mpi.x h2-wave.inp

We will now go through this script to describe what the various lines mean.

The first line

#!/bin/bash

specifies the shell type to use for the script. In our case, this is always bash. This choice affects the format of commands that can be used in the script.

The next set of lines

#$ -cwd -V
#$ -q parallel-short.q
#$ -N h2-wave.inp_cpm
#$ -A rcf
#$ -pe mpich 2
#$ -R y
#$ -l h_rt=03:00:00

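all begin with #$ and specify the options to the batch submission system (SGE in this case):

-cwd               run the job from the directory it was submitted from
-V                 export the current environment variables to the job
-q                 the queue to submit to (here parallel-short.q)
-N                 the name of the job (here h2-wave.inp_cpm)
-A                 the account the job is charged to (here rcf)
-pe mpich 2        use the mpich parallel environment with 2 slots
-R y               request a resource reservation for the job
-l h_rt=03:00:00   set a hard run-time limit of 3 hours, matching the limit of parallel-short.q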

The next two lines

cat $HOME/.mpich/mpich_hosts.$JOB_ID | cut -f 1 -d . | sort | fmt -w 30
sed s/$/:4/ $HOME/.mpich/mpich_hosts.$JOB_ID > $HOME/.mpich/ndfile.$JOB_ID

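are required to set up the parallel environment for running the calculation. The first line prints the short names of the nodes allocated to the job into the job output. The second line creates a machine file from the same list of nodes, appending :4 to each host name (in an MPICH machine file this sets the number of processes for that node), so the parallel environment knows where to send the processes.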
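To see what these two lines do, the sketch below runs them on a made-up host file. The directory mpich_demo, the JOB_ID value and the host names are all hypothetical; on the cluster the host file is supplied under $HOME/.mpich by the parallel environment. The sed expression is quoted here, which is the safer form when typing it by hand.

```shell
#!/bin/bash
# Demo of the two set-up lines on a made-up host list.
# demo_dir, JOB_ID and the host names are hypothetical stand-ins.

demo_dir=mpich_demo
JOB_ID=12345
mkdir -p "$demo_dir"

# Stand-in for the host file that the parallel environment would provide.
printf '%s\n' node02.cluster.example node01.cluster.example \
    > "$demo_dir/mpich_hosts.$JOB_ID"

# First line: strip the domain, sort, and wrap the names to 30 columns.
cat "$demo_dir/mpich_hosts.$JOB_ID" | cut -f 1 -d . | sort | fmt -w 30

# Second line: append ":4" to every host name to build the machine file.
sed 's/$/:4/' "$demo_dir/mpich_hosts.$JOB_ID" > "$demo_dir/ndfile.$JOB_ID"

# Show the machine file that mpirun would be given.
cat "$demo_dir/ndfile.$JOB_ID"
```

With two nodes in the host file and :4 on each line, the machine file accounts for the 8 processes requested with -np 8.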

The final line

mpirun -np 8 -machinefile $HOME/.mpich/ndfile.$JOB_ID /usr/local/CPMD-3.11.1/BIN/cpmd_mpi.x h2-wave.inp

is the line that actually runs the program. As this is a distributed-memory parallel (DMP) job, it uses mpirun to run the CPMD program. Note that it requests 8 processors, uses the machine file containing the list of compute nodes that was created above, specifies the location of the executable, and references the input file given in the cpmdsub command.

You can find more about the SGE syntax in the SGE User Guide and about the syntax for submitting grid jobs in the EaStCHEM SSP User Guide.
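If you are writing a script for your own serial program, the same pattern applies with fewer options. The following is a minimal sketch only, not output from the SSP tools: the job name myjob, the program ./my_program and its input and output files are hypothetical placeholders, and the -A option is left out because the right account may differ between users. The snippet writes the script to myjob.bash so that it can be inspected before submission.

```shell
#!/bin/bash
# Sketch: write a minimal serial SGE submission script to myjob.bash.
# "myjob", "./my_program", "input.dat" and "output.log" are hypothetical.

cat > myjob.bash <<'EOF'
#!/bin/bash

#$ -cwd -V
#$ -q serial-medium.q
#$ -N myjob
#$ -l h_rt=168:00:00

./my_program input.dat > output.log
EOF

# Print the generated script so it can be checked before submission.
cat myjob.bash

# On the cluster the script would then be submitted with SGE's qsub:
#   qsub myjob.bash
```

The queue and the h_rt value are chosen as a pair: serial-medium.q has a 1 week limit, which corresponds to 168 hours.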

Table of SGE resource requests associated with each queue

Queue               Resources

hare.epcc.ed.ac.uk

test-run.q          -l h_rt=03:00:00
serial-medium.q     -l h_rt=168:00:00
serial-long.q       -l h_rt=672:00:00
parallel-short.q    -l h_rt=03:00:00
parallel-medium.q   -l h_rt=168:00:00
parallel-long.q     -l h_rt=672:00:00
fat-medium.q        -l h_rt=168:00:00 -l bigmem=true
fat-long.q          -l h_rt=672:00:00 -l bigmem=true

burke.st-andrews.ac.uk

serial-medium.q     (none)
serial-long.q       (none)
parallel-short.q    (none)
parallel-medium.q   (none)
parallel-long.q     (none)
fat.q               -l bigmem=true
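The h_rt values in the table are the queue time limits written as hours:minutes:seconds, so 1 week is 168:00:00 and 4 weeks is 672:00:00. A small sketch of that conversion is given below; days_to_h_rt is a hypothetical helper written for illustration and is not part of SGE or the SSP tools.

```shell
#!/bin/bash
# Convert a limit in whole days to the HH:MM:SS form used by -l h_rt.
# days_to_h_rt is a hypothetical helper, not an SGE or SSP command.
days_to_h_rt() {
    local days=$1
    printf '%02d:00:00\n' $(( days * 24 ))
}

days_to_h_rt 7     # 1 week limit (the *-medium.q queues on hare)
days_to_h_rt 28    # 4 week limit (the *-long.q queues on hare)
```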

ComputationalChemistryActivity/SupportPages/JobSubmission (last edited 2007-10-17 15:21:38 by AndrewTurner)