Job Management and Workflows

Storage

All MCSR systems share home directories. Home directories are stored on a DDN network-attached storage appliance, which is attached to MCSR systems via InfiniBand.

The storage appliance is in a high-performance, high-redundancy RAID configuration. All user data is also backed up.

Quotas

Home directories are subject to a 600GB quota. Temporary excursions to as much as 800GB are tolerated for up to seven days. Quotas can be increased to 1TB upon request. Storage needs above 1TB will require funding.

Temporary Files

All systems also share a /scratch area for large temporary files. Each user has their own directory inside the /scratch area. This area is stored on the same device as the home directories. Users should clean up after themselves in their PBS scripts. This area is NOT backed up, and is subject to being purged as needed. It is subject to a separate 600GB quota.

All nodes have /tmp directories backed by on-node disks for temporary files, but they vary greatly in size. When creating files inside /tmp, please create a directory structure that looks like /tmp/$USER.$PBS_JOBID. This can be done using the following command:
mkdir -p /tmp/$USER.$PBS_JOBID

This will allow our scripts to easily clean up after failed jobs.
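
For example, a PBS script might create such a directory, point its program at it, and remove it when finished (the program name and its flag below are only placeholders):

mkdir -p /tmp/$USER.$PBS_JOBID
./my_program --tmpdir /tmp/$USER.$PBS_JOBID
rm -rf /tmp/$USER.$PBS_JOBID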

/ptmp Directories

In the past, all MCSR users had a /ptmp directory in addition to their home directory for large files. During a 2017 upgrade these directories were combined. Existing users had their ptmp directories copied into their home directory, and a symbolic link was set up in /ptmp so that existing scripts would continue to work. Accounts created since the upgrade do not have a ptmp directory.

Permissions

Mostly for historical reasons, our default settings allow users to see each other’s files. We have considered changing this in the past and have decided against it. If this is an issue for you, you should change the permissions on your existing files and change your default umask so that new files will be created with stricter permissions. Contact MCSR staff for assistance.
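
For example, the following commands would remove group and other access from your existing files and set a stricter default for new ones (this assumes the bash shell; adjust the startup file for other shells):

chmod -R go-rwx $HOME
echo "umask 077" >> ~/.bashrc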

Running Jobs

All of our supercomputers use a system called PBS to make sure that everyone’s programs have the resources they need (mostly CPU cores and memory) and that no one takes more than their fair share.

Anything that runs for more than a few seconds should be run inside a PBS job. This can be done by using a PBS script or an interactive PBS session.

Example PBS jobs for many of our programs can be found in /usr/local/apps/example_jobs. We have a tutorial on running an example job.

PBS scripts are submitted by using the qsub command. The status of PBS jobs can then be checked by using the qstat command.
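
For example, assuming a script named my_job.pbs (a placeholder name), a typical sequence is:

qsub my_job.pbs
qstat -u $USER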

When a job is submitted, PBS checks to see if the resources required by the job are available. If so, the job is allowed to run. If not, the job is queued until the resources are available. If your job requires a lot of resources, it’s possible that jobs submitted after yours will run first because they require fewer resources.

It’s important to give some thought to how you break up your jobs and how many resources your job really needs. In general, several small jobs will usually start running sooner and finish faster than a single large job. A job might take less time with 32 CPU cores, but it could start running sooner and possibly finish sooner if you use only 16. It’s often worthwhile to see what resources are available and how many other jobs are queued before submitting your job.
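
For example, the following commands give a rough picture of the current load; the exact output varies by system:

qstat
pbsnodes -a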

Time

All jobs should specify walltime and CPU time (cput). Walltime is simply how long a job is expected to run. CPU time is the total time all of the CPUs spend on the job. For a single-CPU job, the two should be the same, but for a multi-CPU job, cput will be roughly n x walltime, where n is the number of processors. Jobs that exceed the requested walltime or cput will be killed by PBS.

#PBS -l walltime=8:00:00
This tells PBS how long you expect the job to run from the time it starts to the time it is finished. This example specifies 8 hours. If your job runs longer than specified, it will be killed by PBS.

#PBS -l cput=16:00:00

This tells PBS how much “CPU time” is needed by the job. For a single CPU job, it would be the same as walltime. For a multi-CPU job, it would be roughly n x walltime. If a job accumulates more cput than specified, it will be killed by PBS.
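
For example, a job that uses four CPUs and is expected to run for eight hours of walltime would request roughly 4 x 8 = 32 hours of CPU time:

#PBS -l ncpus=4
#PBS -l walltime=8:00:00
#PBS -l cput=32:00:00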

Email Notifications

To have PBS email you when your job begins, ends, or aborts, add the following to your PBS script:

#PBS -m bea

By default, these emails are sent to your account’s local mailbox on the system. Unless you have that mail forwarded, you probably want to specify an alternative email address like so:

#PBS -M user@host.edu

Multiple email addresses can be specified by separating them with commas.
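
For example (the addresses here are placeholders):

#PBS -M first.user@host.edu,second.user@host.edu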

Job Output Files

By default, all PBS jobs result in two output files being created by the PBS system. One captures the standard output of the job, the other captures standard error. They are named jobname.ojobid and jobname.ejobid, by default. These two files can be joined into a single file by adding:

#PBS -j oe

to your PBS script. In general, this is a good idea. The -o and -e options can also be used to rename the files.
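
For example, the following combines the two streams and writes them to a filename of your choosing (the name below is only an illustration):

#PBS -j oe
#PBS -o my_job_output.txt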

Also by default, these files are written on the compute node and are not visible on the head node until after the job finishes. However, by adding:
#PBS -k oed
the files will be written directly to their final location and can be viewed while the job is still running. Note that this option is only available on Maple.

Job Environment

It’s important to note that jobs receive a default environment. Modules will need to be loaded and environmental variables will need to be set inside your PBS script. Also, in the case of the clusters, note that jobs are run on the compute nodes, and the software available there may be subtly different than that of the head node.
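
For example, the command section of a script might start by loading a module and setting a variable the program expects (the variable and path here are hypothetical):

module load python
export DATA_DIR=$HOME/project/data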

See the Maple page for information on allocating GPUs.

Interactive Jobs

When running a one-off job or developing a PBS script, it is convenient to be able to run commands directly inside the PBS environment.

Interactive PBS sessions can be started using qsub with the -I option:
qsub -I

By default, interactive sessions are allocated a single CPU and an amount of RAM that varies by system (900MB on sequoia and 1GB on catalpa).

If you need more resources than the default, you can request them. For instance, to request four CPUs and 5GB of RAM, you would run:
qsub -I -lncpus=4 -lmem=5gb

Once you have started an interactive job, PBS will place you on a node with the required resources. You may then run your program as you normally would.

If your program uses more resources than it requested, PBS will kill your job to protect other users’ well-behaved jobs.

When you’re done with your interactive job, exit your shell as you normally would to end the job and return to the head node. With the bash shell, exit or Control-D will work.

Running an example job

Example PBS jobs for many of our programs can be found in /usr/local/apps/example_jobs. This page describes how to run one of those jobs in particular, but the same process applies to any of them.

To get a list of all example jobs available on a system, run:
ls /usr/local/apps/example_jobs

If you find that an example job you need is not available, please contact the MCSR staff. We are always happy to help users get their projects going.

Run the command below to copy the example Python job to your home directory. Substitute the name of the system you are on for “maple.”
cp -r /usr/local/apps/example_jobs/python_maple_example/ ~

Change into the directory that you just copied over:
cd python_maple_example/

Have a quick look at the PBS script that you’re about to use:
cat prime.pbs
For an explanation of the PBS script, see Example PBS Script.

Then submit the PBS script:
qsub prime.pbs

When you submit the job, PBS checks to see if the resources required for the job are available. If they are, your job is executed on a compute node.

You should receive a job ID number from the qsub command that will allow you to check the status of your job:
qstat your_number_goes_here

bnp@maple:~/python_maple_example> qstat 284861
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
284861.maple      prime            bnp               00:00:11 R workq

If you need more information about your job, use the -f option:
qstat -f your_number_goes_here

When qstat returns “Job has finished, use -x or -H to obtain historical job information,” you know that your job has finished. Run:
ls
to look for output files. Output filenames generally look like job_name.ojob_id. For instance, the name of this job is prime (the name is defined in the PBS script). When I ran this example, my job ID was 3220, so my output filename was prime.o3220. The output file is a simple text file, so cat can be used to view it:
cat prime.o3220

Don’t forget to substitute your job ID.

Some jobs also produce an error file, with an e instead of an o. This will contain any errors produced by the job. Some jobs (like the Python job above) combine standard output and error output into one file.

Example PBS Script

This page explains the PBS script used on the Running an example job page. Lines of the script are in a fixed width font and bold. Explanation is in a variable width font.

The PBS script is essentially split into two parts. The first half consists of “administrative” lines, each beginning with #PBS. The second half is a bash script (though it could be tcsh, Python, etc.) containing the commands that are actually run.
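
For reference, here is the complete script, assembled from the lines explained below:

#!/bin/bash
#PBS -N prime
#PBS -j oe
#PBS -l ncpus=1
#PBS -l mem=1gb
#PBS -l walltime=1:00:00
#PBS -l cput=1:00:00
#PBS -m abe
#PBS -M user@email.com

module load python

cd ${PBS_O_WORKDIR}

time ./prime.py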

#!/bin/bash
This line is not necessary for our purposes, but could be useful if running the script outside PBS.

#PBS -N prime
This line defines the name of the PBS job.

#PBS -j oe
This line tells PBS to combine the output and error output streams into a single file. This is more useful and convenient than the default.

#PBS -l ncpus=1
This tells PBS that a single CPU core is needed. This is the default. Specifying more than one CPU doesn’t automatically make your program use more than one CPU. There is usually some application-specific way of telling your program how many CPUs to use.

#PBS -l mem=1gb
This tells PBS how much memory is required for the job. You should slightly overestimate, as jobs using more memory than specified will be killed automatically by PBS.

#PBS -l walltime=1:00:00
This tells PBS how long you expect the job to run. This example specifies one hour. If your job runs longer than specified, it will be killed by PBS.

#PBS -l cput=1:00:00
This tells PBS how much “CPU time” is needed by the job. For a single CPU job, it would be the same as walltime. For a multi-CPU job, it would be roughly n x walltime. If a job accumulates more cput than specified, it will be killed by PBS.

#PBS -m abe
This tells PBS to email you when the job aborts, begins, or ends.

#PBS -M user@email.com
This line specifies which email the emails above should be sent to. By default they are sent to the user’s system mailbox, which is not convenient for most users.

module load python
This begins the list of commands needed to actually run the job. This line loads the system’s Python module, which is an installation of Anaconda Python. Many other modules are available. Most commonly, modules simply modify your PATH or other environmental variables.

cd ${PBS_O_WORKDIR}
This line is needed for almost every PBS script. When a job begins its execution, its current directory is the user’s home directory. This line changes the current directory to the directory from which the job was submitted. It utilizes an environmental variable that is set by PBS. There are several other environmental variables set by PBS, all of which can be seen in the job’s qstat -f output.
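
For example, a few of these variables can be echoed from within a script, which can be handy for debugging:

echo "Job ID: $PBS_JOBID"
echo "Job name: $PBS_JOBNAME"
echo "Submitted from: $PBS_O_WORKDIR"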

time ./prime.py
This line runs the Python script in the current directory. It also uses the bash shell’s time utility to report how long the Python script takes to run.

That’s it. If you need assistance modifying an example PBS script or writing one from scratch, please contact the MCSR staff.

Using PBS Variables

Many PBS jobs are very similar to one another, differing from previous jobs only in one or two parameters, such as the input filename. PBS allows scripts to accept input parameters so that they don’t have to be rewritten for every job.

An example job that demonstrates this is available on Maple at /usr/local/apps/example_jobs/pbs_variable_example

The main file is variable_example.pbs:
#PBS -N variable
#PBS -l mem=2gb
#PBS -l ncpus=1

cd $PBS_O_WORKDIR

echo "Input file: $INPUT_FILE"

# Run program here
ls -l $INPUT_FILE

The variable in this example is called INPUT_FILE. First, the script changes to the directory from which the job was submitted. Then it echoes the value of the variable passed into the script. Finally, it runs a program on the input file; in this case, just ls.

This PBS script can be submitted by running:
qsub -v INPUT_FILE=file.csv variable_example.pbs
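
More than one variable can be passed by separating the assignments with commas. For example (OUTPUT_FILE is hypothetical here and would need a matching reference in the script):

qsub -v INPUT_FILE=file.csv,OUTPUT_FILE=results.csv variable_example.pbs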

Starting multiple jobs at once

With some shell scripting, multiple jobs can be started at once. For instance, assume you had multiple CSV files that all needed to be processed by the same program. You could submit a separate job for each one by running:

for file in *.csv
do
    qsub -vINPUT_FILE=$file variable_example.pbs
done

This code is included as launch_jobs.sh in the example job on Maple.

Running Graphical Applications

We do not recommend running graphical applications. They have to be run interactively, and a human must be around to log out when the job is finished. However, we realize some users need to run graphical applications.

UNIX systems use the X Window System for graphical applications. In order to forward graphics to your local screen, you’ll need an X server running on your local machine. Linux distributions should already have one. macOS users should download and install XQuartz.

Please follow the guide below to run your graphical application:
ssh -X hpcwoods (Or otherwise connect to hpcwoods with X Forwarding enabled)
ssh -X maple
qsub -IX -lncpus=1 -lmem=5gb

Note that the resources requested in the last command should be tailored to your application and the data you’re running through your application. The more resources you request, the more careful you should be about stopping the interactive job when not actively using it.

You can easily test X forwarding with the xeyes or xclock command.