Intro
Job scheduling and management on the Supek computer cluster is handled by PBS Pro (Portable Batch System Professional). Its primary task is distributing batch jobs among the available compute resources.
This document describes the use of PBS Pro version 2022.1.1.
Job running
User applications (hereinafter: jobs) started through the PBS system must be described by a start-up shell script (sh, bash, zsh...). The PBS parameters are listed at the top of the script, before the regular commands. These parameters can also be specified when submitting the job.
Job run without additional parameters:
```bash
qsub my_job.pbs
```
Job run with parameters:
```bash
qsub -q cpu -l ncpus=4:mem=10GB moj_posao.pbs
```
More information on qsub parameters is available via man qsub.
After submitting the job, the standard output and error of a job in the execution state can be viewed with the commands:
```bash
qcat jobID
qcat -e jobID
qtail jobID
qtail -e jobID
```
Job submission
There are several ways jobs can be submitted:
- by interactive submission
- using a script
- in an interactive session
- as a job array
In the case of interactive submission, calling the qsub command directly without an input script reads the job script from standard input:
```bash
# run qsub
[korisnik@x3000c0s25b0n0:~] $ qsub
Job script will be read from standard input. Submit with CTRL+D.
echo "Hello world"
14571.x3000c0s25b0n0.hsn.hpc.srce.hr

# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140716
-rw------- 1 korisnik hpc  0 Jun  1 07:44 STDIN.e14571
-rw------- 1 korisnik hpc 12 Jun  1 07:44 STDIN.o14571

# print output file content
[korisnik@x3000c0s25b0n0:~] $ cat STDIN.o14571
Hello world
```
In the case of script submission, the commands to be executed are specified in the input file submitted to qsub:
```bash
# print file hello.sh
[korisnik@x3000c0s25b0n0:~] $ cat hello.sh
#!/bin/bash
#PBS -N hello
echo "Hello world"

# submit job script
[korisnik@x3000c0s25b0n0:~] $ qsub hello.sh
14572.x3000c0s25b0n0.hsn.hpc.srce.hr

# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140721
-rw------- 1 korisnik hpc  0 Jun  1 07:44 STDIN.e14571
-rw------- 1 korisnik hpc 12 Jun  1 07:44 STDIN.o14571
-rw------- 1 korisnik hpc  0 Jun  1 08:02 hello.e14572
-rw------- 1 korisnik hpc 12 Jun  1 08:02 hello.o14572
-rw-r--r-- 1 korisnik hpc 46 Jun  1 07:55 hello.sh

# print output file content
[korisnik@x3000c0s25b0n0:~] $ cat hello.o14572
Hello world
```
In the case of an interactive session, using the qsub -I option without an input script opens a shell on the allocated work node, within which commands can be run:
```bash
# hostname on access node
[korisnik@x3000c0s25b0n0:~] $ hostname
x3000c0s25b0n0

# interactive session
[korisnik@x3000c0s25b0n0:~] $ qsub -I -N hello-interactive
qsub: waiting for job 14574.x3000c0s25b0n0.hsn.hpc.srce.hr to start
qsub: job 14574.x3000c0s25b0n0.hsn.hpc.srce.hr ready

# hostname on working node
[korisnik@x8000c0s3b0n0:~] $ hostname
x8000c0s3b0n0
```
In the case of a job array, the qsub -J X-Y[:Z] option submits a set of identical jobs with indices from X to Y in steps of Z:
```bash
# submit job array
[korisnik@x3000c0s25b0n0:~] $ qsub -J 1-10:2 hello.sh
14575[].x3000c0s25b0n0.hsn.hpc.srce.hr

# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140744
-rw------- 1 korisnik hpc  0 Jun  1 07:44 STDIN.e14571
-rw------- 1 korisnik hpc 12 Jun  1 07:44 STDIN.o14571
-rw------- 1 korisnik hpc  0 Jun  1 08:02 hello.e14572
-rw------- 1 korisnik hpc  0 Jun  1 08:21 hello.e14575.1
-rw------- 1 korisnik hpc  0 Jun  1 08:21 hello.e14575.3
-rw------- 1 korisnik hpc  0 Jun  1 08:21 hello.e14575.5
-rw------- 1 korisnik hpc  0 Jun  1 08:21 hello.e14575.7
-rw------- 1 korisnik hpc  0 Jun  1 08:21 hello.e14575.9
-rw------- 1 korisnik hpc 12 Jun  1 08:02 hello.o14572
-rw------- 1 korisnik hpc 12 Jun  1 08:21 hello.o14575.1
-rw------- 1 korisnik hpc 12 Jun  1 08:21 hello.o14575.3
-rw------- 1 korisnik hpc 12 Jun  1 08:21 hello.o14575.5
-rw------- 1 korisnik hpc 12 Jun  1 08:21 hello.o14575.7
-rw------- 1 korisnik hpc 12 Jun  1 08:21 hello.o14575.9
-rw-r--r-- 1 korisnik hpc 46 Jun  1 07:55 hello.sh
```
Tip: This method is preferred over multiple individual submissions (e.g. with a for loop) because:
- it reduces the load on the job queue
- each subjob competes for resources simultaneously with everything else in the queue, instead of one after the other
- management is easier: all subjobs can be modified at once via the main identifier (e.g. 14575[]) or individually (e.g. 14575[3])

The environment variables defined by PBS during execution of an array job are:
- PBS_ARRAY_INDEX - index of the subjob within the job array (e.g. 1 to 9 in the example above)
- PBS_ARRAY_ID - identifier of the main array job
- PBS_JOBID - identifier of the subjob within the job array
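The PBS_ARRAY_INDEX variable is typically used to point each subjob at its own input. The sketch below assumes a hypothetical input_<N>.dat naming scheme; the fallback value of the index exists only so the script can be tested outside PBS:

```bash
#!/bin/bash
#PBS -q cpu
#PBS -J 1-10:2

# PBS sets PBS_ARRAY_INDEX for each subjob; the fallback is only
# for running this script outside PBS
IDX=${PBS_ARRAY_INDEX:-1}

# input_<N>.dat is a hypothetical per-subjob input naming scheme
INPUT="input_${IDX}.dat"
echo "Subjob ${IDX} would process ${INPUT}"
```

Each of the five subjobs (indices 1, 3, 5, 7, 9) thus works on a different file without any change to the script.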
Job Description
Jobs are described using PBS directives, while the job description file itself is a standard shell script. The header of each script lists the PBS parameters that describe the job in detail, followed by the commands that run the desired application.
Example of a startup script (my_job.pbs):
```bash
#!/bin/bash
#PBS -P test_example
#PBS -q cpu
#PBS -e /home/my_directory
#PBS -l select=2:ncpus=10

module load gcc/12.1.0
gcc --version
```
General structure of a startup script (my_job.pbs):
```bash
#!/bin/bash
#PBS -<parameter1> <value>
#PBS -<parameter2> <value>

<command>
```
Basic PBS parameters
| Option | Argument | Meaning |
|---|---|---|
| -N | name | Name of the job |
| -q | destination | Job queue or node to submit to |
| -l | list_of_resources | Resources required for the job |
| -M | list_of_users | List of users to receive e-mail |
| -m | email_options | Types of mail notifications |
| -o | path/to/directory | Directory for the standard output file |
| -e | path/to/directory | Directory for the standard error file |
| -j | oe | Join the output and error files into one |
| -W group_list= | project_code | Project code for the job |
Options for sending notifications by mail with the -m option:

| Option | Meaning |
|---|---|
| a | Mail is sent when the batch system aborts the job |
| b | Mail is sent when the job starts executing |
| e | Mail is sent when the job finishes |
| j | Mail is sent for subjobs; must be combined with one or more of the sub-options a, b or e |
Example (mail notifications):
```bash
#!/bin/bash
#PBS -q cpu
#PBS -l select=1:ncpus=2
#PBS -M <name>@srce.hr,<name2>@srce.hr
#PBS -m be

echo $PBS_JOBNAME > out
echo $PBS_O_HOST
```
Two e-mails were received:
```
PBS Job Id: 2686.x3000c0s25b0n0.hsn.hpc.srce.hr
Job Name: pbs.pbs
Begun execution
```
```
PBS Job Id: 2686.x3000c0s25b0n0.hsn.hpc.srce.hr
Job Name: pbs.pbs
Execution terminated
Exit_status=0
resources_used.cpupercent=0
resources_used.cput=00:00:00
resources_used.mem=0kb
resources_used.ncpus=2
resources_used.vmem=0kb
resources_used.walltime=00:00:01
```
Options for resources with the -l option:

| Option | Meaning |
|---|---|
| -l select=3:ncpus=2 | 3 chunks of a node with 2 cores each (6 cores in total) |
| -l select=1:ncpus=10:mem=20GB | 1 chunk of a node with 10 cores and 20 GB of RAM |
| -l ngpus=2 | 2 GPUs |
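Combining the forms above, a header requesting one chunk with 10 cores, 20 GB of RAM and 2 GPUs might look as follows. This is only a sketch: the gpu queue name and the exact combination of directives are assumptions, so check the queues and resources actually available on the cluster:

```bash
#!/bin/bash
#PBS -q gpu                          # queue name is an assumption
#PBS -l select=1:ncpus=10:mem=20GB
#PBS -l ngpus=2

# print the GPUs visible to the job
nvidia-smi
```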
PBS environment variables
| Variable | Description |
|---|---|
| NCPUS | Number of cores requested; matches the value of the ncpus option from the PBS script header. |
| OMP_NUM_THREADS | OpenMP variable exported by PBS to the environment, equal to the value of the ncpus option from the PBS script header. |
| PBS_JOBID | Job identifier assigned by PBS when the job is submitted; created after executing the qsub command. |
| PBS_JOBNAME | Name of the job, as specified with the -N option (by default, the name of the submitted script). |
| PBS_NODEFILE | File listing the work nodes, i.e. processor cores, on which the job executes. |
| PBS_O_WORKDIR | The directory from which the job was submitted, i.e. in which the qsub command was invoked. |
| TMPDIR | Path to the scratch directory. |
Tip: Setting up the working directory
While PBS Pro places the output and error files in the directory from which the job was submitted, the input and output files of the program itself are by default loaded/saved in the $HOME directory. PBS Pro has no option to run the job in the current directory, so the directory must be changed manually by adding the following line after the header:
cd $PBS_O_WORKDIR
This redirects job execution to the directory from which the script was submitted.
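A minimal sketch of this pattern follows; the $PWD fallback only matters when testing the script outside PBS, where PBS_O_WORKDIR is not set:

```bash
#!/bin/bash
#PBS -q cpu
#PBS -l ncpus=1

# PBS sets PBS_O_WORKDIR to the directory in which qsub was invoked;
# outside PBS the variable is unset, so fall back to the current directory
cd "${PBS_O_WORKDIR:-$PWD}"
echo "Job running in: $(pwd)"
```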
Parallel jobs
OpenMP parallelization
If your application is parallelized exclusively with OpenMP threads and cannot scale beyond one work node (i.e. it uses shared memory), you can run the job as shown in the xTB application example below.
Tip: OpenMP applications require the OMP_NUM_THREADS variable to be defined. The PBS system takes care of this for you and assigns it the value of the ncpus variable defined in the header of the PBS script.
```bash
#!/bin/bash
#PBS -q cpu
#PBS -l ncpus=8

cd ${PBS_O_WORKDIR}
xtb C2H4BrCl.xyz --chrg 0 --uhf 0 --opt vtight
```
MPI parallelization
If your application supports hybrid parallelization, i.e. dividing its MPI processes into OpenMP threads, you can run the job as shown in the GROMACS application example below:
Tip: OpenMP applications require the OMP_NUM_THREADS variable to be defined. The PBS system automatically assigns it the value of the ncpus variable defined in the header of the PBS script. The value of the select variable from the header corresponds to the number of MPI processes; however, it has no corresponding variable that the PBS system exports to the environment. To avoid hardcoding it, the example below defines the variable MPI_NUM_PROCESSES, which corresponds to the value of select.
```bash
#!/bin/bash
#PBS -q cpu
#PBS -l select=8:ncpus=4
#PBS -l place=scatter

MPI_NUM_PROCESSES=$(cat ${PBS_NODEFILE} | wc -l)

cd ${PBS_O_WORKDIR}
mpiexec -n ${MPI_NUM_PROCESSES} --ppn 1 -d ${OMP_NUM_THREADS} --cpu-bind depth gmx mdrun -v -deffnm md
```
cray-pals
Running applications that use MPI parallelization (or hybrid MPI+OpenMP) requires loading the cray-pals module before calling the mpiexec command. This ensures proper integration of the application with the PBS Pro job submission system and with Cray's version of mpiexec, based on the MPICH implementation of MPI.
An example of calling this module and executing a parallel application on two processors:
```bash
#!/bin/bash
#PBS -l ncpus=2

module load cray-pals
mpiexec -np 2 moja_aplikacija_MPI
```
The environment variables that the mpiexec command sets on each MPI rank are:

| Environment variable | Description |
|---|---|
| PALS_RANKID | Global rank of the MPI process |
| PALS_NODEID | Index of the local node (when the job runs on several nodes) |
| PALS_SPOOL_DIR | Temporary directory |
| PALS_LOCAL_RANKID | Local rank of the MPI process on its node (when the job runs on several nodes) |
| PALS_APID | Unique identifier of the executed application |
| PALS_DEPTH | Number of processor cores per rank |
Note: Scientific applications on Supek and cray-pals
Scientific applications available on Supek via the modulefiles tool already load this module, so it is not necessary to load it again.
Monitoring and management of job execution
Job monitoring
The PBS command qstat is used to display the status of jobs. The basic command syntax is:
```bash
qstat <options> <job_ID>
```
By executing the qstat command without additional options, a printout of all jobs of all users is obtained:
```
Job id            Name             User              Time Use S Queue
----------------  ---------------- ---------------- -------- - -----
2663.x3000c0s25b* mpi+omp_s        kmrkalj          00:36:09 R cpu
```
Some of the more commonly used options are:

| Option | Meaning |
|---|---|
| -E | Groups jobs by server and displays each group sorted by ascending ID; this option also improves qstat performance. |
| -t | Displays status information for jobs, job arrays, and subjobs. |
| -p | Replaces the Time Use column with the percentage of the job completed. For an array job, this is the percentage of subjobs completed; for a normal job, the percentage of allocated CPU time used. |
| -x | Displays status information for finished and moved jobs in addition to queued and running jobs. |
| -Q | Displays queue status in the standard format. |
| -q | Displays queue status in an alternate format. |
| -f | Displays job status in the long format. |
Examples of use:
Detailed job description:
```bash
qstat -fxw 2648
```
The tracejob command extracts and displays log messages for a PBS job in chronological order.
```bash
tracejob <job_ID>
```
Example:
```
$ tracejob 2670

Job: 2670.x3000c0s25b0n0.hsn.hpc.srce.hr

03/30/2023 11:23:24  L  Considering job to run
03/30/2023 11:23:24  S  Job Queued at request of mhrzenja@x3000c0s25b0n0.hsn.hpc.srce.hr, owner =
                        mhrzenja@x3000c0s25b0n0.hsn.hpc.srce.hr, job name = mapping, queue = cpu
03/30/2023 11:23:24  S  Job Run at request of Scheduler@x3000c0s25b0n0.hsn.hpc.srce.hr on exec_vnode
                        (x8000c0s0b0n0:ncpus=40:mem=104857600kb)
03/30/2023 11:23:24  L  Job run
03/30/2023 11:23:24  S  enqueuing into cpu, state Q hop 1
03/30/2023 11:23:56  S  Holds u set at request of mhrzenja@x3000c0s25b0n0.hsn.hpc.srce.hr
03/30/2023 11:24:22  S  Holds u released at request of mhrzenja@x3000c0s25b0n0.hsn.hpc.srce.hr
```
Job management
A job can also be managed after it has been submitted:
- while the job is in the queue, its execution can be temporarily held
- a held job can be released and returned to the queue
- the job can be stopped completely and removed from the queue
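The corresponding standard PBS Pro commands, in the same order, are qhold, qrls and qdel:

```bash
# temporarily hold a queued job
qhold <job_ID>

# release the hold and return the job to the queue
qrls <job_ID>

# stop the job and remove it from the queue
qdel <job_ID>
```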
A force stop should be used for stuck jobs:
```bash
qdel -W force -x <job_ID>
```
Postponement of execution
PBS allows a job's execution to depend on other jobs, which is useful in cases such as:
- the execution of a job depends on the output or state of previously executed ones
- the application requires its various components to run sequentially
- data written by one job could interfere with the execution of another
The directive that enables this functionality at job submission is:
```bash
qsub -W depend=<type>:<job_ID>[:<job_ID>] ...
```
Where <type> can be:
- after* - the current job starts relative to the listed jobs:
  - after - the current job executes after the listed jobs have started
  - afterok - the current job executes after the listed jobs have completed successfully
  - afternotok - the current job executes after the listed jobs have completed with an error
  - afterany - the current job executes after the listed jobs have completed
- before* - the listed jobs start relative to the current one:
  - before - the listed jobs start after the current one has started
  - beforeok - the listed jobs start after the current one has completed successfully
  - beforenotok - the listed jobs start after the current one has completed with an error
  - beforeany - the listed jobs start after the current one has ended
- on:<number> - the job depends on the given number of subsequently submitted before* type jobs
Note: A job with the -W depend=... directive will not be submitted if the specified job IDs do not exist (or if they are not queued).
Examples:
If we want job1 to start after the successful completion of job0:
```bash
[korisnik@x3000c0s25b0n0] $ qsub job0
1000.x3000c0s25b0n0.hsn.hpc.srce.hr
[korisnik@x3000c0s25b0n0] $ qsub -W depend=afterok:1000 job1
1001.x3000c0s25b0n0.hsn.hpc.srce.hr
[korisnik@x3000c0s25b0n0] $ qstat 1000 1001
Job id                Name             User              Time Use S Queue
--------------------- ---------------- ---------------- -------- - -----
1000.x3000c0s25b0n0   job0             korisnik         00:00:00 R cpu
1001.x3000c0s25b0n0   job1             korisnik                0 H cpu
```
If we want job0 to start only after the successful completion of job1:
```bash
[korisnik@x3000c0s25b0n0] $ qsub -W depend=on:1 job0
1002.x3000c0s25b0n0.hsn.hpc.srce.hr
[korisnik@x3000c0s25b0n0] $ qsub -W depend=beforeok:1002 job1
1003.x3000c0s25b0n0.hsn.hpc.srce.hr
[korisnik@x3000c0s25b0n0] $ qstat 1002 1003
Job id                Name             User              Time Use S Queue
--------------------- ---------------- ---------------- -------- - -----
1002.x3000c0s25b0n0   job0             korisnik                0 H cpu
1003.x3000c0s25b0n0   job1             korisnik         00:00:00 R cpu
```