Intro
The Supek computer cluster uses PBS Pro (Portable Batch System Professional) to schedule and manage jobs. Its primary task is to distribute computing tasks, i.e. batch jobs, among the available computing resources.
This document describes the use of PBS Pro version 2022.1.1.
Job running
User applications (hereinafter jobs) that are run through PBS must be described by a shell startup script (sh, bash, zsh...). The PBS parameters are listed at the top of the startup script, above the regular commands. These parameters can also be specified on the command line when submitting a job.
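As an illustration of the layout described above, a minimal startup script might look like the sketch below. The job name, queue and resource values simply mirror the qsub examples in this document and are placeholders, not recommendations. PBS_O_WORKDIR and PBS_JOBID are set by PBS inside a job; the fallbacks are an assumption added here so that the script also runs outside PBS.

```shell
#!/bin/bash
#PBS -N my_job
#PBS -q cpu
#PBS -l ncpus=4:mem=10GB

# The #PBS lines above are read by qsub and are plain comments to the shell.
# PBS_O_WORKDIR is the directory from which qsub was called;
# fall back to the current directory when run outside PBS.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# Regular commands of the job start here
echo "Job ${PBS_JOBID:-none} running on $(hostname) in $workdir"
```

Parameters given on the qsub command line take precedence over the same parameters given as #PBS lines in the script.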
Basic job run:
qsub my_job.pbs
Job run with parameters:
qsub -q cpu -l ncpus=4:mem=10GB my_job.pbs
More information on qsub parameters:
qsub --help
After a job has been submitted, the standard output and standard error of a job that is currently running can be viewed with the commands:
qcat jobID
qcat -e jobID
qtail jobID
qtail -e jobID
Job submitting
There are several ways to submit jobs:
- by interactive submission
- using a script
- in an interactive session
- as a job array
In the case of interactive submission, calling the qsub command without arguments reads the job script from standard input in the terminal. The commands to be executed are typed in directly and submitted with CTRL+D:
# run qsub
[korisnik@x3000c0s25b0n0:~] $ qsub
Job script will be read from standard input. Submit with CTRL+D.
echo "Hello world"
14571.x3000c0s25b0n0.hsn.hpc.srce.hr

# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140716
-rw------- 1 korisnik hpc 0 Jun 1 07:44 STDIN.e14571
-rw------- 1 korisnik hpc 12 Jun 1 07:44 STDIN.o14571

# print output file content
[korisnik@x3000c0s25b0n0:~] $ cat STDIN.o14571
Hello world
In the case of script submission, we can specify the commands to be executed in the input file that we submit:
# print file hello.sh
[korisnik@x3000c0s25b0n0:~] $ cat hello.sh
#!/bin/bash
#PBS -N hello
echo "Hello world"

# submit job script
[korisnik@x3000c0s25b0n0:~] $ qsub hello.sh
14572.x3000c0s25b0n0.hsn.hpc.srce.hr

# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140721
-rw------- 1 korisnik hpc 0 Jun 1 07:44 STDIN.e14571
-rw------- 1 korisnik hpc 12 Jun 1 07:44 STDIN.o14571
-rw------- 1 korisnik hpc 0 Jun 1 08:02 hello.e14572
-rw------- 1 korisnik hpc 12 Jun 1 08:02 hello.o14572
-rw-r--r-- 1 korisnik hpc 46 Jun 1 07:55 hello.sh

# print output file content
[korisnik@x3000c0s25b0n0:~] $ cat hello.o14572
Hello world
In the case of an interactive session, using the qsub -I option without an input script will open a terminal on the main working node within which we can run commands:
# hostname on access node
[korisnik@x3000c0s25b0n0:~] $ hostname
x3000c0s25b0n0

# interactive session
[korisnik@x3000c0s25b0n0:~] $ qsub -I -N hello-interactive
qsub: waiting for job 14574.x3000c0s25b0n0.hsn.hpc.srce.hr to start
qsub: job 14574.x3000c0s25b0n0.hsn.hpc.srce.hr ready

# hostname on working node
[korisnik@x8000c0s3b0n0:~] $ hostname
x8000c0s3b0n0
In the case of an array of jobs, using the qsub -J X-Y[:Z] option we can submit a given number of identical jobs in the range X to Y with step Z:
# submit job array
[korisnik@x3000c0s25b0n0:~] $ qsub -J 1-10:2 hello.sh
14575[].x3000c0s25b0n0.hsn.hpc.srce.hr

# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140744
-rw------- 1 korisnik hpc 0 Jun 1 07:44 STDIN.e14571
-rw------- 1 korisnik hpc 12 Jun 1 07:44 STDIN.o14571
-rw------- 1 korisnik hpc 0 Jun 1 08:02 hello.e14572
-rw------- 1 korisnik hpc 0 Jun 1 08:21 hello.e14575.1
-rw------- 1 korisnik hpc 0 Jun 1 08:21 hello.e14575.3
-rw------- 1 korisnik hpc 0 Jun 1 08:21 hello.e14575.5
-rw------- 1 korisnik hpc 0 Jun 1 08:21 hello.e14575.7
-rw------- 1 korisnik hpc 0 Jun 1 08:21 hello.e14575.9
-rw------- 1 korisnik hpc 12 Jun 1 08:02 hello.o14572
-rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.1
-rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.3
-rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.5
-rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.7
-rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.9
-rw-r--r-- 1 korisnik hpc 46 Jun 1 07:55 hello.sh
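The subjob indices that a range X-Y:Z produces (1, 3, 5, 7 and 9 in the listing above) can be previewed before submitting. The sketch below uses the standard seq utility for this. The helper name array_indices is hypothetical and is not part of PBS.

```shell
#!/bin/bash
# Hypothetical helper (not a PBS command): list the subjob indices
# that "qsub -J X-Y:Z" would create. The step Z defaults to 1.
array_indices() {
    local first=$1 last=$2 step=${3:-1}
    seq "$first" "$step" "$last"
}

# Join the indices onto one line for display
indices=$(array_indices 1 10 2 | paste -sd' ' -)
echo "qsub -J 1-10:2 creates subjobs: $indices"
```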
Job Array
Submitting a job array is preferred over multiple separate submissions (e.g. with a for loop) because:
- it reduces job queue load - all subjobs compete for resources simultaneously with the other jobs in the queue, instead of one after the other
- it makes management easier - all subjobs can be modified through the main identifier (e.g. 14575[]), and a single subjob through its individual identifier (e.g. 14575[3])
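A rough sketch of how the two kinds of identifier might be used with the standard PBS commands qstat and qdel is given below. These commands are not otherwise covered in this section, so treat their use here as an illustration only. The identifiers are the ones from the example above. They are quoted because the shell would otherwise interpret the square brackets as a filename pattern.

```shell
#!/bin/bash
# Identifiers taken from the example above; quoting prevents the shell
# from treating the square brackets as a glob pattern.
array_id='14575[]'
subjob_id='14575[3]'

if command -v qstat >/dev/null 2>&1 && command -v qdel >/dev/null 2>&1; then
    qstat -t "$array_id"   # show the array together with all of its subjobs
    qdel "$subjob_id"      # delete only subjob 3
    qdel "$array_id"       # delete the whole array
else
    echo "PBS commands are not available on this host"
fi
```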
PBS defines the following environment variables for each subjob while it is running:
- PBS_ARRAY_INDEX - index of the subjob in the job array (e.g. 1, 3, 5, 7 or 9 in the example above)
- PBS_ARRAY_ID - identifier of the main job array
- PBS_JOBID - identifier of the subjob in the job array
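A common use of these variables is to let each subjob pick its own input. The sketch below assumes a hypothetical naming scheme of one input file per index (input_1.txt, input_3.txt, ...), which is not prescribed by PBS. The fallback values are there only so that the script can be tried outside PBS, where the variables are unset.

```shell
#!/bin/bash
#PBS -N hello
#PBS -J 1-10:2

# Inside a subjob PBS sets these variables; outside PBS they are unset,
# so placeholders are used instead.
index="${PBS_ARRAY_INDEX:-1}"
array_id="${PBS_ARRAY_ID:-unknown}"
job_id="${PBS_JOBID:-unknown}"

# Hypothetical naming scheme: one input file per subjob index
input_file="input_${index}.txt"

echo "Subjob $job_id (index $index of array $array_id) would process $input_file"
```

With the range given as a #PBS line, the script can be submitted with a plain qsub call and no -J option on the command line.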