Intro

Jobs on the Supek computer cluster are scheduled and managed with PBS Pro (Portable Batch System Professional). Its primary task is to distribute computing tasks, i.e. batch jobs, among the available computing resources.

This document describes the use of PBS Pro version 2022.1.1.


Running jobs

User applications (hereinafter jobs) that are started through PBS must be described by a startup shell script (sh, bash, zsh...). The PBS parameters are listed at the top of the script, above the regular commands. These parameters can also be specified on the command line when submitting a job.

Basic job run:

qsub my_job.pbs

Job run with parameters:

qsub -q cpu -l ncpus=4:mem=10GB my_job.pbs

More info for qsub parameters:

qsub --help


After a job is submitted, the standard output and standard error of a running job can be viewed with the commands:

qcat jobID
qcat -e jobID
qtail jobID
qtail -e jobID

Submitting jobs

There are several ways jobs can be submitted:

  • by interactive submission
  • using a script
  • in an interactive session
  • as a job array

In the case of interactive submission, calling the qsub command without arguments reads the job script from standard input in the terminal; the commands to execute are typed in and submitted with CTRL+D:

# run qsub
[korisnik@x3000c0s25b0n0:~] $ qsub
Job script will be read from standard input. Submit with CTRL+D.
echo "Hello world"
14571.x3000c0s25b0n0.hsn.hpc.srce.hr
 
# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140716
-rw-------  1 korisnik hpc          0 Jun  1 07:44 STDIN.e14571
-rw-------  1 korisnik hpc         12 Jun  1 07:44 STDIN.o14571
 
# print output file content
[korisnik@x3000c0s25b0n0:~] $ cat STDIN.o14571
Hello world

In the case of script submission, we can specify the commands to be executed in the input file that we submit:

# print file hello.sh
[korisnik@x3000c0s25b0n0:~] $ cat hello.sh
#!/bin/bash
 
#PBS -N hello
echo "Hello world"
 
# submit job script
[korisnik@x3000c0s25b0n0:~] $ qsub hello.sh
14572.x3000c0s25b0n0.hsn.hpc.srce.hr
 
# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140721
-rw-------  1 korisnik hpc          0 Jun  1 07:44 STDIN.e14571
-rw-------  1 korisnik hpc         12 Jun  1 07:44 STDIN.o14571
-rw-------  1 korisnik hpc          0 Jun  1 08:02 hello.e14572
-rw-------  1 korisnik hpc         12 Jun  1 08:02 hello.o14572
-rw-r--r--  1 korisnik hpc         46 Jun  1 07:55 hello.sh
 
# print output file content
[korisnik@x3000c0s25b0n0:~] $ cat hello.o14572
Hello world

In the case of an interactive session, using the qsub -I option without an input script will open a terminal on the main working node within which we can run commands:

# hostname on access node
[korisnik@x3000c0s25b0n0:~] $ hostname
x3000c0s25b0n0
 
# interactive session
[korisnik@x3000c0s25b0n0:~] $ qsub -I -N hello-interactive
qsub: waiting for job 14574.x3000c0s25b0n0.hsn.hpc.srce.hr to start
qsub: job 14574.x3000c0s25b0n0.hsn.hpc.srce.hr ready
 
# hostname on working node
[korisnik@x8000c0s3b0n0:~] $ hostname
x8000c0s3b0n0

In the case of an array of jobs, using the qsub -J X-Y[:Z] option we can submit a given number of identical jobs in the range X to Y with step Z:

# submit job array
[korisnik@x3000c0s25b0n0:~] $ qsub -J 1-10:2 hello.sh
14575[].x3000c0s25b0n0.hsn.hpc.srce.hr
 
# print directory content
[korisnik@x3000c0s25b0n0:~] $ ls -l
total 5140744
-rw-------  1 korisnik hpc          0 Jun  1 07:44 STDIN.e14571
-rw-------  1 korisnik hpc         12 Jun  1 07:44 STDIN.o14571
-rw-------  1 korisnik hpc          0 Jun  1 08:02 hello.e14572
-rw-------  1 korisnik hpc          0 Jun  1 08:21 hello.e14575.1
-rw-------  1 korisnik hpc          0 Jun  1 08:21 hello.e14575.3
-rw-------  1 korisnik hpc          0 Jun  1 08:21 hello.e14575.5
-rw-------  1 korisnik hpc          0 Jun  1 08:21 hello.e14575.7
-rw-------  1 korisnik hpc          0 Jun  1 08:21 hello.e14575.9
-rw-------  1 korisnik hpc         12 Jun  1 08:02 hello.o14572
-rw-------  1 korisnik hpc         12 Jun  1 08:21 hello.o14575.1
-rw-------  1 korisnik hpc         12 Jun  1 08:21 hello.o14575.3
-rw-------  1 korisnik hpc         12 Jun  1 08:21 hello.o14575.5
-rw-------  1 korisnik hpc         12 Jun  1 08:21 hello.o14575.7
-rw-------  1 korisnik hpc         12 Jun  1 08:21 hello.o14575.9
-rw-r--r--  1 korisnik hpc         46 Jun  1 07:55 hello.sh
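
The subjob indices seen in the listing above (1, 3, 5, 7, 9) follow a first/last/step pattern that can be previewed before submitting. The following is a minimal sketch that assumes the standard seq and paste utilities are available; array_indices is a hypothetical helper introduced here for illustration and is not a PBS command.

```shell
#!/bin/bash
# Hypothetical helper (not part of PBS): list the indices that a range
# of the form X-Y:Z covers, using "seq FIRST STEP LAST".
array_indices() {
    local first=$1 last=$2 step=${3:-1}
    seq "$first" "$step" "$last" | paste -sd' ' -
}

# Same range as "qsub -J 1-10:2" in the example above
array_indices 1 10 2   # prints: 1 3 5 7 9
```

Note that the upper bound 10 is not itself an index here, because the step of 2 starting from 1 never lands on it.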

Job Array

This method is preferred over multiple submissions (e.g. with a for loop) because:

  • it reduces the load on the job queue - all subjobs compete for resources simultaneously with the other jobs in the queue, instead of one after the other
  • it simplifies management - all subjobs can be modified at once through the main job identifier (e.g. 14575[]), or individually through a subjob identifier (e.g. 14575[3])

The environment variables that PBS defines while the subjobs are running are:

  • PBS_ARRAY_INDEX - index of the subjob within the job array (e.g. 1, 3, 5, 7 or 9 in the example above)
  • PBS_ARRAY_ID - identifier of the main job array
  • PBS_JOBID - identifier of the subjob within the job array
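
A typical use of these variables is to let each subjob pick its own input. The sketch below is illustrative only: the input_<index>.dat naming scheme and the input_for_index helper are assumptions, and the fallback values exist solely so the script can be dry-run outside PBS, where the variables are unset.

```shell
#!/bin/bash
#PBS -N hello

# Hypothetical naming scheme: one input file per subjob index
input_for_index() {
    echo "input_$1.dat"
}

# Outside PBS the variable is unset; fall back to 1 for a dry run
# (an assumption for illustration, not PBS behaviour).
index=${PBS_ARRAY_INDEX:-1}
input_file=$(input_for_index "$index")

echo "Subjob ${index} of array ${PBS_ARRAY_ID:-none} would process ${input_file}"
```

Submitted with qsub -J 1-10:2 as in the example above, each of the five subjobs would see a different PBS_ARRAY_INDEX and therefore a different file name.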


Job Description

The PBS system language is used to describe jobs, while the job description file is a standard shell script. In the header of each script, PBS parameters are listed that describe the job in detail, followed by commands to execute the desired application.

Structure of the startup script:

my_job.pbs
#!/bin/bash
 
#PBS -<parameter1> <value>
#PBS -<parameter2> <value>
 
<command>


Example of a startup script:

my_job.pbs
#!/bin/bash
 
#PBS -P test_example
#PBS -q cpu
#PBS -e /home/my_directory
#PBS -l select=2:ncpus=10
 
module load gcc/12.1.0
 
gcc --version

Basic PBS parameters

Option  Argument                 Meaning
-N      name                     Naming the job
-q      destination              Specifying job queue or node
-l      list_of_resources        Amount of resources required for the job
-M      list_of_users            List of users to receive e-mail
-m      email_options            Types of mail notifications
-o      path/to/directory        Path to directory for output file
-e      path/to/directory        Path to directory for error file
-j      oe                       Combining output and error file
-W      group_list=project_code  Project code for job


Options for sending notifications by mail with the -m option:

Option  Meaning
a       Mail is sent when the batch system terminates the job
b       Mail is sent when the job starts executing
e       Mail is sent when the job is done
j       Mail is sent for subjobs. Must be combined with one or more of the sub-options a, b or e


Example mail
#!/bin/bash
 
#PBS -q cpu
#PBS -l select=1:ncpus=2
#PBS -M <name>@srce.hr,<name2>@srce.hr
#PBS -m be
 
echo $PBS_JOBNAME > out
echo $PBS_O_HOST


Two emails were received:

Job start
PBS Job Id: 2686.x3000c0s25b0n0.hsn.hpc.srce.hr
Job Name:   pbs.pbs
Begun execution

Job finished
PBS Job Id: 2686.x3000c0s25b0n0.hsn.hpc.srce.hr
Job Name:   pbs.pbs
Execution terminated
Exit_status=0
resources_used.cpupercent=0
resources_used.cput=00:00:00
resources_used.mem=0kb
resources_used.ncpus=2
resources_used.vmem=0kb
resources_used.walltime=00:00:01


Options for resources with the -l option:

Option                          Meaning
-l select=3:ncpus=2             3 chunks of a node with 2 cores each (6 cores in total)
-l select=1:ncpus=10:mem=20GB   1 chunk of a node with 10 cores and 20 GB of RAM
-l ngpus=2                      2 GPUs
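
The total number of cores in a select request is the number of chunks multiplied by ncpus per chunk. The sketch below works through that arithmetic; total_cores is a hypothetical helper, not a PBS command, and it assumes a single chunk type (no "+" in the specification) and a default of 1 when select or ncpus is omitted.

```shell
#!/bin/bash
# Hypothetical helper (not a PBS command): total cores requested by a
# "select=N:ncpus=M[:...]" specification, computed as N * M.
total_cores() {
    local spec=$1 chunks=1 ncpus=1 part
    local IFS=:            # split the specification on ':'
    for part in $spec; do
        case $part in
            select=*) chunks=${part#select=} ;;
            ncpus=*)  ncpus=${part#ncpus=} ;;
        esac
    done
    echo $(( chunks * ncpus ))
}

total_cores "select=3:ncpus=2"             # prints 6
total_cores "select=1:ncpus=10:mem=20GB"   # prints 10
```

Other resources in the specification, such as mem, are ignored by this helper because they do not affect the core count.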


PBS environment variables

Variable          Meaning
NCPUS             Number of cores requested. Matches the value of the ncpus option in the PBS script header.
OMP_NUM_THREADS   An OpenMP variable exported by PBS to the environment, equal to the value of the ncpus option in the PBS script header.
PBS_JOBID         Job identifier assigned by PBS when the job is submitted. Created after the qsub command is executed.
PBS_JOBNAME       Name of the job, as set with the -N option (by default the name of the job script).
PBS_NODEFILE      File containing the list of working nodes, or processor cores, on which the job is executed.
PBS_O_WORKDIR     The working directory from which the job was submitted, i.e. in which the qsub command was invoked.
TMPDIR            The path to the scratch directory.
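
To see what a job actually receives, the variables can be printed at the start of a job script. This is a minimal sketch: the "<not set>" placeholders and the count_nodes helper are assumptions added so the script can also be dry-run outside PBS, where these variables are normally unset.

```shell
#!/bin/bash
# Print the PBS environment seen by the job; placeholders are shown
# when a variable is unset (e.g. when run outside PBS).
echo "Job ID:      ${PBS_JOBID:-<not set>}"
echo "Job name:    ${PBS_JOBNAME:-<not set>}"
echo "Cores:       ${NCPUS:-<not set>}"
echo "Scratch dir: ${TMPDIR:-<not set>}"

# Hypothetical helper: count the distinct hosts listed in a node file
count_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Nodes:       $(count_nodes "$PBS_NODEFILE")"
fi
```

The node file may list a host more than once, which is why the helper removes duplicates before counting.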


Setting up working directory

By default, PBS Pro writes the job's output and error files to the directory from which the job was submitted, but the job itself starts in the $HOME directory, so the program's own input and output files are read from and written to $HOME. PBS Pro has no option for running the job in the directory from which it was submitted, so the directory has to be changed manually.

After the header it is necessary to write:

cd $PBS_O_WORKDIR

This moves the job execution to the directory from which the job was submitted.
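
A slightly more defensive variant stops the job if the directory change fails, so the program never runs in the wrong place. This is a sketch under assumptions: enter_workdir is a hypothetical helper, and the fallback to the current directory is there only so the script can be dry-run outside PBS.

```shell
#!/bin/bash
#PBS -N hello

# Hypothetical helper: change to the submission directory; fall back to
# the current directory when PBS_O_WORKDIR is unset (dry run outside PBS).
enter_workdir() {
    cd "${PBS_O_WORKDIR:-$PWD}" || return 1
}

# Abort the job rather than continue in an unexpected directory
enter_workdir || { echo "cannot enter working directory" >&2; exit 1; }
echo "Running in $(pwd)"
```

Quoting the variable keeps the command correct when the path contains spaces.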

