Introduction

Jobs on the Padobran computer cluster are scheduled and managed with PBS (Portable Batch System). Its primary task is to distribute computing tasks, i.e. batch jobs, among the available computing resources.

...

Tip
Job array

This method is preferred over multiple submissions (e.g. with a for loop) because it offers:

  • reduced job queue load - all sub-jobs compete for resources simultaneously with everyone else in the queue, instead of one after the other
  • easier management - all jobs can be modified by referring to the main (e.g. 14575[]) or an individual (e.g. 14575[3]) job identifier

PBS defines the following environment variables while the sub-jobs run:

  • PBS_ARRAY_INDEX - index of the sub-job within the job array (e.g. one to nine in the example above)
  • PBS_ARRAY_ID - identifier of the main job array
  • PBS_JOBID - identifier of the sub-job within the job array
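The variables above can be used to give each sub-job its own input. The following is a minimal sketch, assuming the array is created with the -J option of qsub; the input_<index>.dat naming scheme is hypothetical and only serves as an illustration.

```shell
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=00:01:00
#PBS -J 1-9

# PBS sets PBS_ARRAY_INDEX for every sub-job; fall back to 1 for a dry run outside of PBS
index="${PBS_ARRAY_INDEX:-1}"

# Hypothetical naming scheme: one input file per sub-job
input_file="input_${index}.dat"

echo "Sub-job ${index} of array ${PBS_ARRAY_ID:-none} would process ${input_file}"
```

Each of the nine sub-jobs runs the same script and differs only in the value of PBS_ARRAY_INDEX.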

...

Code Block
my_job.pbs
#!/bin/bash

#PBS -P test example
#PBS -e /home/my_directory
#PBS -q cpu
#PBS -l walltime=00:01:00
#PBS -l select=1:ncpus=10

module load mpi/openmpi-x86_64

mpicc --version


Basic PBS parameters

Option  Option argument            Meaning of the option
-N      name                       Setting the job name
-q      destination                Specifying the job queue and/or server
-l      resource_list              Specifying the resources required to perform the job
-M      user_list                  Setting up a list of mail recipients
-m      mail_options               Setting the email notification type
-o      path/to/desired/directory  Setting the name/path where standard output is saved
-e      path/to/desired/directory  Setting the name/path where standard error is saved
-j      oe                         Concatenation of standard output and error in the same file
-W      group_list=project_code    Selection of the project under which the job will be performed


Options for mail notifications with the -m option:

a  Mail is sent when the batch system terminates the job
b  Mail is sent when the job starts executing
e  Mail is sent when the job finishes
j  Mail is sent for subjobs. Must be combined with one or more of the sub-options a, b or e


Code Block
Email example
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=00:01:00
#PBS -l select=1:ncpus=2
#PBS -M <name>@srce.hr,<name2>@srce.hr
#PBS -m be

echo $PBS_JOBNAME > out
echo $PBS_O_HOST

...

Options for requesting resources with the -l option

-l select=3:ncpus=2            Requesting 3 chunks with 2 cores each (6 cores in total)
-l select=1:ncpus=10:mem=20GB  Requesting 1 chunk with 10 cores and 20 GB of working memory
-l ngpus=2                     Requesting 2 GPUs
-l walltime=00:10:00           Maximum job execution time

PBS environment variables

Name             Description
PBS_JOBID        Job identifier provided by PBS when a job is submitted. Created after executing the qsub command
PBS_JOBNAME      The name of the job provided by the user. The default name is the name of the submitted script
PBS_NODEFILE     Path to the file listing the worker nodes, or processor cores, on which the job is executed
PBS_O_WORKDIR    The working directory from which the job was submitted, i.e. in which the qsub command was called
OMP_NUM_THREADS  An OpenMP variable that PBS exports to the environment, equal to the value of the ncpus option in the PBS script header
NCPUS            Number of cores requested. Matches the value of the ncpus option in the PBS script header
TMPDIR           Path to the temporary directory


Tip
Specifying the working directory

PBS writes the job's output and error files to the directory from which the job was submitted, whereas the program's own input and output files are read from and written to the $HOME directory by default. PBS has no option to run the job in the current directory, so the directory has to be changed manually.

To switch to the directory from which the script was submitted, add the following after the header:

cd $PBS_O_WORKDIR

Jobs with a high storage load (I/O intensive) should not be run from $PBS_O_WORKDIR but from the $TMPDIR location, which uses fast storage. Read more about using fast storage and temporary results below.
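A slightly more defensive form of the directory change is sketched below. The fallback to the current directory is an assumption added so that the script can also be tried outside of PBS; it is not part of the PBS behaviour described above.

```shell
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=00:01:00

# PBS_O_WORKDIR is set by PBS; fall back to the current directory for a dry run
workdir="${PBS_O_WORKDIR:-$PWD}"

# Stop early if the directory cannot be entered on the worker node
cd "${workdir}" || exit 1

echo "Running in $(pwd)"
```

Quoting the variable keeps the script working when the path contains spaces.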

...

PBS makes it possible to define the necessary resources in several ways. The main unit for resource allocation is the "chunk", a piece of a node. Chunks are requested with the select option. The number of processor cores per chunk is defined with ncpus, the number of MPI processes with mpiprocs and the amount of working memory with mem. It is also possible to define walltime (the maximum job execution time) and place (how chunks are distributed across nodes).

If some of the parameters are not defined, the default value will be used:

Parameter  Default value
select     1
ncpus      1
mpiprocs   1
mem        3500 MB
walltime   48:00:00
place      pack
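To make the defaults concrete, the sketch below spells them out explicitly in a script header. It assumes that writing the default values by hand requests the same resources as omitting them; the echo line is only there to show something when the script is run.

```shell
#!/bin/bash

#PBS -q cpu
# Explicit form of the default values listed in the table above
#PBS -l select=1:ncpus=1:mpiprocs=1:mem=3500MB
#PBS -l walltime=48:00:00
#PBS -l place=pack

# NCPUS is set by PBS; default to 1 for a dry run outside of PBS
cores="${NCPUS:-1}"
echo "Cores available to this job: ${cores}"
```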

Memory control using cgroups

...

In this case, the user gets 4 processor cores and a total of 14 GB of memory on one chunk. When jobs are described without the select option, resources cannot be chained (separated with a colon); each resource needs its own -l option on a new line.

...

Warning

The temporary directory is automatically deleted when the job is done!

Usage examples

  1. Example of simple use of the $TMPDIR variable:
    Code Block
    #!/bin/bash
    
    #PBS -q cpu
    #PBS -l walltime=00:00:05
    
    cd $TMPDIR
    pwd > test
    cp test $PBS_O_WORKDIR

  2. An example of copying the input data to $TMPDIR, running the application, and copying the results to the working directory:
    Code Block
    #!/bin/bash
    
    #PBS -q cpu
    #PBS -l walltime=00:00:05
    
    # Create a directory for the input data in the temporary directory
    mkdir -p $TMPDIR/data
    
    # Copy all required inputs to the temporary directory
    cp -r $HOME/data/* $TMPDIR/data
    
    # Run the application and redirect the outputs to the "current" (temporary) directory
    cd $TMPDIR
    <application executable command> 1>output.log 2>error.log
    
    # Copy the desired output to the working directory
    cp -r $TMPDIR/output $PBS_O_WORKDIR
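Because the temporary directory is deleted when the job is done, it can help to copy results back even when a command in the script fails. The following is a minimal sketch using a bash EXIT trap; the results directory name and the mktemp fallbacks are assumptions made for illustration, and the trap will not help if the job is killed outright.

```shell
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=00:10:00

# PBS sets TMPDIR and PBS_O_WORKDIR; the mktemp fallbacks are an assumption
# so that the sketch can be tried outside of PBS
scratch="${TMPDIR:-$(mktemp -d)}"
workdir="${PBS_O_WORKDIR:-$(mktemp -d)}"

# Copy the log back to the working directory (hypothetical "results" folder).
# Registered on EXIT so that it also runs when a later command fails.
copy_back() {
    mkdir -p "${workdir}/results"
    cp "${scratch}/output.log" "${workdir}/results/" 2>/dev/null || true
}
trap copy_back EXIT

cd "${scratch}" || exit 1

# Placeholder for the real application command
echo "application output" > output.log
```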

...

Parallel jobs

OpenMP parallelization

If your application uses parallelization exclusively at the level of OpenMP threads and cannot expand beyond one worker node (that is, it works with shared memory), you can call the job as shown in the xTB application example below.

Tip

OpenMP applications require the OMP_NUM_THREADS variable to be defined.

The PBS system takes care of this for you and assigns it the value of ncpus defined in the header of the PBS script.

If you define jobs using ncpus without the select option, it is advisable to define the amount of memory as well, because otherwise the available working memory will be 3500 MB (select x mem → 1 x 3500 MB).

...

Code Block
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=10:00:00
# Without select, each resource needs its own -l line
#PBS -l ncpus=8
#PBS -l mem=28GB

cd ${PBS_O_WORKDIR}

xtb C2H4BrCl.xyz --chrg 0 --uhf 0 --opt vtight
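To check what the application will see, the thread count can be printed before the real command is started. This is a minimal sketch; the fallback chain to NCPUS and then to 1 is an assumption that only matters when the script is run outside of PBS.

```shell
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=00:01:00
#PBS -l ncpus=8
#PBS -l mem=28GB

# PBS normally sets OMP_NUM_THREADS and NCPUS; the fallbacks are only for dry runs
threads="${OMP_NUM_THREADS:-${NCPUS:-1}}"
export OMP_NUM_THREADS="${threads}"

echo "OpenMP threads: ${OMP_NUM_THREADS}"
```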

MPI parallelization

If your application uses parallelization exclusively at the MPI process level and can extend beyond a single worker node (that is, it works with distributed memory), you can call the job as shown in the Quantum ESPRESSO application example below. To run applications using MPI (or hybrid MPI+OMP) parallelization, the mpi module must be loaded before calling mpiexec or mpirun.

Tip

The value of select in the header of the PBS script corresponds to the number of MPI processes.


Code Block
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=10:00:00
#PBS -l select=16

module load mpi/openmpi-x86_64

cd ${PBS_O_WORKDIR}

mpiexec pw.x -i calcite.in
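The number of MPI processes and worker nodes assigned to a job can be inspected through PBS_NODEFILE. The sketch below assumes the file holds one line per MPI process; the sample node names are hypothetical and are used only when the script runs outside of PBS.

```shell
#!/bin/bash

# PBS_NODEFILE is set by PBS; build a small sample file for a dry run
nodefile="${PBS_NODEFILE:-}"
if [ -z "${nodefile}" ]; then
    nodefile="$(mktemp)"
    printf 'node001\nnode001\nnode002\n' > "${nodefile}"
fi

# One line per MPI process, unique lines give the worker nodes
ranks=$(( $(wc -l < "${nodefile}") ))
nodes=$(( $(sort -u "${nodefile}" | wc -l) ))

echo "MPI processes: ${ranks}, worker nodes: ${nodes}"
```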

MPI + OpenMP (hybrid) parallelization

If your application can be parallelized in a hybrid way, i.e. it can divide its MPI processes into OpenMP threads, you can call the job as shown in the GROMACS application example below:

Tip

OpenMP applications require the OMP_NUM_THREADS variable to be defined. The PBS system automatically assigns it the value of ncpus defined in the header of the PBS script.

The value of select in the header of the PBS script corresponds to the number of MPI processes.


Code Block
#!/bin/bash

#PBS -q cpu
#PBS -l walltime=10:00:00
#PBS -l select=8:ncpus=4:mem=14GB

module load mpi/openmpi-x86_64

cd ${PBS_O_WORKDIR}

mpiexec -d ${OMP_NUM_THREADS} --cpu-bind depth gmx mdrun -v -deffnm md
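The resource line in the example above can be read as simple arithmetic. The sketch below works through it, assuming ncpus and mem are counted per chunk as described earlier for the select option.

```shell
#!/bin/bash

# Values taken from "#PBS -l select=8:ncpus=4:mem=14GB"
chunks=8
cpus_per_chunk=4
mem_per_chunk_gb=14

# One MPI process per chunk, one OpenMP thread per core in the chunk
mpi_processes=${chunks}
threads_per_process=${cpus_per_chunk}

total_cores=$(( chunks * cpus_per_chunk ))
total_mem_gb=$(( chunks * mem_per_chunk_gb ))

echo "${mpi_processes} MPI processes x ${threads_per_process} threads = ${total_cores} cores, ${total_mem_gb} GB in total"
```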

Monitoring and managing job execution

Job monitoring

The PBS command qstat is used to display the status of jobs. The basic syntax of the command is:

Code Block
qstat <options> <job_ID>


Executing the qstat command without additional options lists all current jobs of all users:

Code Block
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
111.admin         mpi+omp_s        kmrkalj           00:36:09 R cpu             


Some of the more frequently used options are:

-E  Groups jobs by server and displays them sorted by ascending ID. When qstat is given a list of jobs, the jobs are grouped by server and each group is displayed by ascending ID. This option also improves the performance of qstat.
-t  Displays status information for jobs, job arrays, and subjobs.
-p  Replaces the Time Use column with the percentage of work done. For a job array, this is the percentage of subjobs completed. For a normal job, it is the percentage of the allocated CPU time used.
-x  Displays status information for completed and moved jobs in addition to pending and running jobs.
-Q  Shows queue status in standard format.
-q  Shows queue status in an alternative format.
-f  Displays the full job status in an alternative format.


Usage examples:

Detailed job description:

Code Block
qstat -fxw 2648
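The detailed output can also be filtered in a script. The following is a minimal sketch of a helper that extracts the job state, assuming the detailed listing prints attributes as "name = value" lines; parse_job_state is a hypothetical helper name, and the sample text stands in for real qstat output.

```shell
#!/bin/bash

# Extract the value of the job_state attribute from "qstat -f" output read on stdin
parse_job_state() {
    awk -F' = ' '/^[[:space:]]*job_state = / { print $2; exit }'
}

# Sample input standing in for the output of "qstat -f 2648"
sample='Job Id: 2648.admin
    Job_Name = test
    job_state = R
    queue = cpu'

state=$(printf '%s\n' "${sample}" | parse_job_state)
echo "Job state: ${state}"
```

On the cluster the helper would be used as qstat -f <job_ID> | parse_job_state.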

The tracejob command extracts and displays log messages for a PBS job in chronological order.

Code Block
tracejob <job_ID>

Example:

Code Block
$ tracejob 111

Job: 111.admin

03/30/2023 11:23:24  L    Considering job to run
03/30/2023 11:23:24  S    Job Queued at request of mhrzenja@node034, owner =
                          mhrzenja@node034, job name = mapping, queue = cpu
03/30/2023 11:23:24  S    Job Run at request of Scheduler@node034 on exec_vnode
                          (node034:ncpus=40:mem=104857600kb)
03/30/2023 11:23:24  L    Job run
03/30/2023 11:23:24  S    enqueuing into cpu, state Q hop 1
03/30/2023 11:23:56  S    Holds u set at request of mhrzenja@node034
03/30/2023 11:24:22  S    Holds u released at request of mhrzenja@node034

Job management

The job can be managed even after it has started.

While the job is in the queue, its execution can be temporarily held with the command:

Code Block
qhold <job_ID>

To release it back into the queue:

Code Block
qrls <job_ID>

A job is stopped completely, or removed from the queue, with the command:

Code Block
qdel <job_ID>

Stuck jobs should be stopped by force:

Code Block
qdel -W force -x <job_ID>

Delaying execution

PBS can run jobs in dependence on other jobs, which is useful in cases such as:

  • the execution of a job depends on the output or state of a previously executed job
  • the application requires the sequential execution of various components
  • output written by one job may interfere with the execution of another

The option that enables this functionality when submitting a job is:

Code Block
qsub -W depend=<type>:<job_ID>[:<job_ID>] ...

Where <type> can be:

  • after* - starting the current job with respect to the others
    • after - the current job runs after the specified jobs have started
    • afterok - the current job runs after the specified jobs have completed successfully
    • afternotok - the current job runs after the specified jobs have completed with an error
    • afterany - the current job runs after the specified jobs have ended
  • before* - starting the other jobs with respect to the current one
    • before - the specified jobs run after the current job has started
    • beforeok - the specified jobs run after the current job has completed successfully
    • beforenotok - the specified jobs run after the current job has completed with an error
    • beforeany - the specified jobs run after the current job has ended
  • on:<number> - the current job depends on the given number of before* dependencies declared later by other jobs

Note

A job with the -W depend=... option will not be submitted if the specified job IDs do not exist (that is, if they are not in the queue).

Usage examples:

If we want posao1 to start after the successful completion of posao0:

Code Block
[korisnik@padobran] $ qsub posao0
1000.admin

[korisnik@padobran] $ qsub -W depend=afterok:1000 posao1
1001.admin

[korisnik@padobran] $ qstat 1000 1001
Job id                 Name             User              Time Use S Queue
---------------------  ---------------- ----------------  -------- - -----
1000.admin             posao0           korisnik           00:00:00 R cpu             
1001.admin             posao1           korisnik                  0 H cpu


If we want posao0 to start only after the successful completion of posao1:

Code Block
[korisnik@padobran] $ qsub -W depend=on:1 posao0
1002.admin
[korisnik@padobran] $ qsub -W depend=beforeok:1002 posao1
1003.admin

[korisnik@padobran] $ qstat 1002 1003
Job id                 Name             User              Time Use S Queue
---------------------  ---------------- ----------------  -------- - -----
1002.admin             posao0           korisnik                  0 H cpu             
1003.admin             posao1           korisnik           00:00:00 R cpu
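When dependencies are set up from a script, the dependency string can be assembled from the job IDs that qsub prints. The sketch below only builds and prints the command line; depend_arg is a hypothetical helper, and the job IDs are the ones from the examples above.

```shell
#!/bin/bash

# Build the argument for "qsub -W" from a dependency type and one or more job IDs
depend_arg() {
    local type="$1"
    shift
    local arg="depend=${type}"
    local id
    for id in "$@"; do
        arg="${arg}:${id}"
    done
    printf '%s\n' "${arg}"
}

# Dry run: print the command instead of submitting it
echo "qsub -W $(depend_arg afterok 1000.admin) posao1"
```

On the cluster the first ID would typically be captured with first_id=$(qsub posao0) and passed to depend_arg in place of the literal value.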

...