...

    # submission of job array
    [korisnik@padobran:~] $ qsub -J 1-10:2 hello.sh
    107[].admin

    # list the output files
    [korisnik@padobran:~] $ ls -l
    total 5140744
    -rw------- 1 korisnik hpc  0 Jun 1 07:44 STDIN.e14571
    -rw------- 1 korisnik hpc 12 Jun 1 07:44 STDIN.o14571
    -rw------- 1 korisnik hpc  0 Jun 1 08:02 hello.e14572
    -rw------- 1 korisnik hpc  0 Jun 1 08:21 hello.e14575.1
    -rw------- 1 korisnik hpc  0 Jun 1 08:21 hello.e14575.3
    -rw------- 1 korisnik hpc  0 Jun 1 08:21 hello.e14575.5
    -rw------- 1 korisnik hpc  0 Jun 1 08:21 hello.e14575.7
    -rw------- 1 korisnik hpc  0 Jun 1 08:21 hello.e14575.9
    -rw------- 1 korisnik hpc 12 Jun 1 08:02 hello.o14572
    -rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.1
    -rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.3
    -rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.5
    -rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.7
    -rw------- 1 korisnik hpc 12 Jun 1 08:21 hello.o14575.9
    -rw-r--r-- 1 korisnik hpc 46 Jun 1 07:55 hello.sh

Tip | ||
---|---|---|
| ||
This method is preferred over multiple submissions (e.g. with a for loop) because:
The environment variables defined by PBS during their execution are:
|
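The subjob indices that a range such as `-J 1-10:2` produces can be checked before submitting. The following is a minimal sketch and not part of PBS: `expand_range` is a hypothetical helper that expands a `start-end[:step]` range with `seq`.

```shell
#!/bin/bash
# Hypothetical helper (not a PBS command): expand a job-array range
# "start-end[:step]" into the subjob indices it describes.
expand_range() {
    local spec=$1 step=1
    if [[ $spec == *:* ]]; then
        step=${spec##*:}      # part after the colon is the step
        spec=${spec%%:*}      # part before the colon is start-end
    fi
    local start=${spec%%-*} end=${spec##*-}
    seq "$start" "$step" "$end"
}

# Range from the submission example above; prints one index per line.
expand_range 1-10:2
```

For `1-10:2` this gives 1, 3, 5, 7 and 9, which matches the suffixes of the `hello.o14575.*` files in the listing above.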
...

    #!/bin/bash
    #PBS -P test example
    #PBS -e /home/my_directory
    #PBS -q cpu
    #PBS -l walltime=00:01:00
    #PBS -l select=1:ncpus=10

    module load mpi/openmpi-x86_64
    mpicc --version

Basic PBS parameters
Option | Option argument | Meaning of the option |
-N | name | Setting the job name |
-q | destination | Specifying the job queue and/or server |
-l | resource_list | Specifying the resources required to perform the job |
-M | user_list | Setting up a list of mail recipients |
-m | mail_options | Setting the email notification type |
-o | path/to/desired/directory | Setting the name/path where standard output is saved |
-e | path/to/desired/directory | Setting the name/path where the standard error is saved |
-j | oe | Concatenation of standard output and error in the same file |
-Wgroup_list | project_code | Selection of the project under which the job will be performed |
Sub-options for mail notifications with the -m option:
a | Mail is sent when the batch system terminates the job |
b | Mail is sent when the job starts executing |
e | Mail is sent when the job finishes |
j | Mail is sent for subjobs. Must be combined with one or more of the sub-options a, b or e |

    #!/bin/bash
    #PBS -q cpu
    #PBS -l walltime=00:01:00
    #PBS -l select=1:ncpus=2
    #PBS -M <name>@srce.hr,<name2>@srce.hr
    #PBS -m be

    echo $PBS_JOBNAME > out
    echo $PBS_O_HOST

...
Options for requesting resources with the -l option
-l select=3:ncpus=2 | Requesting 3 chunks with 2 cores each (6 cores in total) |
-l select=1:ncpus=10:mem=20GB | Requesting 1 chunk with 10 cores and 20GB of working memory |
-l ngpus=2 | Requesting 2 GPUs |
-l walltime=00:10:00 | Maximum job execution time |
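The total in the first row is the number of chunks multiplied by the cores per chunk. A minimal sketch of that arithmetic follows; `total_cores` is a hypothetical helper (not a PBS tool) that handles a single `select=N:ncpus=M` specification and assumes a value of 1 for whichever of the two is missing.

```shell
#!/bin/bash
# Hypothetical helper: total cores requested by "select=N:ncpus=M[:...]".
total_cores() {
    local spec=$1 chunks=1 ncpus=1 part
    local IFS=:               # split the specification on colons
    for part in $spec; do
        case $part in
            select=*) chunks=${part#select=} ;;
            ncpus=*)  ncpus=${part#ncpus=} ;;
        esac                  # other resources (e.g. mem) are ignored
    done
    echo $(( chunks * ncpus ))
}

total_cores select=3:ncpus=2             # 3 chunks x 2 cores
total_cores select=1:ncpus=10:mem=20GB   # 1 chunk x 10 cores
```

The two calls print 6 and 10, the core counts of the first two rows of the table.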
PBS environment variables
Name | Description |
---|---|
PBS_JOBID | Job identifier provided by PBS when a job is submitted. Created after executing the qsub command |
PBS_JOBNAME | The name of the job provided by the user. The default name is the name of the submitted script |
PBS_NODEFILE | Path to the file listing the worker nodes (or processor cores) on which the job is executed |
PBS_O_WORKDIR | The working directory in which the job was submitted, i.e. in which qsub command was called |
OMP_NUM_THREADS | An OpenMP variable that PBS exports to the environment, which is equal to the value of the ncpus option from the PBS script header |
NCPUS | Number of cores requested. Matches the value from the ncpus option from the PBS script header |
TMPDIR | Path to temporary directory |
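A quick way to see what these variables contain for a given job is to print them at the start of the script. The sketch below is illustrative only: `report_pbs_env` is a hypothetical helper, the queue and resource request are copied from the examples in this document, and the `unset` placeholder is an assumption that makes the script runnable outside of PBS as well.

```shell
#!/bin/bash
#PBS -q cpu
#PBS -l select=1:ncpus=2

# Print the variables from the table; "unset" is shown when the script
# runs outside of a PBS job (for example in an interactive shell).
report_pbs_env() {
    local name
    for name in PBS_JOBID PBS_JOBNAME PBS_NODEFILE PBS_O_WORKDIR \
                OMP_NUM_THREADS NCPUS TMPDIR; do
        printf '%s=%s\n' "$name" "${!name:-unset}"
    done
}

report_pbs_env
```

When submitted with qsub, the lines end up in the job's standard output file.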
Tip | |||
---|---|---|---|
| |||
While PBS sets the path for the output and error files to the directory in which the job is run, the input and output files of the program itself are read from and saved to the $HOME directory by default. PBS has no option to run the job in the current directory, so the directory has to be changed manually. To switch to the directory from which the script was submitted, write the following after the header: cd $PBS_O_WORKDIR. Jobs with a high storage load (I/O intensive) should not be run from $PBS_O_WORKDIR but from the $TMPDIR location, which uses fast storage. Read more about using fast storage and temporary results below. |
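The two recommendations from the tip can be combined in one script. This is a sketch under stated assumptions rather than a prescribed layout: `result.txt` is a placeholder for the program's real output, and the fallbacks to the current directory and to a `mktemp` directory exist only so that the script can be tried outside of PBS.

```shell
#!/bin/bash
#PBS -q cpu
#PBS -l select=1:ncpus=1

# Outside PBS (e.g. a local dry run) fall back to the current directory
# and to a fresh temporary directory; both fallbacks are assumptions.
workdir=${PBS_O_WORKDIR:-$PWD}
scratch=${TMPDIR:-$(mktemp -d)}

# Switch to the directory from which the job was submitted.
cd "$workdir" || exit 1

# Do the I/O-heavy work on fast storage, then copy the result back.
# "result.txt" is a placeholder for the program's real output.
( cd "$scratch" && echo "done" > result.txt )
cp "$scratch/result.txt" "$workdir/"
```

Copying results back at the end matters because the temporary directory is not the place where results are meant to stay.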
Allocating resources to jobs
PBS makes it possible to define the necessary resources in several ways. The main unit of resource allocation is the so-called "chunk", a piece of a node. A chunk is defined with the select option. The number of processor cores per chunk is set with ncpus, the number of MPI processes with mpiprocs and the amount of working memory with mem. It is also possible to define walltime (maximum job execution time) and place (how chunks are distributed across nodes).
If some of the parameters are not defined, the default values are used:
Parameter | Default value |
---|---|
select | 1 |
ncpus | 1 |
mpiprocs | 1 |
mem | 3500 MB |
walltime | 48:00:00 |
place | pack |
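As an illustration of overriding these defaults, the header below sets every parameter from the table explicitly. The values are made up for the example, and `place=scatter` (one chunk per node, as opposed to the default `pack`) is used on the assumption that the cluster runs PBS Professional.

```shell
#!/bin/bash
#PBS -q cpu
# 2 chunks, each with 4 cores, 4 MPI processes and 8 GB of memory
#PBS -l select=2:ncpus=4:mpiprocs=4:mem=8GB
# At most one hour of execution time
#PBS -l walltime=01:00:00
# Put each chunk on a different node instead of packing them together
#PBS -l place=scatter

# NCPUS is set by PBS inside a job; "unset" appears in a plain shell.
cores_msg="Cores per chunk: ${NCPUS:-unset}"
echo "$cores_msg"
```

Any parameter left out of the header falls back to the value in the table above.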
Memory control using cgroups
In addition to controlling processor usage, cgroups are also set up to control memory consumption. This means that jobs run by the user are limited to the requested amount of memory. If a job tries to use more memory than requested in the job description, the system terminates that job and writes the following to the error file:

    -bash: line 1: PID Killed /var/spool/pbs/mom_priv/jobs/JOB_ID.SC
    Cgroup mem limit exceeded: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=_JOB_ID,mems_allowed=0,oom_memcg=/pbs_jobs.service/jobid/JOB_ID,task_memcg=/pbs_jobs.service/jobid/JOB_ID,task=JOB_ID,pid=PID,uid=UID

This message is slightly different for each job, because it contains information such as the UID (the unique numeric identifier of the user), the PID (the numeric identifier of the process that was killed) and the JOB_ID (the job ID assigned by PBS).
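Because the fixed part of the message is the same for every job, a finished job's error file can be checked for it with grep. A minimal sketch, in which `hit_mem_limit` is a hypothetical helper and the file name is passed as the first argument:

```shell
#!/bin/bash
# Hypothetical helper: succeed if a job's error file contains the cgroup
# out-of-memory message shown above.
hit_mem_limit() {
    grep -q 'Cgroup mem limit exceeded' "$1"
}

# Usage: pass the error file of a finished job as the first argument.
errfile=${1:-}
if [[ -n $errfile && -f $errfile ]]; then
    if hit_mem_limit "$errfile"; then
        echo "$errfile: memory limit exceeded, consider a larger mem= request"
    else
        echo "$errfile: no cgroup memory kill found"
    fi
fi
```

If the message is found, resubmitting with a larger `mem` value in the `select` statement is the usual next step.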
Allocation per requested chunk
...
Some of the more commonly used options are:
-E | Groups jobs by server and displays them sorted by ascending ID. When qstat is given a list of jobs, the jobs are grouped by server and each group is displayed by ascending ID. This option also improves qstat performance. |
-t | Displays status information for jobs, job arrays and subjobs. |
-p | The Time Used column is replaced with the percentage of the job completed. For a job array this is the percentage of subjobs completed. For a normal job, it is the percentage of the allocated CPU time used. |
-x | Displays status information for finished and moved jobs in addition to queued and running jobs. |
-Q | Displays queue status in the standard format. |
-q | Displays queue status in an alternative format. |
-f | Displays job status in an alternative format |
Examples of use:
Detailed job view:
...