...
Info |
---|
SGE also has a graphical user interface (GUI) that provides access to the full functionality of the system. The GUI is started with the qmon command. Use of the GUI is not described here because there is no instruction manual within it (Help button). |
Describing jobs
The SGE system language is used to describe jobs, and the job description file (startup script) is a standard shell script. The header of each script lists the SGE parameters that describe the job in detail, followed by the normal commands to execute the user application.
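The structure described above can be sketched as follows. This is a minimal illustration, assuming the `#$` prefix that SGE conventionally uses to mark parameter lines in the script header; the job name `my_job` and the echoed text are placeholders rather than part of the original instructions.

```shell
#!/bin/bash
# Header: SGE parameters, one per line, each line starting with "#$".
# To the shell these lines are ordinary comments.
#$ -N my_job
#$ -cwd

# Body: normal shell commands that run the user application.
# The echo below is only a placeholder for a real application.
job_message="Running on $(uname -n)"
echo "$job_message"
```

Because the header lines are plain comments for the shell, the same file can also be run directly for a quick local check before it is submitted.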
...
Code Block |
---|
Your job <JobID> ("my_job") has been submitted |
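When later commands need the job ID, it can be taken from the confirmation message shown above. The following is a small sketch under the assumption that the message has exactly that form; the function name `extract_job_id` and the job ID 12345 are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Print the job ID found in a confirmation message of the form
#   Your job <JobID> ("<job_name>") has been submitted
# Prints nothing if the message does not start with "Your job".
extract_job_id() {
    printf '%s\n' "$1" | awk '$1 == "Your" && $2 == "job" { print $3 }'
}

# Hypothetical message; 12345 is a made-up job ID.
sample='Your job 12345 ("my_job") has been submitted'
job_id=$(extract_job_id "$sample")
echo "$job_id"
```

In a real session the message would come from qsub itself, for example by passing the output of the submit command to `extract_job_id`.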
Basic SGE parameters
Code Block |
---|
-N <job_name> : the job name that will be displayed when retrieving job information |
-P <project_name> : the name of the project to which the job belongs |
-cwd : defines that the directory where the startup script is located is the working directory of the job |
...
Info Note: spaces are not allowed when listing parameter values (e.g. for -l or -q).
Info A detailed list of the parameters, with information about each of them, can be obtained with the command
man qsub
.
SGE environment variables
Within the startup script it is possible to use SGE variables. Some of them are:
...
Direct access to the command line of the test node:
Code Block $ qrsh
Interactive command execution:
Code Block $ qrsh /home/user/moja_skripta
Interactive application execution with a graphical interface:
Code Block $ qrsh -DISPLAY=10.1.1.1:0.0 <moja_skripta>
Advanced job descriptions
Saving temporary results
It is not recommended to use the $HOME directory to save temporary results generated during job execution. Doing so reduces the efficiency of the application and places extra load on the front end and the cluster network.
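The recommendation above can be sketched as follows. The example assumes that SGE sets $TMPDIR to a temporary directory on the execution node, which is common but depends on the site configuration, and it falls back to /tmp otherwise. The file names and the `results_dir` directory (a stand-in for a location under $HOME) are hypothetical.

```shell
#!/bin/bash
# Scratch space: the job's temporary directory if provided, otherwise /tmp.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/scratch.XXXXXX")
# Hypothetical stand-in for the directory where final results are kept.
results_dir=$(mktemp -d "${TMPDIR:-/tmp}/results.XXXXXX")

# Write intermediate data to the scratch directory (placeholder workload).
echo "intermediate data" > "$scratch/partial.txt"
tr 'a-z' 'A-Z' < "$scratch/partial.txt" > "$scratch/final.txt"

# Copy only the final result back, then remove the scratch directory.
cp "$scratch/final.txt" "$results_dir/"
rm -rf "$scratch"
cat "$results_dir/final.txt"
```

Only the final file crosses over to the permanent location, so the intermediate files never touch the shared home directory.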
...
A notification will be sent to the address moj@mail.com when the job is interrupted or completed.
Monitoring and management of job performance
Display of job status
SGE's qstat command is used to display job status. The command syntax is:
...
Code Block |
---|
$ qhost -F vendor,scratch,memory |
Job management
A job can also be managed after it has been submitted.
...
Code Block |
---|
$ qdel -f <job_ID> |
Access to information about completed jobs
The qacct command is used to retrieve information about completed jobs. The syntax is:
...
Detailed information about all jobs performed on the cluster (caution: large amount of data):
Code Block $ qacct -j
Display of information about all jobs of the defined user:
Code Block $ qacct -j -o <user>
Display a summary of the consumption of computer resources of a defined user (if <user> is not defined, data for all users is displayed):
Code Block $ qacct -o <user>
Display information about all jobs for the defined project:
Code Block $ qacct -j -P <project>
Display of the consumption summary of the defined project (if <project> is not defined, data for all projects is displayed):
Code Block $ qacct -P <project>
Cheatsheets
Linux cheatsheet
Navigating the file system
Command | Command description |
---|---|
pwd | Shows the user's current location. The location is displayed as an absolute path to the current directory. |
cd | Changes the current directory (cd stands for "change directory"). |
cd - | Returns to the previous directory. |
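The three commands in the table can be tried together in a short sketch. The variable name `start_dir` is chosen only for illustration, and the root directory / is used as the destination simply because it exists on every system.

```shell
#!/bin/bash
start_dir=$(pwd)     # absolute path of the current directory
cd /                 # change to another directory
cd - > /dev/null     # return to the previous directory (cd - also prints its path)
pwd                  # shows the starting directory again
```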
Directory management
Command | Command description |
---|---|
mkdir dir1 | Creates a directory named dir1. |
mkdir -p /tmp/novi/dir1 | The -p option automatically creates any missing parent directories (here /tmp/novi). |
rm -rf dir1/* | Deletes all files and subdirectories inside the directory dir1 but keeps dir1 itself. Hidden files (names starting with a dot) are not matched by * and remain. |
rm -rf dir1/ | Deletes the directory dir1 together with all files and subdirectories in it. |
...
Copying files and directories
Command | Command description |
---|---|
cp dat1 dat2 | Copies the file dat1 to dat2 (dat1 is unchanged). |
cp dat1 dir/ | Copies the file dat1 to the directory dir. |
cp -r dir1/* dir2/ | Copies all files and subdirectories from the directory dir1 into the directory dir2, without the directory dir1 itself. |
cp -r dir1/ dir2/ | Copies the directory dir1, with all its files and subdirectories, into the directory dir2. |
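The difference between the last two forms is easiest to see side by side. The sketch below assumes GNU cp as found on Linux (other cp implementations may treat a trailing slash on the source differently) and works in a throwaway directory; dir3 and the file names are hypothetical additions for the comparison.

```shell
#!/bin/bash
# Work in a temporary directory so nothing existing is touched.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p dir1/sub dir2 dir3
echo "data" > dir1/dat1
echo "more" > dir1/sub/dat2

cp -r dir1/* dir2/   # contents only: dat1 and sub/ end up directly in dir2
cp -r dir1/ dir3/    # the directory itself: dir3 now contains dir1

# List the copied files to compare the two results.
find dir2 dir3 -type f | sort
```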
Move and rename files and directories
Command | Command description |
---|---|
mv dat1 dat2 | Renames the file dat1 to dat2. |
mv dat1 dir1 | If dir1 is a directory name, moves the file dat1 to the directory dir1. |
Password change
Command | Command description |
---|---|
passwd | Changes the password of the current user. The command first asks for the old password and then for the new password (twice). Note: for security reasons, nothing is printed in the terminal while the password is being typed. |
Auto-completion and command history search
Command | Command description |
---|---|
[Tab] | Command auto-completion. When typing a command, e.g. passwd, the user can type the first few letters (e.g. pass) and press the [Tab] key. The shell then either completes the command or prints all commands that start with the string pass. File names on the disk can be completed in the same way. |
[Ctrl] + [r] | Command history search. In the terminal, hold down the [Ctrl] key and press the [r] key. Start typing letters from a command, and previous commands that contain those letters appear. To cycle through all the commands that contain the typed letters, press [Ctrl] + [r] again. |
SGE cheatsheet
Job submission
Command | Command description |
---|---|
qsub | Submits a job and prints its job ID. |
Checking job status
Command | Command description |
---|---|
qstat | Shows the status of jobs on the cluster for the current user. |
qstat -f | Shows the status of jobs and nodes. |
qstat -s rpsh | Shows: p - jobs waiting in the queue; r - active jobs; s - temporarily stopped active jobs; h - temporarily stopped jobs in the queue. |
qstat -g c | Displays a summary of the state of individual job queues. |
qstat -j <job_id> | Displays detailed information about one job. |
qstat -u <user> | Displays the jobs of a specific user ("*" for all users). |
qstat -pe <name> | Displays jobs that use a defined parallel environment. |
qstat -q <queue> | Displays jobs in a defined job queue. |
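The status commands above can be combined into a simple wait loop. The sketch below assumes that `qstat -j <job_id>` returns a non-zero exit status once the job is no longer known to the scheduler; this should be checked on the local installation. The function name `wait_for_job` and the default interval of 30 seconds are hypothetical choices.

```shell
#!/bin/bash
# Block until qstat no longer reports the job.
# $1: job ID, $2: seconds between checks (default 30)
wait_for_job() {
    local job_id
    local interval
    job_id=$1
    interval=${2:-30}
    # Assumption: qstat -j fails (non-zero status) for jobs that have left the system.
    while qstat -j "$job_id" > /dev/null 2>&1; do
        sleep "$interval"
    done
}
```

A call such as `wait_for_job <job_id> 60` would then return once the job has finished or been removed. On a machine without qstat the loop ends immediately, because the command itself fails.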
Stopping jobs
Command | Command description |
---|---|
qdel <job_id> | The job is stopped completely or removed from the queue. |
qdel -u <user> | All jobs of the defined user are stopped. |
qdel -f <job_id> | Force stop for stuck jobs. |
Information about completed jobs
Command | Command description |
---|---|
qacct -j <job_id> | Detailed information about the job with ID <job_id>. |
qacct -j | Detailed information about all jobs (a large amount of data). |
qacct -j -o <user> | Display of information about all jobs of a defined user. |
qacct -o <user> | Display of a summary of consumption of a defined user; if <user> is not defined, data for all users is displayed. |
qacct -slots [<count>] | Summary display of all jobs that used the given number of processors; if <count> is not defined, data for all values is displayed. |
qacct -j -P <project> | Display of information about all jobs for the defined project. |
qacct -P <project> | Display of the consumption summary of the defined project; if <project> is not defined, data for all projects is displayed. |
qacct -j -q <queue> | Display of information about all jobs for the defined job queue. |
qacct -q <queue> | Display of the consumption summary of the defined job queue. |