...

Info

SGE also has a graphical user interface (GUI) that gives access to the full functionality of the system. The GUI is started with the qmon command. Its use is not described here because instructions are available within the GUI itself (the Help button).

Describing jobs

The SGE system language is used to describe jobs, and the job description file (startup script) is a standard shell script. The header of each script lists the SGE parameters that describe the job in detail, followed by the normal commands to execute the user application.
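As a minimal sketch of such a startup script (the job name my_job and the commands after the header are illustrative assumptions, not taken from this guide): lines beginning with #$ carry the SGE parameters, and to the shell itself they are plain comments.

```shell
#!/bin/bash
# Header: SGE reads lines that start with "#$" as qsub parameters.
#$ -N my_job
#$ -cwd

# Body: ordinary shell commands that run the user application.
# Here a placeholder command simply reports the execution host.
msg="Job running on $(uname -n)"
echo "$msg"
```

A file like this would typically be submitted with qsub, for example qsub my_job.sh (a hypothetical file name).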

...

Code Block
Your job <JobID> ("my_job") has been submitted
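When a script needs the job ID for later commands, it can be cut out of the confirmation line shown above. The following is only a sketch: the ID 12345 and the file name my_job.sh are made-up values, and it assumes the simple one-line message format (array jobs print a slightly different line).

```shell
# Sample confirmation line; in practice it would come from: out=$(qsub my_job.sh)
out='Your job 12345 ("my_job") has been submitted'

# The job ID is the third whitespace-separated field of the message.
job_id=$(printf '%s\n' "$out" | awk '{print $3}')
echo "$job_id"
```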

Basic SGE parameters


Code Block
-N <job_name> : the job name that will be displayed when retrieving job information
-P <project_name> : the name of the project to which the job belongs
-cwd : runs the job in the current working directory, i.e. the directory from which the job was submitted

...


  • Info

    Note: spaces are not allowed within a list of parameter values (e.g. for -l or -q).



  • Info

    A detailed list of the parameters, with information about each, can be obtained with the command man qsub.


SGE environment variables

Within the startup script it is possible to use SGE variables. Some of them are:

...
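As an illustration of how such variables might be used inside a startup script, the sketch below reads JOB_ID, JOB_NAME and NSLOTS, which SGE sets for a running job. The job name env_demo and the fallback values are assumptions added so that the script also runs outside of SGE.

```shell
#!/bin/bash
#$ -N env_demo
#$ -cwd

# Outside of SGE these variables are unset, so fall back to placeholder values.
job_id="${JOB_ID:-none}"
job_name="${JOB_NAME:-env_demo}"
slots="${NSLOTS:-1}"

summary="job ${job_name} (ID ${job_id}) uses ${slots} slot(s)"
echo "$summary"
```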

  1. Direct access to the command line of the test node:

    Code Block
    qrsh


  2. Interactive command execution:


    Code Block
    qrsh /home/user/moja_skripta



  3. Interactive application execution with graphical interface:

    Code Block
    qrsh -v DISPLAY=10.1.1.1:0.0 <moja_skripta>


Advanced job descriptions

Saving temporary results

Using the $HOME directory to store temporary results generated during job execution is not recommended: it reduces the efficiency of the application and burdens the front end and the cluster network.
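One common pattern, sketched here under the assumption that the per-job directory which SGE exposes as $TMPDIR is the intended scratch space on this cluster, is to write intermediate files there and copy only the final result back. The file names partial.out and final.out and the job name scratch_demo are hypothetical.

```shell
#!/bin/bash
#$ -N scratch_demo
#$ -cwd

# SGE sets TMPDIR to a per-job directory; elsewhere fall back to a fresh temporary directory.
scratch="${TMPDIR:-$(mktemp -d)}"
# SGE_O_WORKDIR is the submission directory; elsewhere use the current directory.
result_dir="${SGE_O_WORKDIR:-$(pwd)}"

# Write intermediate data to scratch (placeholder for the real application).
echo "intermediate data" > "$scratch/partial.out"

# Copy only the final result back, then remove the temporary file.
cp "$scratch/partial.out" "$result_dir/final.out"
rm -f "$scratch/partial.out"
```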

...

A notification will be sent to the address moj@mail.com when the job is aborted or completed.


Monitoring and management of job performance

Display of job status

SGE's qstat command is used to display job status. The command syntax is:

...

Code Block
$ qhost -F vendor,scratch,memory

Job management

A job can still be managed after it has been submitted.

...

Code Block
$ qdel -f <job_ID>


Access to information about completed jobs

The qacct command is used to retrieve information about completed jobs. The syntax is:

...

  1. Detailed information about all jobs performed on the cluster (caution: large amount of data):

    Code Block
    $ qacct -j


  2. Display of information about all jobs of the defined user:

    Code Block
    $ qacct -j -o <user>

    Display a summary of the consumption of computer resources of a defined user (if <user> is not defined, data for all users is displayed):

    Code Block
    $ qacct -o <user>


  3. Display information about all jobs for the defined project:

    Code Block
    $ qacct -j -P <project>

    Display of the consumption summary of the defined project (if <project> is not defined, data for all projects are displayed):

    Code Block
    $ qacct -P <project>
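The per-job output of qacct -j is a list of "field value" lines, so individual values can be picked out with standard text tools. The sketch below uses a made-up four-line excerpt (real output contains many more fields, and the exact value formatting may differ between SGE versions) and extracts the exit_status and ru_wallclock fields.

```shell
# Hypothetical excerpt of "qacct -j <job_ID>" output, used here instead of a live cluster.
sample='jobname      my_job
jobnumber    12345
exit_status  0
ru_wallclock 120'

# In practice: sample=$(qacct -j <job_ID>)
exit_status=$(printf '%s\n' "$sample" | awk '$1 == "exit_status" {print $2}')
wallclock=$(printf '%s\n' "$sample" | awk '$1 == "ru_wallclock" {print $2}')
echo "exit status: $exit_status, wallclock: $wallclock s"
```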


Cheatsheets

Linux cheatsheet

Navigating the file system

Command : Command description
pwd : Shows the user's current location. The location is displayed as an absolute path to the current directory.
cd : Changes the current directory (cd - change directory).
cd - : Returns to the previous directory.
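The three navigation commands can be tried together in a throwaway directory. This is only a sketch; the directory names project and data are arbitrary examples.

```shell
# Work in a temporary directory so nothing in $HOME is touched.
base=$(mktemp -d)
mkdir -p "$base/project/data"

cd "$base/project"     # absolute path: go to the project directory
cd data                # relative path: now in project/data
here=$(pwd)            # absolute path of the current directory
cd - > /dev/null       # return to the previous directory (project)
back=$(pwd)
echo "$here -> $back"
```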


Directory management

Command : Command description
mkdir dir1 : Creates a directory named dir1.
mkdir -p /tmp/novi/dir1 : The -p option automatically creates any missing parent directories (here /tmp/novi).
rm -rf dir1/* : Deletes all files and subdirectories inside the directory dir1 (except hidden ones, which * does not match), leaving the directory dir1 itself in place.
rm -rf dir1/ : Deletes the directory dir1 together with all files and subdirectories in it.

...


Copying files and directories

Command : Command description
cp dat1 dat2 : Copies the file dat1 to dat2 (dat1 is unchanged).
cp dat1 dir/ : Copies the file dat1 into the directory dir.
cp -r dir1/* dir2/ : Copies all files and subdirectories from the directory dir1 into the directory dir2, without the directory dir1 itself.
cp -r dir1/ dir2/ : Copies the directory dir1, including all of its files and subdirectories, into the directory dir2.
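The difference between the two recursive forms is easiest to see side by side. The sketch below assumes GNU cp (other implementations may treat a trailing slash on the source differently) and uses an extra target directory dir3, introduced only for this illustration.

```shell
# Build a small test tree in a temporary directory.
base=$(mktemp -d)
mkdir -p "$base/dir1/sub" "$base/dir2" "$base/dir3"
echo "x" > "$base/dir1/dat1"

# Contents only: results in dir2/dat1 and dir2/sub
cp -r "$base/dir1/"* "$base/dir2/"

# Directory itself: results in dir3/dir1/dat1 and dir3/dir1/sub
cp -r "$base/dir1/" "$base/dir3/"

ls -R "$base/dir2" "$base/dir3"
```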


Move and rename files and directories

Command : Command description
mv dat1 dat2 : Renames the file dat1 to dat2.
mv dat1 dir1 : If dir1 is the name of an existing directory, moves the file dat1 into the directory dir1.


Password change

Command : Command description
passwd : Changes the password of the current user. The command first asks for the old password, and then asks for the new password (twice). Note: for security reasons, no text is printed in the terminal while the password is being entered.


Auto-fill and search of command history

Command : Command description
[Tab] : Command auto-completion. When typing a command, e.g. passwd, the user can type the first few letters (e.g. pass) and press the [Tab] key. The shell then either completes the command automatically or prints all commands that start with the string pass. File names on disk can be completed in the same way.
[Ctrl] + [r] : Searches the command history. In the terminal, hold down the [Ctrl] key and press the [r] key, then start typing letters from a command; previous commands that contain the typed letters appear. To cycle through all commands that contain the typed letters, press [Ctrl] + [r] again.


SGE cheatsheet

Job submit

Command : Command description
qsub : Submits a job and returns the job ID.


Checking job status

Command : Command description
qstat : Shows the status of jobs on the cluster for the current user.
qstat -f : Shows the status of jobs and nodes.
qstat -s rpsh : Filters jobs by state: p - jobs waiting in the queue; r - active jobs; s - temporarily stopped active jobs; h - jobs on hold in the queue.
qstat -g c : Displays a summary of the state of the individual job queues.
qstat -j <job_id> : Displays detailed information about one job.
qstat -u <user> : Displays the jobs of a specific user ("*" for all users).
qstat -pe <name> : Displays jobs that use the given parallel environment.
qstat -q <queue> : Displays jobs in the given job queue.
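The default qstat listing can also be summarised in a script. The sketch below counts running (r) and waiting (qw) jobs from a made-up sample; the job IDs, names, user and queue are hypothetical, and it assumes the usual layout of two header lines followed by one line per job with the state in the fifth column.

```shell
# Hypothetical qstat output, used here instead of a live cluster.
sample='job-ID  prior   name     user   state submit/start at     queue          slots ja-task-ID
-----------------------------------------------------------------------------------------
  101 0.55500 my_job   alice  r     01/15/2024 10:00:00 all.q@node1        1
  102 0.55500 my_job2  alice  qw    01/15/2024 10:05:00                    1
  103 0.55500 my_job3  alice  qw    01/15/2024 10:06:00                    1'

# In practice: sample=$(qstat)
# Skip the two header lines and count jobs by the state column.
running=$(printf '%s\n' "$sample" | awk 'NR > 2 && $5 == "r"  {n++} END {print n + 0}')
waiting=$(printf '%s\n' "$sample" | awk 'NR > 2 && $5 == "qw" {n++} END {print n + 0}')
echo "running: $running, waiting: $waiting"
```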


Stopping jobs

Command : Command description
qdel <job_id> : Stops the job completely or removes it from the queue.
qdel -u <user> : Stops all jobs of the specified user.
qdel -f <job_id> : Forces the removal of a stuck job.


Information about completed jobs

Command : Command description
qacct -j <job_id> : Detailed information about the job with ID <job_id>.
qacct -j : Detailed information about all jobs (a large amount of data).
qacct -j -o <user> : Information about all jobs of the given user.
qacct -o <user> : Resource consumption summary for the given user; if <user> is not given, data for all users is displayed.
qacct -slots [<count>] : Summary of all jobs that used the given number of slots; if <count> is not given, data for all values is displayed.
qacct -j -P <project> : Information about all jobs of the given project.
qacct -P <project> : Resource consumption summary for the given project; if <project> is not given, data for all projects is displayed.
qacct -j -q <queue> : Information about all jobs in the given job queue.
qacct -q <queue> : Resource consumption summary for the given job queue.