Quality of Service (QOS)

Each job submitted to Slurm is assigned a Quality of Service (QOS). The QOS determines the job’s priority (which affects how soon the job starts), specifies whether the job can interrupt other jobs when there are no free resources, may set certain resource limits (e.g. maximum number of CPUs or maximum running time), and can also affect billing rates (for example, twice or half the CPU time actually used may be billed).

Important

On the Komondor, three QOSs are defined: “normal”, “lowpri” and “high”. Due to the current workload of the cluster, these QOSs do not yet differ in billing rates. This may change in the future as the load continues to increase. Please check the actual cluster settings before you set a QOS manually.

Checking Available QOSs

QOSs Defined on the Cluster

All QOS definitions that exist on the cluster can be listed with the following command:

$ sacctmgr list qos

You can customize the output by specifying a comma-separated list of the required field names as a format option. The following command shows the important properties of the current QOS definitions on the Komondor:

$ sacctmgr list qos format=Name,Priority,GraceTime,PreemptMode,UsageFactor,Preempt

      Name   Priority  GraceTime PreemptMode UsageFactor    Preempt
---------- ---------- ---------- ----------- ----------- ----------
    normal       1000   00:01:00     cluster    1.000000
    lowpri        100   00:01:00     requeue    1.000000
      high      10000   00:00:00     cluster    1.000000

As the output above shows, each QOS assigns a different priority to a submitted job (in favor of “high” jobs), but the billing rate (“UsageFactor”) is the same for all QOSs. As the empty “Preempt” list indicates, none of the QOSs can currently interrupt a running job. If this were not the case, “PreemptMode” shows the mechanism that would be used to preempt jobs of the given QOS. Interrupted jobs would have the time specified in the “GraceTime” field to prepare for the interruption (e.g. to save their state).

For more information on the available fields, see the “Show QOS Format Options” section in the sacctmgr manual, or read the man page locally (type man sacctmgr in the terminal).
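The aligned table is awkward to process in scripts. The sketch below assumes the -n (no header) and -P (pipe-delimited output) options of sacctmgr, and uses a hypothetical helper named highest_priority_qos that is not part of Slurm. The helper reads Name|Priority lines from standard input, so it can also be tried on sample data without access to the cluster.

```shell
# Hypothetical helper: reads "Name|Priority" lines on stdin and prints
# the name of the QOS with the numerically highest priority.
highest_priority_qos() {
    sort -t '|' -k 2,2nr | head -n 1 | cut -d '|' -f 1
}

# Query the live cluster only if sacctmgr is available.
if command -v sacctmgr >/dev/null 2>&1; then
    sacctmgr -nP list qos format=Name,Priority | highest_priority_qos
fi
```

Fed with the three priorities from the table above (for example via printf 'normal|1000\nlowpri|100\nhigh|10000\n'), the helper prints high.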

QOSs Available to Your Account

On the Komondor, the “lowpri” and “normal” QOSs are currently available to projects by default, and the default QOS (the QOS a job gets if the user does not set one explicitly) is “normal”.

You can check the QOSs that are available to your account / user and the default QOS with the following command:

$ sacctmgr list assoc account=$project format=Cluster,Account,User,QOS,DefaultQOS

   Cluster    Account       User                  QOS   Def QOS
---------- ---------- ---------- -------------------- ---------
  komondor   research                   lowpri,normal    normal
  komondor   research      alice        lowpri,normal    normal

where the value of the $project variable is the name of your project (account) and the required information is selected by adding the QOS and DefaultQOS field names to the output format specification.
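Before submitting with a non-default QOS, a script may want to verify that the QOS is actually allowed. The following is a minimal sketch under a few assumptions: qos_allowed is a hypothetical helper (not a Slurm command), the -n and -P options of sacctmgr are used to get header-free, pipe-delimited output, and the user= filter is assumed to restrict the listing to your own association.

```shell
# Hypothetical helper: succeeds if the QOS name $1 occurs in the
# comma-separated QOS list $2 (as printed in the QOS column).
qos_allowed() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Query the live cluster only if sacctmgr is available and $project is set.
if command -v sacctmgr >/dev/null 2>&1 && [ -n "${project:-}" ]; then
    allowed=$(sacctmgr -nP list assoc account="$project" user="${USER:-}" format=QOS | head -n 1)
    if qos_allowed lowpri "$allowed"; then
        echo "lowpri is available"
    fi
fi
```

Wrapping both the name and the list in commas makes the match exact, so a partial name such as low does not match lowpri.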

Default QOS

On the Komondor, the default QOS is “normal”.

Manual QOS

You can request a QOS other than the default with the --qos= option of the “sbatch”, “salloc” and “srun” commands, or with an #SBATCH directive in the sbatch script. For example, to set the QOS to “lowpri”, specify the following directive in the sbatch script:

#SBATCH --qos=lowpri
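For context, a complete batch script using this directive might look like the sketch below. The job name and time limit are hypothetical placeholders, and the script relies on the SLURM_JOB_QOS environment variable that Slurm sets inside a job; outside of Slurm it falls back to a placeholder value.

```shell
#!/bin/bash
# Hypothetical job name and time limit, for illustration only.
#SBATCH --job-name=qos-demo
#SBATCH --time=00:10:00
# Request the "lowpri" QOS instead of the default.
#SBATCH --qos=lowpri

# SLURM_JOB_QOS is set by Slurm inside the job; use a placeholder
# when the script is run outside of Slurm.
job_qos="${SLURM_JOB_QOS:-unknown}"
echo "Running with QOS: ${job_qos}"
```

The same QOS can be requested on the command line, for example with sbatch --qos=lowpri job.sh (job.sh being a hypothetical script name); options given on the command line take precedence over directives in the script.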

For more information on QOS, see the “Quality of Service (QOS)” section in the Slurm documentation.