SLURM/Priority
Revision as of 19:38, 27 January 2023
SLURM at UMIACS is configured to prioritize jobs based on a number of factors, termed multifactor priority in SLURM.
These factors include:
- Age of job, i.e., time spent waiting to run in the queue
- Partition job was submitted to
- Fair-share of resources
- "Nice" value that job was submitted with
Age
The longer a job remains eligible to run but cannot start because all available resources are in use, the higher its priority to be scheduled becomes. The priority modifier for this factor reaches its limit after 7 days.
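As a rough illustration of how this cap behaves, the sketch below computes an age factor in thousandths (0 to 1000). It assumes the factor grows linearly with waiting time up to the 7-day limit; the helper name age_factor_permille is hypothetical and is not a SLURM command, and the actual contribution to a job's priority also depends on weights configured on the cluster.

```shell
# Illustrative sketch only: age_factor_permille is a hypothetical helper,
# not part of SLURM. Assumes the age factor grows linearly until the cap.

MAX_AGE_SECONDS=$((7 * 24 * 60 * 60))   # the 7-day limit described above

age_factor_permille() {
    waited=$1                            # seconds the job has been eligible
    if [ "$waited" -gt "$MAX_AGE_SECONDS" ]; then
        waited=$MAX_AGE_SECONDS          # no further gain past the limit
    fi
    echo $(( waited * 1000 / MAX_AGE_SECONDS ))
}

age_factor_permille $((3 * 24 * 60 * 60 + 12 * 60 * 60))   # 3.5 days: prints 500
age_factor_permille $((10 * 24 * 60 * 60))                  # 10 days: prints 1000
```

A job that has waited 10 days therefore gets no more benefit from this factor than one that has waited 7 days.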
Partition
The partition named scavenger on each of our clusters always has a lower priority factor for its jobs than all other partitions on that cluster. As mentioned in other UMIACS cluster-specific documentation, jobs submitted to this partition are also preemptable. These two design choices give the partition its name: jobs submitted to the scavenger partition "scavenge" for available resources on the cluster rather than consuming a dedicated chunk of resources, and they are interrupted by jobs that do seek dedicated resources.
All other partitions on our clusters have the same priority factor.
Nice value
This is a submission argument that you, as the user, can include in your jobs to deprioritize them relative to one another. Larger values deprioritize a job more, e.g.,
srun --pty --qos=default --mem 1gb --time=01:00:00 --nice=2 bash
will have lower priority than
srun --pty --qos=default --mem 1gb --time=01:00:00 --nice=1 bash
which will have lower priority than
srun --pty --qos=default --mem 1gb --time=01:00:00 bash
assuming all three jobs were submitted at the same time. You cannot use negative values for this argument.
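If you generate submission commands from a script, one way to respect the non-negative restriction is to validate the value before passing it to srun. The following is a minimal sketch; nice_flag is a hypothetical helper written for this example and is not part of SLURM.

```shell
# Illustrative sketch only: nice_flag is a hypothetical helper, not a SLURM
# command. It prints a --nice argument for a non-negative integer and
# fails for anything else (including negative values).

nice_flag() {
    case $1 in
        ''|*[!0-9]*)
            echo "nice value must be a non-negative integer: '$1'" >&2
            return 1
            ;;
    esac
    echo "--nice=$1"
}

# Possible use on a submission node:
#   srun --pty --qos=default --mem 1gb --time=01:00:00 $(nice_flag 2) bash
nice_flag 2    # prints --nice=2
```

Rejecting the value in the script gives a clear error message before anything is submitted.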