SLURM/Priority: Difference between revisions

Latest revision as of 16:30, 25 November 2025

SLURM at UMIACS is configured to prioritize jobs based on a number of factors, termed multifactor priority in SLURM. Each job submitted to the scheduler is assigned a priority value, which can be viewed in the output of scontrol show job <jobid>.

Example:

$ scontrol show job 1
JobId=1 JobName=bash
   UserId=username(13337) GroupId=username(13337) MCS_label=N/A
   Priority=2000841 Nice=0 Account=nexus QOS=default
...

Pending Jobs

If the partition that you submit your job to cannot start it immediately because no compute node(s) in the partition have the resources free to run it, your job will remain in the Pending state with the listed reason (Resources). If another job is already pending with this reason when you submit a job to the same partition, and your job is assigned a lower priority value than that pending job, your job will instead remain in the Pending state with reason (Priority). If there are multiple jobs pending and your job is not the highest priority job pending, the scheduler will only begin execution of your job if doing so would not delay the start times of any higher priority jobs in the same partition.

Lowering some combination of the resources you are requesting and/or the time limit may allow submitted jobs to start sooner, or even instantly, when a partition is under resource pressure. The command squeue -j <jobid> --start provides an estimate of when your job will start, where <jobid> is the job ID you receive from either srun or sbatch. This estimate is subject to change depending on whether other users' jobs end sooner or more jobs get submitted.

You can use the command alias show_available_nodes with a variety of different submission arguments to get a better idea of what jobs may be able to begin sooner, but the output of this command alias is not definitive, for reasons mentioned in the footnotes of the show_available_nodes section of the SLURM/JobSubmission page.

Priority Factors

The priority factors in use at UMIACS are, from most-heavily to least-heavily weighted:

Partition job was submitted to
Fair-share of resources within SLURM account
Age of job, i.e., time spent waiting to run in the queue
Association/SLURM account being used
"Nice" value that job was submitted with

Partition

The partitions whose names are or are prefixed with scavenger on our clusters are always in a lower priority tier and always have lower priority factors for their jobs than all other partitions on that cluster. As mentioned in other UMIACS cluster-specific documentation, jobs submitted to these partitions are also preemptable. These two design choices give the partitions their names; jobs submitted to scavenger named or prefixed partitions "scavenge" for available resources on the cluster rather than consume dedicated resources, and are interrupted by jobs asking to consume dedicated resources.

On Nexus, labs/centers may also have their own scavenger partitions, i.e., <labname>-scavenger, if the faculty for the lab/center have decided upon some sort of limit on jobs, such as number of simultaneous jobs, number of actively consumed billing resources, etc., in their non-scavenger partitions. These lab/center scavenger partitions allow for more jobs to be run by members of that lab/center on that lab's/center's nodes only, but jobs on these partitions are preemptable by jobs in that lab's/center's non-scavenger partitions and/or account-specific partitions, if any account-specific partitions containing a given node exist. Jobs submitted to lab/center scavenger partitions will preempt jobs submitted to the institute-wide scavenger partitions (running on nodes that are also in those lab/center scavenger partitions).

In decreasing order of priority (highest first), our priority tiers for partitions are:

1. Priority access account-specific partitions
2. Account-specific partitions
3. Lab/center-specific and institute-wide non-"scavenger" named partitions
4. Lab/center-specific "scavenger" named partitions
5. Institute-wide "scavenger" named partitions

A job in a specific priority tier will never have a higher priority value than any job in a higher priority tier. Corresponding to the above tiers, in the same order, the priority values that you will see for jobs in each tier are:

1. >= 4000000
2. 3000000 to 3999999
3. 2000000 to 2999999
4. 1000000 to 1999999
5. < 1000000
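
As a quick illustration of how the ranges above map to tiers, the following minimal sketch reads the Priority field from a line of scontrol show job output and reports the matching tier. The function names and the TIERS table are illustrative only and are not part of any UMIACS or SLURM tooling; the boundaries are copied from the ranges listed above.

```python
import re

# Lower bound of each tier's priority range, highest tier first.
# Boundaries are copied from the ranges listed above.
TIERS = [
    (4_000_000, "Priority access account-specific partitions"),
    (3_000_000, "Account-specific partitions"),
    (2_000_000, 'Lab/center-specific and institute-wide non-"scavenger" named partitions'),
    (1_000_000, 'Lab/center-specific "scavenger" named partitions'),
    (0, 'Institute-wide "scavenger" named partitions'),
]


def extract_priority(scontrol_line: str) -> int:
    """Pull the integer after 'Priority=' out of a line of scontrol output."""
    match = re.search(r"\bPriority=(\d+)", scontrol_line)
    if match is None:
        raise ValueError("no Priority= field found")
    return int(match.group(1))


def priority_tier(priority: int) -> str:
    """Return the name of the tier whose range contains the given value."""
    for lower_bound, name in TIERS:
        if priority >= lower_bound:
            return name
    raise ValueError("priority must be non-negative")


# Line taken from the scontrol example near the top of this page.
line = "   Priority=2000841 Nice=0 Account=nexus QOS=default"
print(priority_tier(extract_priority(line)))
```

With the example job from the top of this page (Priority=2000841), the value falls in the third range.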

As such, jobs on specific nodes in some non-"scavenger" named partitions may also be subject to preemption based on these priority tiers. Generally speaking, though, most nodes are only in one partition in one of the first three (non-"scavenger") priority tiers, and then in an institute-wide "scavenger" named partition, and a lab/center-specific "scavenger" named partition, if one exists for the lab/center that a given node is a part of.

Fair-share

The more resources your jobs have already consumed within an account, the lower priority factor your future jobs will have when compared to jobs from other users in the same account who have used fewer resources (so as to "fair-share" with other users). Additionally, if there are multiple accounts that can submit to a partition, and the sum of resources of all users' jobs within account A is greater than the sum of resources of all users' jobs within account B, all future jobs from users in account A will have a lower priority factor than all future jobs from users in account B. (In other words, fair-share is hierarchical.)

You can view the various fair-share statistics with the command sshare -l. It will show your specific FairShare values (always between 0.0 and 1.0) within accounts that you have access to. You can also view other accounts' Level Fairshare (LevelFS).

Account                    User  RawShares  NormShares    RawUsage   NormUsage  EffectvUsage  FairShare    LevelFS                    GrpTRESMins                    TRESRunMins
-------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ---------- ------------------------------ ------------------------------
root                                          0.000000 68444174744                  1.000000                                                      cpu=4797787,mem=70530109515,e+
 cbcb-heng                               1    0.028571  2041564795    0.029831      0.029831              0.957779                                cpu=107606,mem=2890646050,ene+
 cbcb                                    1    0.028571  4454658377    0.065046      0.065046              0.439246                                cpu=452139,mem=22276633804,en+
 class                                   1    0.028571   255617290    0.003733      0.003733              7.652841                                cpu=7021,mem=74554606,energy=+
 clip                                    1    0.028571  3057933838    0.044674      0.044674              0.639549                                cpu=33214,mem=2744443460,ener+
 cml-abhinav                             1    0.028571    39726437    0.000580      0.000580             49.220844                                cpu=0,mem=0,energy=0,node=0,b+
 cml-director                            1    0.028571   672692955    0.009829      0.009829              2.906778                                cpu=65977,mem=1080976998,ener+
 cml-furongh                             1    0.028571   765366431    0.011183      0.011183              2.554814                                cpu=78802,mem=1513011200,ener+
 cml-hajiagha                            1    0.028571    12397205    0.000181      0.000181            157.726572                                cpu=0,mem=0,energy=0,node=0,b+
 cml-ramani                              1    0.028571           0    0.000000      0.000000            1.3575e+11                                cpu=0,mem=0,energy=0,node=0,b+
 cml-scavenger                           1    0.028571  2564733604    0.037475      0.037475              0.762406                                cpu=370867,mem=3568986794,ene+
 cml-sfeizi                              1    0.028571    40715178    0.000595      0.000595             48.025548                                cpu=0,mem=0,energy=0,node=0,b+
 cml-tokekar                             1    0.028571    26458249    0.000387      0.000387             73.903937                                cpu=23240,mem=178488320,energ+
 cml-tomg                                1    0.028571       82800    0.000001      0.000001            2.3615e+04                                cpu=0,mem=0,energy=0,node=0,b+
 cml-wriva-high                          1    0.028571           0    0.000000      0.000000                   inf                                cpu=0,mem=0,energy=0,node=0,b+
 cml-wriva                               1    0.028571  1047428702    0.015305      0.015305              1.866828                                cpu=5366,mem=137386666,energy+
 cml                                     1    0.028571    66866114    0.000975      0.000975             29.299389                                cpu=1796,mem=29426756,energy=+
 csd-sarahwie                            1    0.028571   710055035    0.007034      0.007034              4.061956                                cpu=0,mem=0,energy=0,node=0,b+
 gamma                                   1    0.028571  2609474948    0.038129      0.038129              0.749334                                cpu=34089,mem=360373862,energ+
 mbrc                                    1    0.028571    73411964    0.001073      0.001073             26.635560                                cpu=1195,mem=4896358,energy=0+
 mc2                                     1    0.028571     2682557    0.000039      0.000039            728.919551                                cpu=0,mem=0,energy=0,node=0,b+
 nexus                                   1    0.028571  5472794067    0.079964      0.079964              0.357302                                cpu=278464,mem=3250599000,ene+
  nexus                username          1    0.000835       69666    0.000001      0.000021   0.457407  37.435501                                cpu=0,mem=0,energy=0,node=0,b+
 oasis                                   1    0.028571      330030    0.000005      0.000005            5.9248e+03                                cpu=0,mem=0,energy=0,node=0,b+
 quics                                   1    0.028571           4    0.000000      0.000000            4.1683e+08                                cpu=0,mem=0,energy=0,node=0,b+
 scavenger                               1    0.028571 40888195964    0.597419      0.597419              0.047825                                cpu=3142204,mem=29902903931,e+
  scavenger            username          1    0.000835         171    0.000000      0.000000   0.033975 9.8885e+04                                cpu=0,mem=0,energy=0,node=0,b+
 vulcan-abhinav                          1    0.028571  1742123078    0.025452      0.025452              1.122544                                cpu=26166,mem=781842841,energ+
 vulcan-djacobs                          1    0.028571      912637    0.000013      0.000013            2.1425e+03                                cpu=0,mem=0,energy=0,node=0,b+
 vulcan-jbhuang                          1    0.028571   283430750    0.004141      0.004141              6.898930                                cpu=172,mem=5661764,energy=0,+
 vulcan-metzler                          1    0.028571   262070179    0.003829      0.003829              7.461241                                cpu=0,mem=0,energy=0,node=0,b+
 vulcan-rama                             1    0.028571           0    0.000000      0.000000            1.6942e+14                                cpu=0,mem=0,energy=0,node=0,b+
 vulcan-ramani                           1    0.028571   681333557    0.009955      0.009955              2.869914                                cpu=22188,mem=568033280,energ+
 vulcan-yaser                            1    0.028571      397309    0.000006      0.000006            4.9215e+03                                cpu=0,mem=0,energy=0,node=0,b+
 vulcan-zwicker                          1    0.028571   133204932    0.001946      0.001946             14.679402                                cpu=0,mem=0,energy=0,node=0,b+
 vulcan                                  1    0.028571  1247236491    0.018224      0.018224              1.567761                                cpu=147273,mem=1161243818,ene+
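
For readers who want to sanity-check the LevelFS column, SLURM's Fair Tree documentation describes it as an association's normalized shares divided by its effective usage among its siblings. The minimal sketch below applies that ratio to three account rows copied from the table above. The function name is illustrative, and because the table shows rounded values, the computed numbers are only expected to agree with the reported ones approximately.

```python
import math


def level_fs(norm_shares: float, effective_usage: float) -> float:
    """Shares-to-usage ratio; treated as infinite when there is no usage."""
    if effective_usage == 0:
        return math.inf
    return norm_shares / effective_usage


# (NormShares, EffectvUsage, reported LevelFS) copied from the table above.
rows = {
    "cbcb-heng": (0.028571, 0.029831, 0.957779),
    "nexus": (0.028571, 0.079964, 0.357302),
    "scavenger": (0.028571, 0.597419, 0.047825),
}

for account, (shares, usage, reported) in rows.items():
    computed = level_fs(shares, usage)
    print(f"{account}: computed {computed:.4f}, reported {reported:.4f}")
```

A LevelFS above 1.0 means the account has used less than its share relative to its siblings; a value below 1.0 means it has used more.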

The actual resource billing weights for the three main resources (memory per GB, CPU cores, and number of GPUs if applicable) are per-partition and can be viewed in the TRESBillingWeights line in the output of scontrol show partition. The billing value for a job is the sum of all resource weightings for resources the job has requested. This value is then multiplied by the amount of time a job has run in seconds to get the amount it contributes to the RawUsage for the association within the account it is running under.
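
The arithmetic described above can be sketched as follows. This is a minimal illustration that assumes each requested resource contributes its weight multiplied by the requested amount; the weights and the job request are hypothetical and are not taken from any real partition's TRESBillingWeights. It also ignores any decay of past usage that the scheduler may apply.

```python
def job_billing(requested: dict, weights: dict) -> float:
    """Sum of (billing weight * requested amount) over the billed resources."""
    return sum(weights[name] * amount for name, amount in requested.items())


def raw_usage_contribution(billing: float, runtime_seconds: int) -> float:
    """Billing value multiplied by the time the job has run, in seconds."""
    return billing * runtime_seconds


# Hypothetical per-partition weights (per CPU core, per GB of memory, per GPU).
weights = {"cpu": 4.0, "mem_gb": 1.0, "gpu": 64.0}

# Hypothetical job request: 4 cores, 32 GB of memory, 1 GPU.
request = {"cpu": 4, "mem_gb": 32, "gpu": 1}

billing = job_billing(request, weights)
usage = raw_usage_contribution(billing, runtime_seconds=3600)
print(f"billing={billing}, contribution after one hour={usage}")
```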

Algorithm

The algorithm we use for resource weightings differs depending on whether there are any GPUs in a partition, and is as follows:

GPU partitions

Each resource (memory/CPU/GPU) is given a weighting value such that their relative billings to each other within the partition are equal (33.33% each). Memory is typically the most abundant resource by unit (weighting value of 1.0 per GB) and the CPU/GPU values are adjusted accordingly.

Different GPU types may also be weighted differently within the GPU relative billing. A baseline GPU type is first chosen. All GPUs of that type and other types that have lower FP32 performance (in TFLOPS) are given a weighting factor of 1.0. GPU types with higher FP32 performance than the baseline GPU are given a weighting factor calculated by dividing their FP32 performance by the baseline GPU's FP32 performance. The weighting values for each GPU type are then determined by normalizing the sum of all GPU cards' billing values multiplied by their weighting factors against the relative billing percentage for GPUs (33.33%).

The current baseline GPU is the NVIDIA RTX A4000.
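
One way to read the GPU-partition description above is sketched below. It is an interpretation, not the exact procedure used at UMIACS: it assumes memory is fixed at 1.0 per GB, and that the CPU and GPU weights are scaled so that the partition's total CPU billing and total GPU billing each equal its total memory billing. The partition size, GPU type names, card counts, and TFLOPS figures are all hypothetical.

```python
def gpu_partition_weights(total_mem_gb, total_cpus, gpus, baseline_tflops):
    """Return (memory weight, CPU weight, per-GPU-type weights).

    gpus maps a GPU type name to (card count, FP32 TFLOPS).
    """
    # Memory is weighted 1.0 per GB, so its total billing is total_mem_gb.
    target = total_mem_gb * 1.0

    # CPU weight chosen so that all cores together bill the same as memory.
    cpu_weight = target / total_cpus

    # Types at or below the baseline get 1.0; faster types scale by FP32 ratio.
    factors = {
        name: max(1.0, tflops / baseline_tflops)
        for name, (_, tflops) in gpus.items()
    }

    # Normalize so that all cards together bill the same as memory.
    weighted_cards = sum(count * factors[name] for name, (count, _) in gpus.items())
    per_factor_unit = target / weighted_cards
    gpu_weights = {name: factors[name] * per_factor_unit for name in gpus}
    return 1.0, cpu_weight, gpu_weights


# Hypothetical partition: 2048 GB of memory, 256 cores, three GPU types.
gpus = {
    "baseline": (8, 20.0),
    "faster": (4, 40.0),
    "slower": (4, 10.0),
}
mem_w, cpu_w, gpu_w = gpu_partition_weights(2048, 256, gpus, baseline_tflops=20.0)
print(mem_w, cpu_w, gpu_w)
```

In this sketch the "slower" type is billed the same as the baseline type, while the "faster" type is billed at twice the baseline rate, matching the factor rule described above.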

CPU-only partitions

Each resource (memory/CPU) is first given a weighting value such that their relative billings to each other within the partition are equal (50% each). Memory is typically the most abundant resource by unit (weighting value of 1.0 per GB) and the CPU value is adjusted accordingly. The final CPU weight value is then divided by 10, which ends up translating to roughly 90.9% of the billing weight being for memory and 9.1% being for CPU. The division of the CPU value is done so as to not affect accounts' fair-share priority factors as much when running CPU-only jobs given the popularity of GPGPU computing.
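
The 90.9% / 9.1% split follows directly from the division by 10, as the minimal sketch below shows. The partition size is hypothetical and the function name is illustrative only.

```python
def cpu_only_weights(total_mem_gb: float, total_cpus: int):
    """Return (memory weight per GB, CPU weight per core) for a CPU-only partition."""
    mem_weight = 1.0
    # Weight at which all cores together would bill the same as all memory.
    balanced_cpu_weight = (total_mem_gb * mem_weight) / total_cpus
    # The CPU weight is then divided by 10, as described above.
    return mem_weight, balanced_cpu_weight / 10


# Hypothetical partition: 1024 GB of memory and 128 cores.
total_mem_gb, total_cpus = 1024, 128
mem_w, cpu_w = cpu_only_weights(total_mem_gb, total_cpus)

mem_billing = mem_w * total_mem_gb
cpu_billing = cpu_w * total_cpus
mem_share = mem_billing / (mem_billing + cpu_billing)
print(f"memory share: {mem_share:.1%}, CPU share: {1 - mem_share:.1%}")
```

Whatever the partition size, the ratio of memory billing to CPU billing is 10 to 1, so memory accounts for 10/11 of the total and CPU for 1/11.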

Age

The longer a job is eligible to run but cannot, either because resources are unavailable or because it has a lower priority value than one or more other jobs, the higher the job's priority becomes as it continues to wait in the queue. This is the only priority modifier that can change a job's priority value once it has been submitted, and the priority modifier for this factor reaches its limit after 7 days.
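
A minimal sketch of the 7-day limit follows. It assumes the age factor grows linearly with eligible waiting time and is capped at 1.0, which is how SLURM's multifactor priority documentation describes the age factor relative to the configured maximum age. The function name is illustrative, and the weight that this factor is multiplied by at UMIACS is not shown here.

```python
MAX_AGE_SECONDS = 7 * 24 * 3600  # the 7-day limit described above


def age_factor(eligible_seconds: float) -> float:
    """Fraction of the maximum age reached, clamped to the range 0.0 to 1.0."""
    return min(max(eligible_seconds, 0) / MAX_AGE_SECONDS, 1.0)


for days in (0, 1, 3.5, 7, 14):
    print(f"{days} days eligible -> age factor {age_factor(days * 24 * 3600):.3f}")
```

Under this assumption, a job that has been eligible for 14 days gains nothing more from age than one that has been eligible for 7 days.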

Association

Some lab/center-specific SLURM accounts may have priority values directly attached to them. Jobs run under these accounts gain this many extra points of priority.

Nice value

This is a submission argument that you as the user can include when submitting your jobs to deprioritize them. Larger values will deprioritize jobs more, e.g.,

srun --pty --nice=2 bash

will have lower priority than

srun --pty --nice=1 bash

which will have lower priority than

srun --pty bash

assuming all three jobs were submitted at the same time. You cannot use negative values for this argument.

Because this value is absolute, if you want to use it, we recommend using only small numbers (one or two digits). Larger numbers may impact your job's ability to run at all based on the other factors at play.

@@ Line 1: / Line 1: @@
-[[SLURM]] at UMIACS is configured to prioritize jobs based on a number of factors, termed [https://slurm.schedmd.com/priority_multifactor.html multifactor priority] in SLURM.
+[[SLURM]] at UMIACS is configured to prioritize jobs based on a number of factors, termed [https://slurm.schedmd.com/priority_multifactor.html multifactor priority] in SLURM. Each job submitted to the scheduler is assigned a priority value, which can be viewed in the output of <code>scontrol show job <jobid></code>.
-These factors include:
+Example:
-* Age of job i.e. time spent waiting to run in the queue
+<pre>
+$ scontrol show job 1
+JobId=1 JobName=bash
+   UserId=username(13337) GroupId=username(13337) MCS_label=N/A
+   Priority=2000841 Nice=0 Account=nexus QOS=default
+...
+</pre>
+==Pending Jobs==
+If the partition that you submit your job to cannot begin your job instantly due to no compute node(s) in the partition having the resources free to run it, your job will remain in the Pending state with the listed reason <tt>(Resources)</tt>. If there is another job already pending with this reason, you submit a job to the same partition, and your job gets assigned a lower priority value than that pending job, your job will instead remain in the Pending state with reason <tt>(Priority)</tt>. If there are multiple jobs pending and your job is not the highest priority job pending, the scheduler will only begin execution of your job if starting your job would not push the begin times for any higher priority jobs in the same partition further back.
+Lowering some combination of the resources you are requesting and/or the time limit may allow submitted jobs to run more quickly or instantly during times where a partition is under resource pressure. The command <code>squeue -j <jobid> --start</code> can be used to provide a time estimate for when your job will start, where <jobid> is the job ID you receive from either srun or sbatch. This time is subject to change depending on if other users' jobs end sooner or more jobs get submitted.
+You can use the command alias <code>[[SLURM/JobSubmission#show_available_nodes | show_available_nodes]]</code> with a variety of different submission arguments to get a better idea of what jobs may be able to begin sooner, but the output of this command alias is not definitive, for reasons mentioned in the footnotes on the page linked to.
+==Priority Factors==
+The priority factors in use at UMIACS are, from most-heavily to least-heavily weighted:
 * Partition job was submitted to
-* Fair-share of resources
+* Fair-share of resources within SLURM account
+* Age of job, i.e., time spent waiting to run in the queue
+* Association/SLURM account being used
 * "Nice" value that job was submitted with
-==Age==
+===Partition===
-The longer a job is eligible to run but cannot due to all available resources being taken up increases the job's priority to be scheduled as time goes on. The priority modifier for this factor reaches its limit after 7 days.
+The partitions whose names are or are prefixed with <code>scavenger</code> on our clusters are always in a lower priority tier and always have lower priority factors for their jobs than all other partitions on that cluster. As mentioned in other UMIACS cluster-specific documentation, jobs submitted to these partitions are also [https://slurm.schedmd.com/preempt.html preemptable]. These two design choices give the partitions their names; jobs submitted to <code>scavenger</code> named or prefixed partitions "scavenge" for available resources on the cluster rather than consume dedicated resources, and are interrupted by jobs asking to consume dedicated resources.
-==Partition==
+On [[Nexus]], labs/centers may also have their own scavenger partitions, i.e., <code><labname>-scavenger</code>, if the faculty for the lab/center have decided upon some sort of limit on jobs, such as number of simultaneous jobs, number of actively consumed billing resources, etc., in their non-scavenger partitions. These lab/center scavenger partitions allow for more jobs to be run by members of that lab/center on that lab's/center's nodes only, but jobs on these partitions are preemptable by jobs in that lab's/center's non-scavenger partitions and/or account-specific partitions, if any account-specific partitions containing a given node exist. Jobs submitted to lab/center scavenger partitions will preempt jobs submitted to the institute-wide scavenger partitions (running on nodes that are also in those lab/center scavenger partitions).
-The partition named <code>scavenger</code> on each of our clusters always has a lower priority factor for its jobs than all other partitions on that cluster. As mentioned in other UMIACS cluster-specific documentation, jobs submitted to this partition are also [https://slurm.schedmd.com/preempt.html preemptable]. These two design choices give the partition its name; jobs submitted to the <code>scavenger</code> partition "scavenge" for available resources on the cluster rather than consume a dedicated chunk of resources and are interrupted by jobs seeking to consume dedicated chunks.
-All other partitions on our clusters have the same priority factor.
+In decreasing order of priority (highest first), our priority tiers for partitions are:
+# Priority access account-specific partitions
+# Account-specific partitions
+# Lab/center-specific and institute-wide non-"scavenger" named partitions
+# Lab/center-specific "scavenger" named partitions
+# Institute-wide "scavenger" named partitions
-==Fair-share==
+A job in a specific priority tier will never have a higher priority value than any job in a higher priority tier. Corresponding to the above tiers, the priority values that you will see for jobs in each tier:
-The more resources your jobs have already consumed within an account, the lower priority factor your future jobs will have when compared to other users' jobs in the same account who have used fewer resources (so as to "fair-share" with other users). Additionally, if there are multiple accounts that can submit to a partition, and the sum of resources of all users' jobs within account A is greater than the sum of resources of all users' jobs within account B, the lower priority factor all future jobs from users in account A will have when compared to all future jobs from users in account B.
+# >= 4000000
+# 3000000 to 3999999
+# 2000000 to 2999999
+# 1000000 to 1999999
+# < 1000000
-You can view the various fair-share statistics with the command <code>sshare -l</code>. It will show you your specific FairShare values (always between 0.0 and 1.0) within accounts that you have access to. You can also view other accounts' Level Fairshare (LevelFS).
+As such, '''jobs on specific nodes in some non-"scavenger" named partitions may also be subject to preemption''' based on these priority tiers. Generally speaking, though, most nodes are only in one partition in one of the first three (non-"scavenger") priority tiers, and then in an institute-wide "scavenger" named partition, and a lab/center-specific "scavenger" named partition, if one exists for the lab/center that a given node is a part of.
+===Fair-share===
+The more resources your jobs have already consumed within an account, the lower priority factor your future jobs will have when compared to other users' jobs in the same account who have used fewer resources (so as to "fair-share" with other users). Additionally, if there are multiple accounts that can submit to a partition, and the sum of resources of all users' jobs within account A is greater than the sum of resources of all users' jobs within account B, the lower priority factor all future jobs from users in account A will have when compared to all future jobs from users in account B. (In other words, fair-share is hierarchical.)
+You can view the various fair-share statistics with the command <code>sshare -l</code>. It will show your specific FairShare values (always between 0.0 and 1.0) within accounts that you have access to. You can also view other accounts' Level Fairshare (LevelFS).
 <pre>
 Account                    User  RawShares  NormShares    RawUsage   NormUsage  EffectvUsage  FairShare    LevelFS                    GrpTRESMins                    TRESRunMins
 -------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ---------- ------------------------------ ------------------------------
-root                                          0.000000 13357781570                  1.000000                                                      cpu=994689,mem=8706484555,ene+
+root                                          0.000000 68444174744                  1.000000                                                      cpu=4797787,mem=70530109515,e+
-  cbcb                                    1    0.111111    26568079    0.001990      0.001990             55.826073                                cpu=581,mem=76242397,energy=0+
+ cbcb-heng                               1    0.028571  2041564795    0.029831      0.029831              0.957779                                cpu=107606,mem=2890646050,ene+
-  class                                   1    0.111111    71647791    0.005367      0.005367             20.701148                                cpu=0,mem=0,energy=0,node=0,b+
+  cbcb                                    1    0.028571  4454658377    0.065046      0.065046              0.439246                                cpu=452139,mem=22276633804,en+
-  clip                                    1    0.111111   985905301    0.073844      0.073844              1.504667                                cpu=13533,mem=63760930,energy+
+ class                                   1    0.028571   255617290    0.003733      0.003733              7.652841                                cpu=7021,mem=74554606,energy=+
-  gamma                                   1    0.111111   819825375    0.061416      0.061416              1.809155                                cpu=250117,mem=1128084138,ene+
+ clip                                    1    0.028571  3057933838    0.044674      0.044674              0.639549                                cpu=33214,mem=2744443460,ener+
-  mc2                                     1    0.111111          11    0.000000      0.000000            1.2606e+08                                cpu=0,mem=0,energy=0,node=0,b+
+ cml-abhinav                             1    0.028571    39726437    0.000580      0.000580             49.220844                                cpu=0,mem=0,energy=0,node=0,b+
-  nexus                                   1    0.111111  2632111243    0.197035      0.197035              0.563914                                cpu=170772,mem=2035642767,ene+
+ cml-director                            1    0.028571   672692955    0.009829      0.009829              2.906778                                cpu=65977,mem=1080976998,ener+
-   nexus                username          1    0.000829         308    0.000000      0.000000   0.470629 7.0587e+03                                cpu=0,mem=0,energy=0,node=0,b+
+ cml-furongh                             1    0.028571   765366431    0.011183      0.011183              2.554814                                cpu=78802,mem=1513011200,ener+
-  scavenger                               1    0.111111  8821718910    0.660346      0.660346              0.168262                                cpu=559683,mem=5402754321,ene+
+ cml-hajiagha                            1    0.028571    12397205    0.000181      0.000181            157.726572                                cpu=0,mem=0,energy=0,node=0,b+
-   scavenger            username          1    0.000829           0    0.000000      0.000000   0.419187        inf                                cpu=0,mem=0,energy=0,node=0,b+
+ cml-ramani                              1    0.028571           0    0.000000      0.000000            1.3575e+11                                cpu=0,mem=0,energy=0,node=0,b+
-  staff                                   1    0.111111           0    0.000000      0.000000                   inf                                cpu=0,mem=0,energy=0,node=0,b+
+ cml-scavenger                           1    0.028571  2564733604    0.037475      0.037475              0.762406                                cpu=370867,mem=3568986794,ene+
+ cml-sfeizi                              1    0.028571    40715178    0.000595      0.000595             48.025548                                cpu=0,mem=0,energy=0,node=0,b+
+  cml-tokekar                             1    0.028571    26458249    0.000387      0.000387             73.903937                                cpu=23240,mem=178488320,energ+
+ cml-tomg                                1    0.028571       82800    0.000001      0.000001            2.3615e+04                                cpu=0,mem=0,energy=0,node=0,b+
+  cml-wriva-high                          1    0.028571           0    0.000000      0.000000                   inf                                cpu=0,mem=0,energy=0,node=0,b+
+ cml-wriva                               1    0.028571  1047428702    0.015305      0.015305              1.866828                                cpu=5366,mem=137386666,energy+
+ cml                                     1    0.028571    66866114    0.000975      0.000975             29.299389                                cpu=1796,mem=29426756,energy=+
+ csd-sarahwie                            1    0.028571   710055035    0.007034      0.007034              4.061956                                cpu=0,mem=0,energy=0,node=0,b+
+  gamma                                   1    0.028571  2609474948    0.038129      0.038129              0.749334                                cpu=34089,mem=360373862,energ+
+ mbrc                                    1    0.028571    73411964    0.001073      0.001073             26.635560                                cpu=1195,mem=4896358,energy=0+
+  mc2                                     1    0.028571     2682557    0.000039      0.000039            728.919551                                cpu=0,mem=0,energy=0,node=0,b+
+  nexus                                   1    0.028571  5472794067    0.079964      0.079964              0.357302                                cpu=278464,mem=3250599000,ene+
+   nexus                username          1    0.000835       69666    0.000001      0.000021   0.457407  37.435501                                cpu=0,mem=0,energy=0,node=0,b+
+ oasis                                   1    0.028571      330030    0.000005      0.000005            5.9248e+03                                cpu=0,mem=0,energy=0,node=0,b+
+ quics                                   1    0.028571           4    0.000000      0.000000            4.1683e+08                                cpu=0,mem=0,energy=0,node=0,b+
+  scavenger                               1    0.028571 40888195964    0.597419      0.597419              0.047825                                cpu=3142204,mem=29902903931,e+
+   scavenger            username          1    0.000835         171    0.000000      0.000000   0.033975 9.8885e+04                                cpu=0,mem=0,energy=0,node=0,b+
+ vulcan-abhinav                          1    0.028571  1742123078    0.025452      0.025452              1.122544                                cpu=26166,mem=781842841,energ+
+ vulcan-djacobs                          1    0.028571      912637    0.000013      0.000013            2.1425e+03                                cpu=0,mem=0,energy=0,node=0,b+
+ vulcan-jbhuang                          1    0.028571   283430750    0.004141      0.004141              6.898930                                cpu=172,mem=5661764,energy=0,+
+ vulcan-metzler                          1    0.028571   262070179    0.003829      0.003829              7.461241                                cpu=0,mem=0,energy=0,node=0,b+
+  vulcan-rama                             1    0.028571           0    0.000000      0.000000            1.6942e+14                                cpu=0,mem=0,energy=0,node=0,b+
+ vulcan-ramani                           1    0.028571   681333557    0.009955      0.009955              2.869914                                cpu=22188,mem=568033280,energ+
+ vulcan-yaser                            1    0.028571      397309    0.000006      0.000006            4.9215e+03                                cpu=0,mem=0,energy=0,node=0,b+
+ vulcan-zwicker                          1    0.028571   133204932    0.001946      0.001946             14.679402                                cpu=0,mem=0,energy=0,node=0,b+
+ vulcan                                  1    0.028571  1247236491    0.018224      0.018224              1.567761                                cpu=147273,mem=1161243818,ene+
 </pre>
-The actual resource weightings for the three main resources (memory per GB, CPU cores, and GPUs if applicable) are per-partition and can be viewed in the <code>TRESBillingWeights</code> line in the output of <code>scontrol show partition</code>. The <code>billing</code> value for a job is the sum of all resource weightings for resources the job has requested. This value is then multiplied by the amount of time a job has run in seconds to get the amount it contributes to the RawUsage for the association within the account it is running under.
+The actual resource billing weights for the three main resources (memory per GB, CPU cores, and number of GPUs if applicable) are set per partition and can be viewed in the <code>TRESBillingWeights</code> line in the output of <code>scontrol show partition</code>. The <code>billing</code> value for a job is the sum, over the resources the job has requested, of each requested amount multiplied by that resource's billing weight. This value is then multiplied by the job's run time in seconds to get the amount it contributes to the RawUsage for the association within the account it is running under.
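As a rough illustration of the billing arithmetic described above, the following sketch computes a job's <code>billing</code> value and its contribution to RawUsage. The weights and the job request are made-up numbers, not values from any UMIACS partition, and the function names are hypothetical. The sketch assumes billing is the plain sum of requested amount times weight, and it ignores any decay of past usage that the scheduler may be configured to apply.

```python
# Hypothetical per-partition billing weights (NOT real UMIACS values).
weights = {"mem_gb": 1.0, "cpu": 8.0, "gpu": 64.0}


def billing(request, weights):
    """Sum of each requested amount multiplied by its billing weight."""
    return sum(weights[name] * amount for name, amount in request.items())


def raw_usage_contribution(request, weights, runtime_seconds):
    """Billing value multiplied by the job's run time in seconds."""
    return billing(request, weights) * runtime_seconds


# Hypothetical job: 32 GB of memory, 4 CPU cores, 1 GPU, running for one hour.
job = {"mem_gb": 32, "cpu": 4, "gpu": 1}
print(billing(job, weights))                       # 32*1 + 4*8 + 1*64 = 128.0
print(raw_usage_contribution(job, weights, 3600))  # 128 * 3600 = 460800.0
```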
-There are two main algorithms we use for resource weightings, per cluster:
-===Modern===
-This weighting algorithm is soon to be in use on the following clusters:
-* [[CML]] (after 2/23/2023)
-* [[Nexus]] (after 2/23/2023)
-Resource have algorithmically computed floating point billing values.
-====GPU-capable partitions====
+====Algorithm====
-Each resource (memory/CPU/GPU) is given a weighting value such that their relative billings to each other are equal (33.33% each). The values are then rounded to whole numbers. Memory is typically always the most abundant resource (weighting value of 1.0) and the CPU/GPU values are adjusted accordingly.
+The algorithm we use to compute resource weightings depends on whether the partition contains any GPUs:
-Different GPU types may also be weighted differently within the GPU relative billing. A baseline GPU type is first chosen for each cluster. All GPUs of that type and other types that have lower FP32 performance (in [https://en.wikipedia.org/wiki/FLOPS TFLOPS], rounded to one decimal place) are given a weighting factor of 1.0. GPU types with higher FP32 performance than the baseline GPU are given a weighting factor calculated by dividing their FP32 performance by the baseline GPU's performance, rounded to two decimal places (i.e. as a percentage). The weighting values for each GPU type are then determined by normalizing the sum of all of GPU cards of different types multiplied by their weighting factors against the relative billing percentage. The values are then rounded to whole numbers.
+=====GPU partitions=====
+Each resource (memory/CPU/GPU) is given a weighting value such that their relative billings to each other within the partition are equal (33.33% each). Memory is typically the most abundant resource by unit (weighting value of 1.0 per GB), and the CPU/GPU values are adjusted accordingly.
-The current baseline GPUs per cluster are:
+Different GPU types may also be weighted differently within the GPU relative billing. A baseline GPU type is first chosen. All GPUs of that type, and of other types that have lower FP32 performance (in [https://en.wikipedia.org/wiki/FLOPS TFLOPS]), are given a weighting factor of 1.0. GPU types with higher FP32 performance than the baseline GPU are given a weighting factor calculated by dividing their FP32 performance by the baseline GPU's FP32 performance. The weighting values for each GPU type are then determined by normalizing the sum of all GPU cards' billing values multiplied by their weighting factors against the relative billing percentage for GPUs (33.33%).
-* CML (after 2/23/2023): NVIDIA RTX A4000
-* Nexus (after 2/23/2023): NVIDIA RTX A4000
-====CPU-only partitions====
+The current baseline GPU is the [https://www.nvidia.com/en-us/design-visualization/rtx-a4000/ NVIDIA RTX A4000].
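The GPU partition weighting can be sketched as follows. This is one possible reading of the description above, not the script actually used to generate the weights: the function name, the partition totals, the GPU type names, and the TFLOPS figures are all hypothetical. The sketch assumes memory is the reference resource at 1.0 per GB and that the CPU total and the GPU total are each scaled to match the memory total.

```python
def gpu_partition_weights(total_mem_gb, total_cpus, gpus, baseline_tflops):
    """Return (mem_weight, cpu_weight, per-type GPU weights).

    gpus maps a GPU type name to (card_count, fp32_tflops).
    """
    mem_weight = 1.0  # per GB; memory is the reference resource
    mem_total = total_mem_gb * mem_weight

    # CPU cores as a whole bill the same total as memory.
    cpu_weight = mem_total / total_cpus

    # Types at or below the baseline get factor 1.0; faster types get
    # their FP32 performance divided by the baseline's.
    factors = {
        name: max(1.0, tflops / baseline_tflops)
        for name, (_, tflops) in gpus.items()
    }

    # Normalize so that all GPU cards together bill the same total as memory.
    weighted_cards = sum(count * factors[name] for name, (count, _) in gpus.items())
    unit = mem_total / weighted_cards
    gpu_weights = {name: unit * factors[name] for name in gpus}
    return mem_weight, cpu_weight, gpu_weights


# Hypothetical partition: 1024 GB of memory, 128 cores, three GPU types.
example_gpus = {"slow": (4, 10.0), "base": (4, 20.0), "fast": (4, 40.0)}
mem_w, cpu_w, gpu_w = gpu_partition_weights(1024, 128, example_gpus, baseline_tflops=20.0)
print(mem_w, cpu_w, gpu_w)
```

With these inputs, memory, CPU, and GPU each account for the same total billing across the partition, and the "fast" type is billed at twice the rate of the other two.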
-Each resource (memory/CPU) is first given a weighting value such that their relative billings to each other are equal (50% each). The values are then rounded to whole numbers. Memory is typically always the most abundant resource (weighting value of 1.0) and the CPU value is adjusted accordingly. The final CPU weight value is then divided by 10, which ends up translating to roughly 90.9% of the billing weight being for memory and 9.1% being for CPU. This is done so as to not affect accounts' fair-share priority factors as much when running CPU-only jobs given the popularity of GPU computing.
-===Legacy===
+=====CPU-only partitions=====
-This weighting algorithm is currently in use on all clusters not mentioned in the previous section. These clusters will eventually either fold into [[Nexus]] or have the modern algorithm introduced in the future.
+Each resource (memory/CPU) is first given a weighting value such that their relative billings to each other within the partition are equal (50% each). Memory is typically the most abundant resource by unit (weighting value of 1.0 per GB), and the CPU value is adjusted accordingly. The final CPU weight value is then divided by 10, which translates to roughly 90.9% of the billing weight being for memory and 9.1% being for CPU. This division is done so that running CPU-only jobs affects accounts' fair-share priority factors less, given the popularity of GPGPU computing.
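The 90.9% / 9.1% split follows from the division by 10, as the sketch below shows. The partition size and the function name are hypothetical, and this is an illustration of the arithmetic rather than the actual generation script.

```python
def cpu_only_weights(total_mem_gb, total_cpus):
    """Return (mem_weight, cpu_weight) for a CPU-only partition."""
    mem_weight = 1.0  # per GB; memory is the reference resource
    # Weight at which all CPU cores would bill the same total as memory (50/50).
    balanced_cpu_weight = total_mem_gb * mem_weight / total_cpus
    # The final CPU weight is the balanced weight divided by 10.
    return mem_weight, balanced_cpu_weight / 10


# Hypothetical partition: 2048 GB of memory and 256 cores.
mem_w, cpu_w = cpu_only_weights(total_mem_gb=2048, total_cpus=256)
mem_total = 2048 * mem_w
cpu_total = 256 * cpu_w
mem_share = mem_total / (mem_total + cpu_total)
print(round(100 * mem_share, 1), round(100 * (1 - mem_share), 1))  # 90.9 9.1
```

Because the CPU total ends up at one tenth of the memory total, memory's share is 10/11 and CPU's share is 1/11 regardless of the partition's size.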
-Resources have fixed floating point billing values.
+===Age===
+The longer a job is eligible to run but cannot, either because resources are unavailable or because it has a lower priority value than one or more other jobs, the higher the job's priority becomes as it continues to wait in the queue. This is the only priority modifier that can change a job's priority value once it has been submitted, and it reaches its limit after 7 days.
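A minimal sketch of how an age modifier with a 7-day limit behaves, assuming the modifier grows linearly with eligible waiting time and then stays flat. The <code>age_weight</code> value is a made-up number, not the weight configured at UMIACS, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86400
MAX_AGE_SECONDS = 7 * SECONDS_PER_DAY  # the 7-day limit stated above


def age_priority(eligible_seconds, age_weight):
    """Priority points from age: linear ramp, capped once the limit is reached."""
    factor = min(eligible_seconds / MAX_AGE_SECONDS, 1.0)
    return age_weight * factor


# Hypothetical age weight of 1000 points.
print(age_priority(3.5 * SECONDS_PER_DAY, 1000))  # 500.0 (halfway to the limit)
print(age_priority(10 * SECONDS_PER_DAY, 1000))   # 1000.0 (capped at 7 days)
```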
-====GPU-capable partitions====
+===Association===
-Memory is billed at 0.125 per GB, CPU is billed at 1.0 per core, and GPU is billed at 4.0 per card.
+Some lab/center-specific SLURM accounts may have priority values directly attached to them. Jobs run under these accounts gain this many extra points of priority.
-====CPU-only partitions====
+===Nice value===
-Memory is billed at 0.125 per GB and CPU is billed at 0.1 per core. The lower CPU weighting is done so as to not affect accounts' fair-share priority factors as much when running CPU-only jobs given the popularity of GPU computing.
+This is a submission argument that you as the user can include when submitting your jobs to deprioritize them. Larger values will deprioritize jobs more, e.g.,
+<pre>srun --pty --nice=2 bash</pre>
-==Nice value==
-This is a submission argument that you as the user can include when submitting your jobs to deprioritize them. Larger values will deprioritize jobs e.g.,
-<pre>srun --pty --qos=default --mem 1gb --time=01:00:00 --nice=2 bash</pre>
 will have lower priority than
-<pre>srun --pty --qos=default --mem 1gb --time=01:00:00 --nice=1 bash</pre>
+<pre>srun --pty --nice=1 bash</pre>
 which will have lower priority than
-<pre>srun --pty --qos=default --mem 1gb --time=01:00:00 bash</pre>
+<pre>srun --pty bash</pre>
 assuming all three jobs were submitted at the same time. You cannot use negative values for this argument.
+Because this value is absolute, we recommend using only small numbers (one or two digits) if you use it at all. Larger numbers may prevent your job from running at all, depending on the other factors at play.
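To illustrate why the value is described as absolute, the sketch below treats the nice value as a fixed number of points subtracted from a job's priority. The base priority is a made-up number, the function name is hypothetical, and all three jobs are assumed to be otherwise identical.

```python
def priority_with_nice(base_priority, nice=0):
    """Subtract the nice value from the priority; negative values are rejected."""
    if nice < 0:
        raise ValueError("negative nice values cannot be used")
    return base_priority - nice


# Three otherwise identical jobs with a hypothetical base priority of 2000.
jobs = {"--nice=2": 2, "--nice=1": 1, "(no nice)": 0}
ranked = sorted(jobs, key=lambda name: priority_with_nice(2000, jobs[name]), reverse=True)
print(ranked)  # ['(no nice)', '--nice=1', '--nice=2']
```

Since the subtraction does not scale with the job's other priority factors, a large nice value can outweigh them entirely.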
