Latest revision as of 19:12, 22 August 2024
=Job Status=
SLURM offers a variety of tools to check the status of your jobs before, during, and after execution. When you submit a job, SLURM returns a job ID that identifies the job and the resources allocated to it. Each call to <code>srun</code> within the job spawns a job step, and job steps can also be queried individually.
==squeue==
The squeue command shows job status in the queue. Helpful flags:
* <code>-u username</code> to show only your jobs (replace <tt>username</tt> with your UMIACS username)
* <code>--start</code> to estimate the start time for a job that has not yet started and show the reason why it is waiting
* <code>-s</code> to show the status of individual job steps for a job (e.g. batch jobs)
Examples:
<pre>
[username@nexusclip00 ~]$ squeue -u username
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               162      tron helloWor username  R       0:03      2 tron[00-01]
</pre>
<pre>
[username@nexusclip00 ~]$ squeue --start -u username
             JOBID PARTITION     NAME     USER ST          START_TIME  NODES SCHEDNODES           NODELIST(REASON)
               163      tron helloWo2 username PD 2020-05-11T18:36:49      1 tron02               (Priority)
</pre>
<pre>
[username@nexusclip00 ~]$ squeue -s -u username
         STEPID     NAME PARTITION     USER      TIME NODELIST
          162.0    sleep      tron username      0:05 tron00
          162.1    sleep      tron username      0:05 tron01
</pre>
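A common use of squeue is waiting for a job to leave the queue before starting follow-up work. The following is a minimal sketch of such a polling loop, not an official UMIACS tool: the function name <code>wait_for_job</code>, its arguments, and the 30 second default interval are assumptions made for illustration. It relies on <code>squeue -h -j <jobid> -o '%T'</code> printing only the job state, and treats empty output (or an error once the job has been purged) as "no longer queued".

```shell
#!/bin/sh
# Sketch: poll squeue until a job no longer appears in the queue.
# wait_for_job is a hypothetical helper, not part of SLURM.
wait_for_job() {
    jobid=$1
    interval=${2:-30}   # seconds between polls (assumed default)
    while :; do
        # -h drops the header line, -o '%T' prints only the job state.
        # squeue may exit non-zero once the job record is purged, so
        # treat any failure as "no state".
        state=$(squeue -h -j "$jobid" -o '%T' 2>/dev/null) || state=""
        if [ -z "$state" ]; then
            break
        fi
        echo "job $jobid: $state"
        sleep "$interval"
    done
    echo "job $jobid is no longer in the queue"
}
```

For example, <code>wait_for_job 162 60</code> would check job 162 once a minute. Polling too frequently puts load on the scheduler, so a generous interval is the safer choice.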
==sstat==
The sstat command shows metrics from currently running job steps. If you don't specify a job step, the lowest job step is displayed.
<pre>sstat --format JobID,NTasks,nodelist,MaxRSS,MaxVMSize,AveRSS,AveVMSize <$JOBID>.<$JOBSTEP></pre>
<pre>
[username@nexusclip00 ~]$ sstat --format JobID,NTasks,nodelist,MaxRSS,MaxVMSize,AveRSS,AveVMSize 171
       JobID   NTasks             Nodelist     MaxRSS  MaxVMSize     AveRSS  AveVMSize
------------ -------- -------------------- ---------- ---------- ---------- ----------
171.0               1               tron00          0    186060K          0    107900K
[username@nexusclip00 ~]$ sstat --format JobID,NTasks,nodelist,MaxRSS,MaxVMSize,AveRSS,AveVMSize 171.1
       JobID   NTasks             Nodelist     MaxRSS  MaxVMSize     AveRSS  AveVMSize
------------ -------- -------------------- ---------- ---------- ---------- ----------
171.1               1               tron01          0    186060K          0    107900K
</pre>
Note that if your job does not have any job steps, sstat will return an error.
<pre>
[username@nexusclip00 ~]$ sstat --format JobID,NTasks,nodelist,MaxRSS,MaxVMSize,AveRSS,AveVMSize 172
       JobID   NTasks             Nodelist     MaxRSS  MaxVMSize     AveRSS  AveVMSize
------------ -------- -------------------- ---------- ---------- ---------- ----------
sstat: error: no steps running for job 172
</pre>
If you do not run any <code>srun</code> commands, you will not create any job steps and metrics will not be available for your job. Job steps are numbered from 0 in the order they are launched. Your batch scripts should follow this format:
<pre>
#!/bin/bash
#SBATCH ...
#SBATCH ...

# set environment up
module load ...

# launch job steps
srun <command to run> # that would be step 0
srun <command to run> # that would be step 1
</pre>
==sacct==
The sacct command shows metrics from past jobs.
<pre>
[username@nexusclip00 ~]$ sacct
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
162          helloWorld       tron      nexus          2  COMPLETED      0:0
162.batch         batch                 nexus          1  COMPLETED      0:0
162.0             sleep                 nexus          1  COMPLETED      0:0
162.1             sleep                 nexus          1  COMPLETED      0:0
163          helloWorld       tron      nexus          2  COMPLETED      0:0
163.batch         batch                 nexus          1  COMPLETED      0:0
163.0             sleep                 nexus          1  COMPLETED      0:0
</pre>
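When the job history is long, it can help to show only the records that did not finish cleanly. The sketch below is one possible filter, assuming pipe-delimited input as produced by <code>sacct -n -P -o JobID,State,ExitCode</code> (<code>-n</code> suppresses the header, <code>-P</code> selects parsable output). The function name <code>not_completed</code> is hypothetical.

```shell
#!/bin/sh
# Sketch: print "JobID State ExitCode" for every record whose state is
# not COMPLETED. Reads pipe-delimited lines on stdin, in the column
# order JobID|State|ExitCode. not_completed is a hypothetical helper.
not_completed() {
    awk -F'|' 'NF && $2 != "COMPLETED" { print $1, $2, $3 }'
}
```

Usage would look like <code>sacct -n -P -o JobID,State,ExitCode | not_completed</code>. Empty output means every listed job and job step completed.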
To check one specific job, you can run something like the following (if you omit .<$JOBSTEP>, all jobsteps will be shown):
<pre>sacct --format JobID,jobname,NTasks,nodelist,MaxRSS,MaxVMSize,AveRSS,AveVMSize,Elapsed -j <$JOBID>.<$JOBSTEP></pre>
<pre>
[username@nexusclip00 ~]$ sacct --format JobID,jobname,NTasks,nodelist,MaxRSS,MaxVMSize,AveRSS,AveVMSize,Elapsed -j 171
       JobID    JobName   NTasks        NodeList     MaxRSS  MaxVMSize     AveRSS  AveVMSize    Elapsed
------------ ---------- -------- --------------- ---------- ---------- ---------- ---------- ----------
171          helloWorld              tron[00-01]                                               00:00:30
171.batch         batch        1          tron00          0    119784K          0    113120K   00:00:30
171.0             sleep        1          tron00          0    186060K          0    107900K   00:00:30
171.1             sleep        1          tron01          0    186060K          0    107900K   00:00:30
</pre>
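The memory columns (MaxRSS, MaxVMSize, AveRSS, AveVMSize) are reported with a unit suffix such as <tt>K</tt>, which can be awkward to compare against a memory request given in megabytes or gigabytes. The following sketch converts one such value to MiB. It assumes the suffixes <tt>K</tt>, <tt>M</tt>, <tt>G</tt> and <tt>T</tt> are powers of 1024 and that a value with no suffix is in bytes; the function name <code>to_mib</code> is hypothetical.

```shell
#!/bin/sh
# Sketch: convert a sstat/sacct memory value such as "186060K" to MiB.
# Assumptions: K/M/G/T are powers of 1024; no suffix means bytes.
# to_mib is a hypothetical helper, not part of SLURM.
to_mib() {
    printf '%s\n' "$1" | awk '
        {
            value = $1 + 0                  # numeric prefix, e.g. 186060
            unit = substr($1, length($1))   # last character, e.g. "K"
            if (unit == "K")      mib = value / 1024
            else if (unit == "M") mib = value
            else if (unit == "G") mib = value * 1024
            else if (unit == "T") mib = value * 1024 * 1024
            else                  mib = value / (1024 * 1024)
            printf "%.2f\n", mib
        }'
}
```

With the output above, <code>to_mib 186060K</code> prints <tt>181.70</tt>, so step 171.0 peaked at roughly 182 MiB of virtual memory.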
=Job Codes=
If you list your current jobs and one of them is in the <code>PD</code> (Pending) state, SLURM reports the reason it is waiting in the NODELIST(REASON) column. You can use <code>scontrol show job <jobid></code> to get all the parameters for your job to help identify why it is not running.
<pre>
[username@nexusclip00 ~]$ squeue -u username
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 1      tron     bash username PD       0:00      1 (AssocGrpGRES)
                 2      tron     bash username PD       0:00      1 (Resources)
                 3      tron     bash username PD       0:00      1 (Priority)
                 4      tron     bash username PD       0:00      1 (QOSMaxGRESPerUser)
                 5      tron     bash username PD       0:00      1 (ReqNodeNotAvail, Reserved for maintenance)
</pre>
Some common ones are as follows:
* <code>Resources</code> - The cluster does not currently have the resources to fit your job in your selected partition.
* <code>Priority</code> - The cluster has reserved resources for higher [[SLURM/Priority | priority]] jobs in your selected partition.
* <code>QOSMax*PerUser</code> or <code>QOSMax*PerUserLimit</code> - The quality of service (QoS) your job is requesting to use has some limit per user (CPU, mem, GRES, etc.). Use <code>show_qos</code> and <code>show_partition_qos</code> to identify the limit(s) and then use <code>scontrol show job <jobid></code> for each of your jobs running in that QoS to see the resources they are currently consuming.
* <code>AssocGrpBilling</code> - The SLURM account you are using has a limit on the overall billing amount available in total for the account. Use <code>sacctmgr show assoc account=<accountname> where user=</code> to identify the limit, replacing <tt><accountname></tt> with the account you are submitting your job with. You can see all jobs running under the account and their billing values by running <code>squeue -A <accountname> -O "JobId:.18 ,Partition:.9 ,Name:.8 ,UserName:.8 ,StateCompact:.2 ,TimeUsed:.10 ,NumNodes:.6 ,ReasonList:45 ,tres-alloc:80"</code>. The billing value will be part of the <tt>tres-alloc</tt> string for each job.
* <code>ReqNodeNotAvail</code> - None of the nodes that could run your job (based on requested partition/resources) currently have the resources to fit your job. Alternatively, if you also see <code>Reserved for maintenance</code>, there is a reservation in place (often for a [[MonthlyMaintenanceWindow | maintenance window]]). You can see the current reservations by running <code>scontrol show reservation</code>. Often the culprit is that you have requested a TimeLimit that will conflict with the reservation. You can either lower your TimeLimit such that the job will complete before the reservation begins, or leave your job to wait until the reservation completes.
SLURM's full list of reasons/explanations can be found [https://slurm.schedmd.com/job_reason_codes.html here].
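If you have many pending jobs, a per-reason tally makes it easier to see which of the reason codes above is holding most of them back. The sketch below counts pending jobs by reason from the default <code>squeue</code> output read on stdin. It assumes the default eight-column layout shown earlier (state in column 5, reason starting in column 8) and job names without spaces; the function name <code>pending_reasons</code> is hypothetical.

```shell
#!/bin/sh
# Sketch: tally pending jobs by reason from default squeue output.
# Assumes the default columns: JOBID PARTITION NAME USER ST TIME NODES
# NODELIST(REASON), and job names that contain no spaces.
# pending_reasons is a hypothetical helper, not part of SLURM.
pending_reasons() {
    awk 'NR > 1 && $5 == "PD" {
            # The reason may contain spaces, so join field 8 onwards.
            reason = $8
            for (i = 9; i <= NF; i++) reason = reason " " $i
            gsub(/^\(|\)$/, "", reason)   # strip surrounding parentheses
            count[reason]++
        }
        END { for (r in count) print r ": " count[r] }' | sort
}
```

Usage would look like <code>squeue -u username | pending_reasons</code>, which prints one line per reason followed by the number of pending jobs reporting it.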