SLURM: Difference between revisions

From UMIACS

Revision as of 01:12, 30 July 2015

Simple Linux Utility for Resource Management

UMIACS is transitioning from our Torque/Maui batch resource manager to Slurm. Slurm is now broadly used in the regional and national supercomputing communities.

Terminology and command line changes are the biggest differences when coming from Torque/Maui to Slurm.

  • Torque queues are now called partitions in Slurm
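The same renaming applies to the everyday commands. As a rough orientation only, the lookup below lists commonly cited Torque-to-Slurm equivalents. The mapping is an assumption drawn from general Slurm usage rather than from this page, and `slurm_equivalent` is a hypothetical helper name.

```python
# Rough Torque -> Slurm equivalents. These pairs are assumed from general
# Slurm usage; they are not taken from this page and options differ.
TORQUE_TO_SLURM = {
    "qsub": "sbatch",                    # submit a batch script
    "qstat": "squeue",                   # list jobs
    "qdel": "scancel",                   # cancel a job
    "pbsnodes": "scontrol show nodes",   # inspect nodes
}


def slurm_equivalent(torque_command):
    """Return the closest Slurm command, or None if no mapping is listed."""
    return TORQUE_TO_SLURM.get(torque_command)


if __name__ == "__main__":
    for old in ("qsub", "qstat", "qdel", "pbsnodes"):
        print(old, "->", slurm_equivalent(old))
```

The equivalents are approximate: command-line options and output formats differ between the two systems.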

Commands

sinfo

To view partitions and nodes you can use the ```sinfo``` command.

# sinfo
PARTITION AVAIL  TIMELIMIT NODES  STATE NODELIST
debug*       up      30:00     2  down* adev[1-2]
debug*       up      30:00     3   idle adev[3-5]
batch        up      30:00     3  down* adev[6,13,15]
batch        up      30:00     3  alloc adev[7-8,14]
batch        up      30:00     4   idle adev[9-12]
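In this output, a trailing `*` on the partition name marks the default partition, and a trailing `*` on the state marks nodes that are not responding. If you want to summarize this listing in a script, a minimal sketch could look like the following. It assumes the default column layout shown above, and the function name `nodes_by_state` is hypothetical.

```python
from collections import defaultdict

# Sample text copied from the sinfo listing above.
SAMPLE_SINFO = """\
PARTITION AVAIL  TIMELIMIT NODES  STATE NODELIST
debug*       up      30:00     2  down* adev[1-2]
debug*       up      30:00     3   idle adev[3-5]
batch        up      30:00     3  down* adev[6,13,15]
batch        up      30:00     3  alloc adev[7-8,14]
batch        up      30:00     4   idle adev[9-12]
"""


def nodes_by_state(sinfo_text):
    """Return {partition: {state: node_count}} from default sinfo output."""
    lines = sinfo_text.strip().splitlines()
    header = lines[0].split()
    counts = defaultdict(dict)
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        # Strip the '*' markers so rows group under plain names.
        partition = row["PARTITION"].rstrip("*")
        state = row["STATE"].rstrip("*")
        previous = counts[partition].get(state, 0)
        counts[partition][state] = previous + int(row["NODES"])
    return dict(counts)


if __name__ == "__main__":
    for partition, states in nodes_by_state(SAMPLE_SINFO).items():
        print(partition, states)
```

On a real cluster the text would come from running `sinfo` rather than from the embedded sample.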

squeue

To show jobs in partitions, use the ```squeue``` command. By default it shows all jobs in all partitions. You can restrict the

# squeue
JOBID PARTITION  NAME  USER ST  TIME NODES NODELIST(REASON)
65646     batch  chem  mike  R 24:19     2 adev[7-8]
65647     batch   bio  joan  R  0:09     1 adev14
65648     batch  math  phil PD  0:00     6 (Resources)
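In the `ST` column, `R` means running and `PD` means pending. The sketch below filters the listing above by job state after the fact. It assumes the default column layout and job names without spaces, and `parse_squeue` and `jobs_in_state` are hypothetical helper names. According to its man page, squeue can also filter directly with options such as `--partition`, `--user`, and `--states`, which this page does not cover.

```python
# Sample text copied from the squeue listing above.
SAMPLE_SQUEUE = """\
JOBID PARTITION  NAME  USER ST  TIME NODES NODELIST(REASON)
65646     batch  chem  mike  R 24:19     2 adev[7-8]
65647     batch   bio  joan  R  0:09     1 adev14
65648     batch  math  phil PD  0:00     6 (Resources)
"""


def parse_squeue(text):
    """Turn default squeue output into a list of {column: value} dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    # Whitespace splitting only works while no field contains a space.
    return [dict(zip(header, line.split())) for line in lines[1:]]


def jobs_in_state(jobs, state):
    """Keep only jobs whose ST column equals the given state code."""
    return [job for job in jobs if job["ST"] == state]


if __name__ == "__main__":
    jobs = parse_squeue(SAMPLE_SQUEUE)
    for job in jobs_in_state(jobs, "PD"):
        print(job["JOBID"], job["USER"], job["NODELIST(REASON)"])
```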

srun

scancel

scontrol

You can get more detailed information on both nodes and partitions with the ```scontrol``` command.

To show more about partitions, run ```scontrol show partition```:

# scontrol show partition
PartitionName=debug TotalNodes=5 TotalCPUs=40 RootOnly=NO
   Default=YES Shared=FORCE:4 Priority=1 State=UP
   MaxTime=00:30:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[1-5] NodeIndices=0-4

PartitionName=batch TotalNodes=10 TotalCPUs=80 RootOnly=NO
   Default=NO Shared=FORCE:4 Priority=1 State=UP
   MaxTime=16:00:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[6-15] NodeIndices=5-14
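Each record above is a set of `Key=Value` pairs, with records separated by a blank line. As a minimal sketch, assuming that no value contains whitespace (true for the sample above, but not guaranteed for every scontrol record type), the records can be turned into dictionaries like this. The name `parse_scontrol_records` is hypothetical.

```python
import re

# Sample text copied from the scontrol listing above.
SAMPLE_SCONTROL = """\
PartitionName=debug TotalNodes=5 TotalCPUs=40 RootOnly=NO
   Default=YES Shared=FORCE:4 Priority=1 State=UP
   MaxTime=00:30:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[1-5] NodeIndices=0-4

PartitionName=batch TotalNodes=10 TotalCPUs=80 RootOnly=NO
   Default=NO Shared=FORCE:4 Priority=1 State=UP
   MaxTime=16:00:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[6-15] NodeIndices=5-14
"""


def parse_scontrol_records(text):
    """Split blank-line separated records into {key: value} dicts."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        record = {}
        for token in block.split():
            # Split on the first '=' only, so values keep any later '='.
            key, _, value = token.partition("=")
            record[key] = value
        records.append(record)
    return records


if __name__ == "__main__":
    for part in parse_scontrol_records(SAMPLE_SCONTROL):
        print(part["PartitionName"], part["MaxTime"], part["Nodes"])
```

All values stay as strings, so numeric fields such as `TotalCPUs` need an explicit `int()` conversion before arithmetic.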

To show more about nodes, run ```scontrol show nodes```.