SLURM: Difference between revisions
Revision as of 01:24, 30 July 2015
Simple Linux Utility for Resource Management
UMIACS is transitioning from the Torque/Maui batch resource manager to Slurm. Slurm is now widely used in the regional and national supercomputing communities.
The biggest differences when moving from Torque/Maui to Slurm are in terminology and command-line usage.
- Torque queues are now called partitions in Slurm
Commands
sinfo
To view partitions and nodes you can use the sinfo command.
# sinfo
PARTITION AVAIL  TIMELIMIT NODES  STATE NODELIST
debug*       up      30:00     2  down* adev[1-2]
debug*       up      30:00     3   idle adev[3-5]
batch        up      30:00     3  down* adev[6,13,15]
batch        up      30:00     3  alloc adev[7-8,14]
batch        up      30:00     4   idle adev[9-12]
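When a cluster has many partitions, it can help to total the NODES column per STATE. The following is a minimal sketch that assumes the default sinfo column layout shown above. The helper name nodes_by_state is hypothetical and not part of Slurm. It runs on text copied from the sample output so it can be tried without a cluster.

```shell
# Sum the NODES column (field 4) per STATE (field 5) from sinfo-style
# text read on stdin. nodes_by_state is a hypothetical helper name.
nodes_by_state() {
  awk 'NR > 1 { count[$5] += $4 } END { for (s in count) print s, count[s] }' | sort
}

# Sample text copied from the sinfo output above.
sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug* up 30:00 2 down* adev[1-2]
debug* up 30:00 3 idle adev[3-5]
batch up 30:00 3 down* adev[6,13,15]
batch up 30:00 3 alloc adev[7-8,14]
batch up 30:00 4 idle adev[9-12]'

summary=$(printf '%s\n' "$sample" | nodes_by_state)
printf '%s\n' "$summary"
```

On a live system the same helper would be fed by the real command, for example sinfo | nodes_by_state, assuming the default output format has not been changed by site configuration.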
squeue
To show jobs in partitions, use the squeue command. By default it shows all jobs in all partitions. You can restrict the
# squeue
JOBID PARTITION  NAME  USER ST  TIME NODES NODELIST(REASON)
65646     batch  chem  mike  R 24:19     2 adev[7-8]
65647     batch   bio  joan  R  0:09     1 adev14
65648     batch  math  phil PD  0:00     6 (Resources)
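squeue accepts filters such as -u (user), -p (partition), and -t (job state), so squeue -u "$USER" or squeue -p batch narrows the listing on a live cluster. The sketch below shows the equivalent of a pending-state filter applied to text copied from the sample output above. It assumes the default column layout, and the helper name pending_jobs is hypothetical.

```shell
# Print the JOBID of every job whose ST column (field 5) is PD (pending),
# reading squeue-style text on stdin. pending_jobs is a hypothetical name.
pending_jobs() {
  awk 'NR > 1 && $5 == "PD" { print $1 }'
}

# Sample text copied from the squeue output above.
sample='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
65646 batch chem mike R 24:19 2 adev[7-8]
65647 batch bio joan R 0:09 1 adev14
65648 batch math phil PD 0:00 6 (Resources)'

pending=$(printf '%s\n' "$sample" | pending_jobs)
printf '%s\n' "$pending"
```

On a cluster, squeue -t PENDING would normally be the more direct route. The awk version is only useful when working from saved output.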
srun
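To run a simple command like hostname as 4 tasks (-n4), with each output line labeled by its task number (-l):
srun -n4 -l hostname
Note that -n sets the number of tasks, not the number of nodes. Use -N to request a specific node count.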
scancel
scontrol
You can receive more thorough information on both nodes and partitions through the scontrol command.
To show more about partitions you can run scontrol show partition
# scontrol show partition
PartitionName=debug TotalNodes=5 TotalCPUs=40 RootOnly=NO
   Default=YES Shared=FORCE:4 Priority=1 State=UP
   MaxTime=00:30:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[1-5] NodeIndices=0-4

PartitionName=batch TotalNodes=10 TotalCPUs=80 RootOnly=NO
   Default=NO Shared=FORCE:4 Priority=1 State=UP
   MaxTime=16:00:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[6-15] NodeIndices=5-14
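Because scontrol prints its records as Key=Value pairs, single fields are easy to pull out with standard text tools. The following sketch extracts the MaxTime of each partition from text copied from the sample output above. The helper name partition_maxtime is hypothetical, and the approach assumes values contain no spaces, which holds for the fields shown here.

```shell
# Print "<partition> <MaxTime>" for each record in scontrol-style
# Key=Value text read on stdin. partition_maxtime is a hypothetical name.
partition_maxtime() {
  # Put every Key=Value pair on its own line, then track the current
  # partition name and emit it whenever a MaxTime field is seen.
  tr ' ' '\n' | awk -F= '
    $1 == "PartitionName" { name = $2 }
    $1 == "MaxTime"       { print name, $2 }'
}

# Sample text copied from the scontrol output above.
sample='PartitionName=debug TotalNodes=5 TotalCPUs=40 RootOnly=NO
   Default=YES Shared=FORCE:4 Priority=1 State=UP
   MaxTime=00:30:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[1-5] NodeIndices=0-4
PartitionName=batch TotalNodes=10 TotalCPUs=80 RootOnly=NO
   Default=NO Shared=FORCE:4 Priority=1 State=UP
   MaxTime=16:00:00 Hidden=NO
   MinNodes=1 MaxNodes=26 DisableRootJobs=NO AllowGroups=ALL
   Nodes=adev[6-15] NodeIndices=5-14'

limits=$(printf '%s\n' "$sample" | partition_maxtime)
printf '%s\n' "$limits"
```

On a live system the input would come from scontrol show partition | partition_maxtime instead of the sample variable.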
To show more about nodes you can run scontrol show nodes