SLURM/ClusterStatus

From UMIACS
Revision as of 13:14, 11 July 2016 by Tgray26

Cluster Status

Use the sinfo and scontrol commands to view the general status of the nodes and partitions in a cluster.

sinfo

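sinfo shows the status of the partitions in the cluster. With the -N flag, it lists each node individually, so a node that belongs to several partitions appears once per partition. An asterisk after a partition name marks the default partition.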

tgray26@shadosub:sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
test         up   infinite      2    mix shado[00-01]
test         up   infinite      7   idle shado[02-08]
test2*       up   infinite      2    mix shado[00-01]
test2*       up   infinite      3   idle shado[02-04]
tgray26@shadosub:sinfo -N
NODELIST   NODES PARTITION STATE 
shado00        1      test mix   
shado00        1    test2* mix   
shado01        1      test mix   
shado01        1    test2* mix   
shado02        1      test idle  
shado02        1    test2* idle 
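
The node-oriented listing above can be summarized with a short filter. The following is only a sketch: count_states is a hypothetical helper name (not part of Slurm), and it assumes the four-column layout shown above (NODELIST, NODES, PARTITION, STATE). The sample rows are copied from the listing so that the script runs without a cluster.

```shell
#!/bin/sh
# Sketch: count nodes per (partition, state) from `sinfo -N`-style output.
# Assumes the four-column layout shown above: NODELIST NODES PARTITION STATE.

# count_states is a hypothetical helper name, not part of Slurm.
count_states() {
    # Skip the header line, tally each "partition state" pair, then sort
    # in the C locale so the order is stable across systems.
    awk 'NR > 1 && NF >= 4 { n[$3 " " $4]++ }
         END { for (k in n) print k, n[k] }' | LC_ALL=C sort
}

# Sample rows copied from the listing above, so the sketch runs without Slurm.
sample='NODELIST   NODES PARTITION STATE
shado00        1      test mix
shado00        1    test2* mix
shado01        1      test mix
shado01        1    test2* mix
shado02        1      test idle
shado02        1    test2* idle'

printf '%s\n' "$sample" | count_states
```

On a cluster, the same function could presumably be fed live data with `sinfo -N | count_states`. sinfo itself also accepts -p to restrict the listing to one partition and -t to restrict it to given node states, which may make a filter like this unnecessary for simple queries.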

scontrol

The scontrol command, while generally reserved for administrator use, can also be used to view the status and configuration of the nodes in the cluster. If you pass a specific node name, only information about that node is displayed; otherwise, all nodes are listed.

tgray26@shadosub:scontrol show nodes shado00
NodeName=shado00 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=shado00 NodeHostName=shado00 Version=16.05
   OS=Linux RealMemory=15885 AllocMem=0 FreeMem=12187 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=49975 Weight=1 Owner=N/A MCS_label=N/A
   BootTime=2016-06-23T20:25:41 SlurmdStartTime=2016-07-10T13:33:29
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
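
Because the output above is a series of Key=Value pairs, a single field can be pulled out with standard text tools. The following is a minimal sketch, not a Slurm feature: node_field is a hypothetical helper name, and it assumes that values contain no spaces, which holds for the sample above but may not hold for every field on every Slurm version. The sample lines are copied from the output above so that the script runs without a cluster.

```shell
#!/bin/sh
# Sketch: extract one Key=Value field from `scontrol show nodes` output.
# Assumption: values contain no whitespace (true for the sample above).

# node_field is a hypothetical helper name, not part of Slurm.
# Usage: <scontrol output> | node_field KEY
node_field() {
    # Put every Key=Value pair on its own line, then print the value
    # of each pair whose key matches $1 exactly.
    tr -s ' \t' '\n\n' |
        awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2) }'
}

# Sample lines copied from the output above.
sample='NodeName=shado00 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.01
   OS=Linux RealMemory=15885 AllocMem=0 FreeMem=12187 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=49975 Weight=1 Owner=N/A MCS_label=N/A'

printf '%s\n' "$sample" | node_field State
printf '%s\n' "$sample" | node_field FreeMem
```

On a cluster, this could be used as `scontrol show nodes shado00 | node_field State`. When no node name is given and all nodes are listed, the function prints one value per node.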