SLURM/ClusterStatus: Difference between revisions
Revision as of 17:38, 12 July 2016
Cluster Status
The general status of nodes/partitions in a cluster can be viewed using the sinfo and scontrol commands.
sinfo
sinfo shows the status of the partitions in the cluster. An asterisk after a partition name marks the default partition. Passing the -N flag lists each node individually.
<pre>
tgray26@opensub00:sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
dpart*    up    infinite      8 idle  openlab[00-07]
gpu       up    infinite      2 idle  openlab[08-09]
</pre>
<pre>
tgray26@opensub00:sinfo -N
NODELIST  NODES PARTITION STATE
openlab00     1 dpart*    idle
openlab01     1 dpart*    idle
openlab02     1 dpart*    idle
openlab03     1 dpart*    idle
openlab04     1 dpart*    idle
openlab05     1 dpart*    idle
openlab06     1 dpart*    idle
openlab07     1 dpart*    idle
openlab08     1 gpu       idle
openlab09     1 gpu       idle
</pre>
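If you want to use the sinfo output in a script, the column layout shown above is easy to split on whitespace. The following is a minimal Python sketch, not an official tool: the helper names parse_sinfo and expand_nodelist are made up for illustration, the sample text is copied from the output above, and the range expansion only handles the simple prefix[lo-hi] form seen on this page. Real Slurm host lists can be more complex, and scontrol show hostnames is the safer way to expand them on a live cluster.

```python
import re

# Sample text copied from the sinfo output above; on a cluster you would
# capture the real output instead (for example with subprocess.run).
SAMPLE_SINFO = """\
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
dpart*    up    infinite      8 idle  openlab[00-07]
gpu       up    infinite      2 idle  openlab[08-09]
"""


def parse_sinfo(text):
    """Turn whitespace-separated sinfo output into a list of dicts keyed by column header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def expand_nodelist(nodelist):
    """Expand a single 'prefix[lo-hi]' range; any other form is returned unchanged."""
    m = re.fullmatch(r"(.+)\[(\d+)-(\d+)\]", nodelist)
    if not m:
        return [nodelist]
    prefix, lo, hi = m.groups()
    width = len(lo)  # preserve zero padding, e.g. 00..07
    return [f"{prefix}{i:0{width}d}" for i in range(int(lo), int(hi) + 1)]


rows = parse_sinfo(SAMPLE_SINFO)
# Collect the names of all nodes that sinfo reports as idle.
idle = [n for r in rows if r["STATE"] == "idle" for n in expand_nodelist(r["NODELIST"])]
print(idle)
```

Every value stays a string, so convert fields such as NODES with int() before doing arithmetic on them.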
scontrol
The scontrol command, while generally reserved for administrator use, can be used to view the status and configuration of the nodes in the cluster with scontrol show nodes. If passed specific node name(s), only information about those node(s) will be displayed; otherwise, all nodes will be listed. To specify multiple nodes, separate the node names with commas (no spaces).
<pre>
tgray26@opensub00:scontrol show nodes openlab00,openlab01
NodeName=openlab00 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.02
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=openlab00 NodeHostName=openlab00 Version=16.05
   OS=Linux RealMemory=7822 AllocMem=0 FreeMem=5842 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=49975 Weight=1 Owner=N/A MCS_label=N/A
   BootTime=2016-07-11T16:40:45 SlurmdStartTime=2016-07-11T23:47:24
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
NodeName=openlab01 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=openlab01 NodeHostName=openlab01 Version=16.05
   OS=Linux RealMemory=7822 AllocMem=0 FreeMem=5865 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=49975 Weight=1 Owner=N/A MCS_label=N/A
   BootTime=2016-07-11T16:40:59 SlurmdStartTime=2016-07-11T23:48:25
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
</pre>
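The scontrol show nodes output above is a sequence of Key=Value pairs, with each record starting at NodeName=. A rough Python sketch for reading it into a dictionary per node might look like the following. The function name parse_scontrol_nodes is hypothetical, the sample is an abridged copy of the output above, and the sketch assumes that no value contains a space. That holds for the fields shown here but is not guaranteed for every field or Slurm version.

```python
# Abridged copy of the scontrol output above, used so the sketch runs anywhere.
SAMPLE_SCONTROL = """\
NodeName=openlab00 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.02
   RealMemory=7822 AllocMem=0 FreeMem=5842 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=49975 Weight=1 Owner=N/A MCS_label=N/A
NodeName=openlab01 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.01
   RealMemory=7822 AllocMem=0 FreeMem=5865 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=49975 Weight=1 Owner=N/A MCS_label=N/A
"""


def parse_scontrol_nodes(text):
    """Return {node_name: {key: value}} from 'scontrol show nodes' style text.

    Assumes values contain no whitespace (true for the sample above).
    """
    nodes = {}
    current = None
    for token in text.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # ignore tokens that are not Key=Value pairs
        if key == "NodeName":
            current = nodes.setdefault(value, {})  # a new node record begins
        if current is not None:
            current[key] = value
    return nodes


nodes = parse_scontrol_nodes(SAMPLE_SCONTROL)
for name, info in nodes.items():
    free_cpus = int(info["CPUTot"]) - int(info["CPUAlloc"])
    print(name, info["State"], free_cpus, info["FreeMem"])

# Node names are joined with commas and no spaces, as described above.
node_arg = ",".join(sorted(nodes))
```

The node_arg string built on the last line is in the form that scontrol show nodes expects when several nodes are requested at once.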