Nexus/CLIP
Revision as of 20:41, 5 June 2023
The Nexus scheduler houses CLIP's new computational partition. Only CLIP lab members are able to run non-interruptible jobs on these nodes.
Submission Nodes
Two Nexus submission nodes are available exclusively to CLIP users:
nexusclip00.umiacs.umd.edu
nexusclip01.umiacs.umd.edu
Resources
The CLIP partition has nodes brought over from the previous standalone CLIP Slurm scheduler as well as some more recent purchases. The compute nodes are named clip##.
QoS
CLIP users have access to all of the standard QoSes in the clip partition using the clip account.
The additional QoSes specific to the CLIP partition are:
huge-long: Allows for longer jobs using higher overall resources.
Please note that the partition has a GrpTRES limit of 100% of the available cores/RAM on the partition-specific nodes plus 50% of the available cores/RAM on legacy## nodes, so your job may need to wait if all available cores/RAM (or GPUs) are in use.
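If a job sits in the queue, the scheduler's stated reason can help tell whether a limit such as the one above is involved. The following is a minimal sketch using standard squeue options; the column widths are arbitrary choices, and the exact reason strings you see depend on the site's Slurm configuration, so treat a reason that mentions a Grp limit as a hint rather than a definitive diagnosis.

```shell
#!/bin/sh
# List your jobs in the clip partition with their state and pending reason.
# Format fields: %i = job ID, %P = partition, %T = job state, %r = reason.
user="${USER:-$(id -un)}"
fmt='%.10i %.9P %.10T %r'

if command -v squeue >/dev/null 2>&1; then
    squeue --user="$user" --partition=clip --format="$fmt"
else
    # Not on a Slurm host: only show the command that would be run.
    echo "squeue --user=$user --partition=clip --format='$fmt'"
fi
```

Run it from one of the submission nodes; a job that is simply waiting for free cores, RAM, or GPUs stays in the PENDING state until resources are released.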
Jobs
You will need to specify --partition=clip, --account=clip, and a specific --qos to be able to submit jobs to the CLIP partition.
[username@nexusclip00:~ ] $ srun --pty --ntasks=4 --mem=8G --qos=default --partition=clip --account=clip --time 1-00:00:00 bash
srun: job 218874 queued and waiting for resources
srun: job 218874 has been allocated resources
[username@clip00:~ ] $ scontrol show job 218874
JobId=218874 JobName=bash
   UserId=username(1000) GroupId=username(21000) MCS_label=N/A
   Priority=897 Nice=0 Account=clip QOS=default
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:00:06 TimeLimit=1-00:00:00 TimeMin=N/A
   SubmitTime=2022-11-18T11:13:56 EligibleTime=2022-11-18T11:13:56
   AccrueTime=2022-11-18T11:13:56
   StartTime=2022-11-18T11:13:56 EndTime=2022-11-19T11:13:56 Deadline=N/A
   PreemptEligibleTime=2022-11-18T11:13:56 PreemptTime=None
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-11-18T11:13:56 Scheduler=Main
   Partition=clip AllocNode:Sid=nexuscbcb00:25443
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=clip00
   BatchHost=clip00
   NumNodes=1 NumCPUs=16 NumTasks=16 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=16,mem=2000G,node=1,billing=2266
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=2000G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=bash
   WorkDir=/nfshomes/username
   Power=
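For non-interactive work, the same three options can be placed in a batch script and submitted with sbatch. The following is a minimal sketch: the job name, resource request, output file name, and the script name clip_job.sh are hypothetical placeholders, and only --partition=clip, --account=clip, and the need for an explicit --qos come from this page.

```shell
#!/bin/bash
# Hypothetical batch script; submit with: sbatch clip_job.sh

# Job name (placeholder, choose your own).
#SBATCH --job-name=clip-example
# Required to run on the CLIP partition.
#SBATCH --partition=clip
#SBATCH --account=clip
# A QoS must be given explicitly; use one you have access to.
#SBATCH --qos=default
# Example resource request (placeholder values).
#SBATCH --ntasks=4
#SBATCH --mem=8G
#SBATCH --time=1-00:00:00
# %j expands to the job ID in the output file name.
#SBATCH --output=clip-example-%j.out

# Slurm sets these variables inside a job; the fallbacks keep the
# script runnable outside Slurm for a quick syntax check.
job_summary="job ${SLURM_JOB_ID:-none} on partition ${SLURM_JOB_PARTITION:-none}"
echo "$job_summary"
```

Replace the final lines with your actual workload. Options passed on the sbatch command line take precedence over the #SBATCH directives in the script, so the QoS can be changed per submission without editing the file.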
Storage
All data filesystems that were available in the standalone CLIP cluster are also available in Nexus.
CLIP users can also request Nexus project allocations.