Nexus/CLIP
The Nexus scheduler houses CLIP's new computational partition. Only CLIP lab members are able to run non-interruptible jobs on these nodes.
Submission Nodes
You can SSH to nexusclip.umiacs.umd.edu to log in to a submission host.
If you store something in a local directory (/tmp, /scratch0) on one of the two submission hosts, you will need to connect to that same submission host to access it later. The actual submission hosts are:
* nexusclip00.umiacs.umd.edu
* nexusclip01.umiacs.umd.edu
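For example, to connect from your own machine (replace username with your own username):

ssh username@nexusclip.umiacs.umd.edu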
Resources
The CLIP partition has nodes brought over from the previous standalone CLIP Slurm scheduler as well as some more recent purchases. The compute nodes are named clip##.
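To see which clip## nodes are currently up and what state they are in, you can query the scheduler with the standard Slurm sinfo command from a submission host (output will vary over time):

[username@nexusclip00:~ ] $ sinfo --partition=clip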
QoS
CLIP users have access to all of the standard job QoSes in the clip partition using the clip account.
The additional job QoSes for the CLIP partition specifically are:
* huge-long: Allows for longer jobs using higher overall resources.
Please note that the partition has a GrpTRES limit of 100% of the available cores/RAM on the partition-specific nodes in aggregate plus 50% of the available cores/RAM on legacy## nodes in aggregate, so your job may need to wait if all available cores/RAM (or GPUs) are in use.
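The exact limits attached to each QoS, including huge-long and any GrpTRES caps, can be inspected with Slurm's accounting tools; for example (the field selection here is only illustrative):

[username@nexusclip00:~ ] $ sacctmgr show qos huge-long format=Name,MaxWall,MaxTRESPU,GrpTRES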
Jobs
You will need to specify --partition=clip, --account=clip, and a specific --qos to be able to submit jobs to the CLIP partition. For example:
[username@nexusclip00:~ ] $ srun --pty --ntasks=4 --mem=8G --qos=default --partition=clip --account=clip --time 1-00:00:00 bash
srun: job 218874 queued and waiting for resources
srun: job 218874 has been allocated resources
[username@clip00:~ ] $ scontrol show job 218874
JobId=218874 JobName=bash
   UserId=username(1000) GroupId=username(21000) MCS_label=N/A
   Priority=897 Nice=0 Account=clip QOS=default
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:00:06 TimeLimit=1-00:00:00 TimeMin=N/A
   SubmitTime=2022-11-18T11:13:56 EligibleTime=2022-11-18T11:13:56 AccrueTime=2022-11-18T11:13:56
   StartTime=2022-11-18T11:13:56 EndTime=2022-11-19T11:13:56 Deadline=N/A
   PreemptEligibleTime=2022-11-18T11:13:56 PreemptTime=None
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-11-18T11:13:56 Scheduler=Main
   Partition=clip AllocNode:Sid=nexusclip00:25443
   ReqNodeList=(null) ExcNodeList=(null) NodeList=clip00
   BatchHost=clip00
   NumNodes=1 NumCPUs=4 NumTasks=4 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=4,mem=8G,node=1,billing=2266
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=8G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=bash
   WorkDir=/nfshomes/username
   Power=
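The same flags apply to batch jobs. Below is a minimal sketch of a submission script; the job name, resource requests, filename, and workload are illustrative placeholders, not recommended values:

#!/bin/bash
#SBATCH --job-name=clip-example   # illustrative name
#SBATCH --partition=clip          # required: CLIP partition
#SBATCH --account=clip            # required: CLIP account
#SBATCH --qos=default             # choose a QoS you have access to
#SBATCH --ntasks=4
#SBATCH --mem=8G
#SBATCH --time=1-00:00:00

hostname   # replace with your actual workload

Submit it from one of the submission hosts:

[username@nexusclip00:~ ] $ sbatch myjob.sh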
Storage
All data filesystems that were available in the standalone CLIP cluster are also available in Nexus.
CLIP users can also request Nexus project allocations.