ClassAccounts
==Overview==
UMIACS Class Accounts support classes for all of UMIACS/CSD via the [[Nexus]] cluster.  Faculty may request that a class be supported by following the instructions [[ClassAccounts/Manage | here]].


==Getting an account==
Your TA or instructor will request an account for you. Once this is done, you will be notified by email that you have an account to redeem.  If you have not received an email, please contact your TA or instructor. '''You must redeem the account within 7 days or else the redemption token will expire.'''  If your redemption token does expire, please contact your TA or instructor to have it renewed.


Once you do redeem your account, you will need to wait until you get a confirmation email that your account has been installed.  This is typically done once a day on days that the University is open for business.
'''Any questions or issues with your account, storage, or cluster use must first be directed to your TA or instructor.'''


===Registering for Duo===
UMIACS requires that all Class accounts register for MFA (multi-factor authentication) under our [[Duo]] instance (note that this is different from UMD's general Duo instance). '''You will not be able to log onto the class submission nodes until you register.'''


If you see the following error in your SSH client, you have not yet enrolled/registered in Duo.


<pre>
Access is not allowed because you are not enrolled in Duo. Please contact your organization's IT help desk.
</pre>


In order to register, [https://intranet.umiacs.umd.edu/directory visit our directory app] and log in with your Class username and password. You will then receive a prompt to enroll in Duo. For assistance in enrollment, please visit our [[Duo | Duo help page]].
 
Once notified that your account has been installed and you have registered in our Duo instance, you can [[SSH]] to <code>nexusclass.umiacs.umd.edu</code> with your assigned username and your chosen password to log in to a submission node.


If you store something in a local filesystem directory (/tmp, /scratch0) on one of the two submission nodes, you will need to connect to that same submission node to access it later. The actual submission nodes are:
* <code>nexusclass00.umiacs.umd.edu</code>
* <code>nexusclass01.umiacs.umd.edu</code>
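
For example, assuming the illustrative username <code>c999z000</code> used later on this page, logging in to a submission node looks like the following. Connecting to a specific node by name is useful if you need to get back to files you left in its local <code>/tmp</code> or <code>/scratch0</code>:

<pre>
# log in to a submission node
$ ssh c999z000@nexusclass.umiacs.umd.edu

# or connect to a specific submission node, e.g. to retrieve files from its /scratch0
$ ssh c999z000@nexusclass00.umiacs.umd.edu
</pre>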


==Cleaning up your account before the end of the semester==
Class accounts for a given semester may be archived and deleted after that semester's completion, as early as the following dates:
* Winter semesters: February 1st of same year
* Spring semesters: June 1st of same year
* Summer semesters: September 1st of same year
* Fall semesters: January 1st of next year

It is your responsibility to ensure you have backed up anything you want to keep from your class account's personal or group storage (see the sections below) prior to the relevant date.


==Personal Storage==
Your home directory has a quota of 30GB and is located at:
<pre>
/fs/classhomes/<semester><year>/<coursecode>/<username>
</pre>
where <code><semester></code> is either "spring", "summer", "fall", or "winter", <code><year></code> is the current year e.g., "2021", <code><coursecode></code> is the class' course code as listed in UMD's [https://app.testudo.umd.edu/soc/ Schedule of Classes] in all lowercase e.g., "cmsc999z", and <code><username></code> is the username mentioned in the email you received to redeem the account e.g., "c999z000".
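
For example, plugging the illustrative values above into this pattern (the spring semester of 2021, course cmsc999z, username c999z000), the home directory would be:

<pre>
/fs/classhomes/spring2021/cmsc999z/c999z000
</pre>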


You can request up to another 100GB of personal storage if you would like by '''having your TA or instructor [[HelpDesk | contact staff]]'''. This storage will be located at:
<pre>
/fs/class-projects/<semester><year>/<coursecode>/<username>
</pre>


==Group Storage==
You can also request group storage by '''having your TA or instructor [[HelpDesk | contact staff]]''' to specify the usernames of the accounts that should be in the group. Only other class accounts in the same class can be added to the group. The quota will be 100GB multiplied by the number of accounts in the group, and the storage will be located at:
<pre>
/fs/class-projects/<semester><year>/<coursecode>/<groupname>
</pre>

where <code><groupname></code> is composed of:
* the abbreviated course code as used in the username e.g., "c999z"
* the character "g"
* the number of the group (starting at 0 for the first group for the class requested to us) prepended with 0s to make the total group name 8 characters long
e.g., "c999zg00".
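
For example, using the same illustrative course as above and the first group requested for it (c999zg00), the group storage directory would be:

<pre>
/fs/class-projects/spring2021/cmsc999z/c999zg00
</pre>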


==Cluster Usage==
'''You may not run computational jobs on any submission node.'''  You must [[SLURM/JobSubmission | schedule your jobs with the SLURM workload manager]].  You can also find out more in the public documentation for the [https://slurm.schedmd.com/quickstart.html SLURM Workload Manager].
 
Class accounts only have access to the following submission parameters in SLURM:
* <code>--partition</code> - <code>class</code>
* <code>--account</code> - <code>class</code>
* <code>--qos</code> - <code>default</code>, <code>medium</code>, and <code>high</code>


You must specify at least the partition parameter manually in any submission command you run. If you do not specify any QoS parameter, you will receive the QoS <code>default</code>.


You can view the resource limits for each QoS by running the command <code>show_qos</code>. The MaxWall column shows the maximum runtime of a single job under each QoS, and the MaxTRES column shows the maximum amount of CPU cores/GPUs/memory you can request for a single job under each QoS.


Please note that you will be restricted to 32 total CPU cores, 4 total GPUs, and 256GB total RAM across all jobs you have running at once. These limits can be viewed with the command <code>show_partition_qos</code>.


===Example===
Here is a basic example of scheduling an interactive job running bash with a single GPU in the <code>class</code> partition, with the <code>class</code> account, the <code>default</code> QoS, and the partition's default CPU/memory allocation and time limit.


<pre>
$ hostname
nexusclass00.umiacs.umd.edu

$ srun --partition=class --account=class --qos=default --gres=gpu:1 --pty bash
srun: Job time limit was unset; set to partition default of 60 minutes
srun: job 1333337 queued and waiting for resources
srun: job 1333337 has been allocated resources

$ hostname
tron14.umiacs.umd.edu

$ nvidia-smi -L
GPU 0: NVIDIA RTX A4000 (UUID: GPU-55f2d3b7-9162-8b02-50de-476a012c626c)
</pre>
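
The same parameters also work in a batch script submitted with <code>sbatch</code>. The following is a minimal sketch, not a prescribed template; the job name, CPU/memory requests, and time limit are illustrative and must stay within the QoS and partition limits described above:

<pre>
#!/bin/bash
#SBATCH --job-name=class-example    # illustrative name
#SBATCH --partition=class
#SBATCH --account=class
#SBATCH --qos=default
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4           # illustrative; must fit within the QoS MaxTRES limits
#SBATCH --mem=8G                    # illustrative; must fit within the QoS MaxTRES limits
#SBATCH --time=00:30:00             # illustrative; must not exceed the QoS MaxWall

hostname
nvidia-smi -L
</pre>

Submit the script with <code>sbatch yourscript.sh</code>; by default, SLURM writes the job's output to a <code>slurm-<jobid>.out</code> file in the directory you submitted from.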


===Available Nodes===
You can list the available nodes and their current state with the <code>show_nodes -p class</code> command.  This list of nodes is not completely static as nodes may be pulled out of service to troubleshoot GPUs or other components.


<pre>
$ show_nodes -p class
NODELIST             CPUS       MEMORY     AVAIL_FEATURES                           GRES                             STATE
tron06               16         126214     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron07               16         126214     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron08               16         126214     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron09               16         126214     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron10               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron11               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron12               16         126218     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron13               16         126214     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron14               16         126214     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron15               16         126214     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron16               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron17               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron18               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron19               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron20               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron21               16         126218     rhel8,x86_64,Zen,EPYC-7302P,Ampere       gpu:rtxa4000:4                   idle
tron22               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron23               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron24               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron25               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron26               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron27               16         126214     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron28               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron29               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron30               16         126214     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron31               16         126214     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron32               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron33               16         126214     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron34               16         126217     rhel8,x86_64,Zen,EPYC-7313P,Ampere       gpu:rtxa4000:4                   idle
tron35               16         126214     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron36               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron37               16         126214     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron38               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron39               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron40               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron41               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron42               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron43               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron44               16         126218     rhel8,x86_64,Zen,EPYC-7302,Ampere        gpu:rtxa4000:4                   idle
tron46               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron47               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron48               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron49               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron50               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron51               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron52               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron53               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron54               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron55               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron56               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron57               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron58               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron59               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron60               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
tron61               48         255232     rhel8,x86_64,Zen,EPYC-7352,Ampere        gpu:rtxa5000:8                   idle
</pre>


You can also find more granular information about an individual node with the <code>scontrol show node</code> command.


<pre>
$ scontrol show node tron06
NodeName=tron06 Arch=x86_64 CoresPerSocket=16
   CPUAlloc=0 CPUEfctv=16 CPUTot=16 CPULoad=0.08
   AvailableFeatures=rhel8,x86_64,Zen,EPYC-7302P,Ampere
   ActiveFeatures=rhel8,x86_64,Zen,EPYC-7302P,Ampere
   Gres=gpu:rtxa4000:4
   NodeAddr=tron06 NodeHostName=tron06 Version=23.02.6
   OS=Linux 4.18.0-513.11.1.el8_9.x86_64 #1 SMP Thu Dec 7 03:06:13 EST 2023
   RealMemory=126214 AllocMem=0 FreeMem=107174 Sockets=1 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=340 Owner=N/A MCS_label=N/A
   Partitions=class,scavenger,tron
   BootTime=2024-01-29T09:35:12 SlurmdStartTime=2024-02-05T15:14:20
   LastBusyTime=2024-02-16T15:59:38 ResumeAfterTime=None
   CfgTRES=cpu=16,mem=126214M,billing=638,gres/gpu=4,gres/gpu:rtxa4000=4
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
</pre>
