Nexus/GPUs

From UMIACS

Latest revision as of 21:04, 6 November 2024

There are several different types of [https://www.nvidia.com/en-us/ NVIDIA] GPUs in the [[Nexus]] cluster that are available to be scheduled. They are listed below in order of newest to oldest architecture, and then alphabetically.

We do not list the exact quantities of GPUs here because they change frequently as GPUs are added to or removed from the cluster, or during compute node troubleshooting. To see which compute nodes have which GPUs and in what quantities, run the <code>show_nodes</code> command on a submission or compute node. The quantities are listed under the <tt>GRES</tt> column.
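
As a rough illustration of how the GRES strings in the table below are typically used, the following sketch builds a typed GPU request in the standard SLURM <code>--gres=gpu:&lt;type&gt;:&lt;count&gt;</code> form. The helper name <code>gres_option</code> is hypothetical, and the sketch assumes the cluster accepts this standard form; any partition, QoS, or account options a job may also need are not covered here.

```python
import shlex


def gres_option(gpu_type: str, count: int = 1) -> str:
    """Build a SLURM --gres option for `count` GPUs of the given GRES string.

    Hypothetical helper for illustration; `gpu_type` is a GRES string from
    the table on this page, e.g. "rtxa5000".
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return f"--gres=gpu:{gpu_type}:{count}"


# Example: an srun command line asking for one RTX A6000 and running nvidia-smi.
# Other options a real job may need (partition, QoS, account) are omitted.
command = ["srun", gres_option("rtxa6000"), "nvidia-smi"]
print(shlex.join(command))
```

Use <code>show_nodes</code> first to confirm that the requested GPU type is currently present on some node.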

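{| class="wikitable sortable"
! Name
! GRES string ([[SLURM]])
! [https://www.nvidia.com/en-us/technologies Architecture]
! [https://developer.nvidia.com/cuda-toolkit CUDA] Cores
! GPU Memory
! Memory Bandwidth
! FP32 Performance (TFLOPS)
! [https://developer.nvidia.com/blog/accelerating-ai-training-with-tf32-tensor-cores/ TF32] Performance ([https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt Dense / Sparse TOPS])
|-
| H100 NVLink [0] || <code>h100-nvl</code> || Hopper || 33792 || 188GB HBM3 || 7.87TB/s || 134 || not officially published/1671
|-
| H100 SXM || <code>h100-sxm</code> || Hopper || 16896 || 80GB HBM3 || 3.35TB/s || 67 || not officially published/989
|-
| L40S || <code>l40s</code> || Ada Lovelace || 18176 || 48GB GDDR6 || 864GB/s || 91.6 || 183/366
|-
| RTX 6000 Ada Generation || <code>rtx6000ada</code> || Ada Lovelace || 18176 || 48GB GDDR6 || 960GB/s || 91.1 || 182.1/364.2
|-
| A100 PCIe 80GB || <code>a100</code> || Ampere || 6912 || 80GB HBM2e || 1.94TB/s || 19.5 || 156/312
|-
| A100 SXM 80GB || <code>a100</code> || Ampere || 6912 || 80GB HBM2e || 2.04TB/s || 19.5 || 156/312
|-
| GeForce RTX 3070 || <code>rtx3070</code> || Ampere || 5888 || 8GB GDDR6 || 448GB/s || 20.3 || 20.3/40.6
|-
| GeForce RTX 3090 || <code>rtx3090</code> || Ampere || 10496 || 24GB GDDR6X || 936GB/s || 35.6 || 35.6/71
|-
| RTX A4000 || <code>rtxa4000</code> || Ampere || 6144 || 16GB GDDR6 || 448GB/s || 19.2 || not officially published
|-
| RTX A5000 || <code>rtxa5000</code> || Ampere || 8192 || 24GB GDDR6 || 768GB/s || 27.8 || not officially published
|-
| RTX A6000 || <code>rtxa6000</code> || Ampere || 10752 || 48GB GDDR6 || 768GB/s || 38.7 || 77.4/154.8
|-
| GeForce RTX 2080 Ti || <code>rtx2080ti</code> || Turing || 4352 || 11GB GDDR6 || 616GB/s || 13.4 || n/a
|-
| GeForce GTX 1080 Ti || <code>gtx1080ti</code> || Pascal || 3584 || 11GB GDDR5X || 484GB/s || 11.3 || n/a
|-
| Quadro P6000 || <code>p6000</code> || Pascal || 3840 || 24GB GDDR5X || 432GB/s || 12.6 || n/a
|-
| Tesla P100 || <code>p100</code> || Pascal || 3584 || 16GB CoWoS HBM2 || 732GB/s || 9.3 || n/a
|-
| TITAN X (Pascal) || <code>titanxpascal</code> || Pascal || 3584 || 12GB GDDR5X || 480GB/s || 11.0 || n/a
|-
| TITAN Xp || <code>titanxp</code> || Pascal || 3840 || 12GB GDDR5X || 548GB/s || 12.1 || n/a
|-
| GeForce GTX TITAN X || <code>gtxtitanx</code> || Maxwell || 3072 || 12GB GDDR5 || 336GB/s || 6.7 || n/a
|}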
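
When a job needs a minimum amount of GPU memory, the table above can be used to shortlist suitable GRES strings. The sketch below hard-codes the GPU Memory column as listed on this page (in GB, in table order); the names <code>GPU_MEMORY_GB</code> and <code>gres_with_at_least</code> are hypothetical, and the data is a snapshot that would need updating whenever the table changes.

```python
# GPU memory in GB per GRES string, copied from the table on this page.
# Both A100 variants (PCIe and SXM) share the single GRES string "a100".
GPU_MEMORY_GB = {
    "h100-nvl": 188,
    "h100-sxm": 80,
    "l40s": 48,
    "rtx6000ada": 48,
    "a100": 80,
    "rtx3070": 8,
    "rtx3090": 24,
    "rtxa4000": 16,
    "rtxa5000": 24,
    "rtxa6000": 48,
    "rtx2080ti": 11,
    "gtx1080ti": 11,
    "p6000": 24,
    "p100": 16,
    "titanxpascal": 12,
    "titanxp": 12,
    "gtxtitanx": 12,
}


def gres_with_at_least(min_gb: int) -> list[str]:
    """Return GRES strings whose listed memory is at least `min_gb`, in table order."""
    return [gres for gres, gb in GPU_MEMORY_GB.items() if gb >= min_gb]


print(gres_with_at_least(48))
```

This only narrows the choice by memory size; whether a given type is actually free is still something to check with <code>show_nodes</code>.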

[0] - This GPU is actually a pair of physical cards connected over [https://www.nvidia.com/en-us/data-center/nvlink NVLink] bridges. NVIDIA's published specifications for this GPU type are for one physical card, so the values listed here are double NVIDIA's advertised values.
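
To make the doubling in footnote [0] concrete, the sketch below halves the H100 NVLink row to recover the per-card figures implied by the table. The values are taken only from this page and have not been checked against NVIDIA's datasheet here; the dictionary keys are hypothetical labels for the table columns.

```python
# H100 NVLink row as listed in the table (values for the two-card pair).
h100_nvl_pair = {
    "cuda_cores": 33792,
    "memory_gb": 188,
    "bandwidth_tb_s": 7.87,
    "fp32_tflops": 134,
    "tf32_sparse": 1671,
}

# Footnote [0] says the listed values are double the single-card values,
# so halving each entry gives the implied per-card figure.
per_card = {name: value / 2 for name, value in h100_nvl_pair.items()}

for name, value in per_card.items():
    print(f"{name}: {value:g}")
```

By this arithmetic each physical card contributes 94GB of memory and 16896 CUDA cores to the pair.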