Nexus/GPUs
Latest revision as of 21:04, 6 November 2024
There are several different types of NVIDIA GPUs in the Nexus cluster that are available to be scheduled. They are listed below in order of newest to oldest architecture, and then alphabetically.
We do not list the exact quantities of GPUs here, since they change frequently as nodes are added to or removed from the cluster or taken offline for troubleshooting. To see which compute nodes have which GPUs and in what quantities, run the show_nodes command on a submission or compute node. The quantities are listed under the GRES column.
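The GRES strings in the table below are the GPU type names that SLURM's generic resource option expects. The following is a minimal sketch in Python, assuming the standard SLURM `--gres=gpu:<type>:<count>` syntax. The helper name `gres_flag` is hypothetical, and Nexus may require additional options (such as a partition, account, or QoS) that are not shown here.

```python
def gres_flag(gres_string: str, count: int = 1) -> str:
    """Build a SLURM --gres option for `count` GPUs of one type.

    `gres_string` is a value from the "GRES string (SLURM)" column,
    for example "rtxa5000". This helper is illustrative only and is
    not part of SLURM or of the Nexus tooling.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # GRES type names contain no whitespace or colons; reject obvious typos.
    if not gres_string or any(ch.isspace() or ch == ":" for ch in gres_string):
        raise ValueError(f"invalid GRES string: {gres_string!r}")
    return f"--gres=gpu:{gres_string}:{count}"


if __name__ == "__main__":
    # Assemble an example command line that asks for two RTX A5000 GPUs.
    print("srun", gres_flag("rtxa5000", 2), "nvidia-smi")
```

The resulting option can be passed to `srun` or `sbatch`. Whether a request is actually schedulable depends on the node inventory reported by show_nodes.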
Name | GRES string (SLURM) | Architecture | CUDA Cores | GPU Memory | Memory Bandwidth | FP32 Performance (TFLOPS) | TF32 Performance (Dense / Sparse TFLOPS) |
---|---|---|---|---|---|---|---|
H100 NVLink [0] | h100-nvl | Hopper | 33792 | 188GB HBM3 | 7.87TB/s | 134 | not officially published/1671 |
H100 SXM | h100-sxm | Hopper | 16896 | 80GB HBM3 | 3.35TB/s | 67 | not officially published/989 |
L40S | l40s | Ada Lovelace | 18176 | 48GB GDDR6 | 864GB/s | 91.6 | 183/366 |
RTX 6000 Ada Generation | rtx6000ada | Ada Lovelace | 18176 | 48GB GDDR6 | 960GB/s | 91.1 | 182.1/364.2 |
A100 PCIe 80GB | a100 | Ampere | 6912 | 80GB HBM2e | 1.94TB/s | 19.5 | 156/312 |
A100 SXM 80GB | a100 | Ampere | 6912 | 80GB HBM2e | 2.04TB/s | 19.5 | 156/312 |
GeForce RTX 3070 | rtx3070 | Ampere | 5888 | 8GB GDDR6 | 448GB/s | 20.3 | 20.3/40.6 |
GeForce RTX 3090 | rtx3090 | Ampere | 10496 | 24GB GDDR6X | 936GB/s | 35.6 | 35.6/71 |
RTX A4000 | rtxa4000 | Ampere | 6144 | 16GB GDDR6 | 448GB/s | 19.2 | not officially published |
RTX A5000 | rtxa5000 | Ampere | 8192 | 24GB GDDR6 | 768GB/s | 27.8 | not officially published |
RTX A6000 | rtxa6000 | Ampere | 10752 | 48GB GDDR6 | 768GB/s | 38.7 | 77.4/154.8 |
GeForce RTX 2080 Ti | rtx2080ti | Turing | 4352 | 11GB GDDR6 | 616GB/s | 13.4 | n/a |
GeForce GTX 1080 Ti | gtx1080ti | Pascal | 3584 | 11GB GDDR5X | 484GB/s | 11.3 | n/a |
Quadro P6000 | p6000 | Pascal | 3840 | 24GB GDDR5X | 432GB/s | 12.6 | n/a |
Tesla P100 | p100 | Pascal | 3584 | 16GB CoWoS HBM2 | 732GB/s | 9.3 | n/a |
TITAN X (Pascal) | titanxpascal | Pascal | 3584 | 12GB GDDR5X | 480GB/s | 11.0 | n/a |
TITAN Xp | titanxp | Pascal | 3840 | 12GB GDDR5X | 548GB/s | 12.1 | n/a |
GeForce GTX TITAN X | gtxtitanx | Maxwell | 3072 | 12GB GDDR5 | 336GB/s | 6.7 | n/a |
[0] - This GPU is a pair of physical cards connected over NVLink bridges. NVIDIA's published specifications for this GPU type are for one physical card, so the values listed here are double NVIDIA's advertised values.
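When choosing a GPU type for a job, a common first question is which types have enough memory. The following is a rough sketch in Python that filters a hand-copied subset of the table above by minimum memory and returns the matching GRES strings. The list `GPUS` and the function `gres_with_memory` are hypothetical names introduced for illustration; the values come from the table, and availability on the cluster still has to be checked with show_nodes.

```python
# (name, GRES string, GPU memory in GB), copied from a subset of the table.
# The H100 NVLink entry uses the doubled, two-card figure from footnote [0].
GPUS = [
    ("H100 NVLink", "h100-nvl", 188),
    ("H100 SXM", "h100-sxm", 80),
    ("L40S", "l40s", 48),
    ("RTX A6000", "rtxa6000", 48),
    ("RTX A5000", "rtxa5000", 24),
    ("GeForce RTX 3090", "rtx3090", 24),
    ("RTX A4000", "rtxa4000", 16),
    ("GeForce RTX 2080 Ti", "rtx2080ti", 11),
]


def gres_with_memory(min_gb: float) -> list[str]:
    """Return GRES strings of listed GPU types with at least `min_gb` GB of memory."""
    return [gres for _name, gres, mem_gb in GPUS if mem_gb >= min_gb]


if __name__ == "__main__":
    # Candidate GPU types for a job that needs at least 40 GB on one GPU.
    print(gres_with_memory(40))
```

The order of the result follows the order of `GPUS`, which mirrors the table's newest-to-oldest ordering.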