Nexus/GPUs: Difference between revisions
Revision as of 22:04, 6 December 2023
There are several different types of NVIDIA GPUs in the Nexus cluster that are available to be scheduled. They are listed below in order of newest to oldest architecture, and then alphabetically.
We do not list the exact quantities of GPUs here, since they change frequently as nodes are added to or removed from the cluster or taken offline for troubleshooting. To see which compute nodes have which GPUs and in what quantities, run the show_nodes command on a submission or compute node. The quantities are listed under the GRES column.
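If you need per-type GPU counts in a script, the GRES values can be parsed once you have copied them out of the show_nodes output. The sketch below is only an illustration: it assumes each GRES value follows the common Slurm convention `gpu:<type>:<count>`, optionally followed by a parenthesized annotation, and the type names in the usage line (`rtxa6000`, `a100`) are hypothetical placeholders rather than the labels the cluster actually uses.

```python
import re


def parse_gres(gres: str) -> dict[str, int]:
    """Return {gpu_type: count} from a GRES string.

    Assumption: entries look like "gpu:<type>:<count>", separated by commas,
    with an optional parenthesized suffix such as "(S:0-1)".
    """
    counts: dict[str, int] = {}
    # Remove parenthesized annotations first so their contents cannot
    # interfere with splitting on "," and ":".
    cleaned = re.sub(r"\([^)]*\)", "", gres)
    for entry in cleaned.split(","):
        parts = entry.strip().split(":")
        # Skip anything that is not a typed GPU entry with a numeric count.
        if len(parts) != 3 or parts[0] != "gpu" or not parts[2].isdigit():
            continue
        _, gpu_type, count = parts
        counts[gpu_type] = counts.get(gpu_type, 0) + int(count)
    return counts


if __name__ == "__main__":
    # Hypothetical GRES value, used only to exercise the parser.
    print(parse_gres("gpu:rtxa6000:4(S:0-1),gpu:a100:2"))
```

Entries that do not match the assumed shape are skipped rather than raising an error, so check the result against the raw show_nodes output before relying on it.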
Name | Architecture | CUDA Cores | GPU Memory | Memory Bandwidth | FP32 Performance (TFLOPS) | TF32 Performance (Dense / Sparse TOPS) |
---|---|---|---|---|---|---|
A100 | Ampere | 6912 | 80GB HBM2e | 1935GB/s | 19.5 | 156/312 |
GeForce RTX 3070 | Ampere | 5888 | 8GB GDDR6 | 448GB/s | 20.3 | 20.3/40.6 |
GeForce RTX 3090 | Ampere | 10496 | 24GB GDDR6X | 936GB/s | 35.6 | 35.6/71 |
RTX A4000 | Ampere | 6144 | 16GB GDDR6 | 448GB/s | 19.2 | not officially published |
RTX A5000 | Ampere | 8192 | 24GB GDDR6 | 768GB/s | 27.8 | not officially published |
RTX A6000 | Ampere | 10752 | 48GB GDDR6 | 768GB/s | 38.7 | 77.4/154.8 |
GeForce RTX 2080 Ti | Turing | 4352 | 11GB GDDR6 | 616GB/s | 13.4 | n/a |
GeForce GTX 1080 Ti | Pascal | 3584 | 11GB GDDR5X | 484GB/s | 11.3 | n/a |
GeForce GTX Titan Xp | Pascal | 3840 | 12GB GDDR5X | 548GB/s | 12.1 | n/a |
Quadro P6000 | Pascal | 3840 | 24GB GDDR5X | 432GB/s | 12.6 | n/a |
Tesla P100 | Pascal | 3584 | 16GB CoWoS HBM2 | 732GB/s | 9.3 | n/a |
GeForce GTX Titan X | Maxwell | 3072 | 12GB GDDR5 | 336GB/s | 6.7 | n/a |
Tesla M40 | Maxwell | 3072 | 12GB GDDR5 | 288GB/s | 6.8 | n/a |
Tesla K80 | Kepler | 4992 | 24GB GDDR5 | 480GB/s | 8.7 | n/a |
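
As a rough aid for reading the TF32 column, the following sketch splits the "dense/sparse" cell into two numbers and compares them with the FP32 figure. The rows are copied from the table above for the GPUs that have published TF32 values; the names `ROWS` and `tf32_summary` are made up for this illustration and are not part of any cluster tooling.

```python
# (name, FP32 TFLOPS, TF32 "dense/sparse" cell) copied from the table above.
ROWS = [
    ("A100", 19.5, "156/312"),
    ("GeForce RTX 3070", 20.3, "20.3/40.6"),
    ("GeForce RTX 3090", 35.6, "35.6/71"),
    ("RTX A6000", 38.7, "77.4/154.8"),
]


def tf32_summary(fp32: float, tf32_cell: str) -> dict[str, float]:
    """Split a "dense/sparse" cell and relate it to the FP32 figure."""
    dense, sparse = (float(part) for part in tf32_cell.split("/"))
    return {
        "dense": dense,
        "sparse": sparse,
        # How much structured sparsity adds over the dense figure.
        "sparse_over_dense": sparse / dense,
        # How much TF32 tensor cores add over plain FP32.
        "dense_over_fp32": dense / fp32,
    }


if __name__ == "__main__":
    for name, fp32, cell in ROWS:
        summary = tf32_summary(fp32, cell)
        print(
            f"{name}: sparse/dense = {summary['sparse_over_dense']:.2f}, "
            f"dense TF32 / FP32 = {summary['dense_over_fp32']:.2f}"
        )
```

For every listed GPU the sparse figure is about twice the dense one, while the dense TF32 gain over FP32 varies widely between models, so the two ratios are worth checking separately when choosing a GPU for a job.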