GPU Partitions
Nodes in these partitions provide GPUs for parallelizing calculations. See GPU Usage for more details on how to use GPU partitions, particularly those where GPUs are split into MiG slices.
Partitions
The partitions are listed in the table below without hardware details.
| Cluster | Partition | OS | Shared | Max. walltime | Max. nodes per job | Core-hours per node |
|---|---|---|---|---|---|---|
| NHR | `grete` | Rocky 8 | | 48 hr | 16 | 600 |
| | `grete:shared` | Rocky 8 | yes | 48 hr | 1 | 150 per GPU |
| | `grete:interactive` | Rocky 8 | yes | 48 hr | 1 | 150/47 per GPU/slice |
| | `grete:preemptible` | Rocky 8 | yes | 48 hr | 1 | 150/47 per GPU/slice |
| | `grete-h100` | Rocky 8 | | 48 hr | 16 | 1050 |
| | `grete-h100:shared` | Rocky 8 | yes | 48 hr | 16 | 262.5 |
| | `kisski` | Rocky 8 | | 48 hr | 16 | 600 |
| | `kisski-h100` | Rocky 8 | | 48 hr | 16 | 1050 |
| | `react` | Rocky 8 | | 48 hr | 16 | 600 |
| | `scc-a100` | Rocky 8 | | 48 hr | 16 | 600 |
| | `jupyter:gpu` (jupyter) | Rocky 8 | yes | 24 hr | 1 | 150/47 per GPU/slice |
| SCC | `gpu` | Rocky 8 | yes | 48 hr | max | 24 per GPU |
| | `gpu-int` (jupyter) | Rocky 8 | yes | 48 hr | max | 5 per GPU |
| | `vis` | Rocky 8 | yes | 48 hr | max | 5 per GPU |
The partitions you are allowed to use may be restricted by the kind of account you have. For example, only KISSKI users can use the `kisski` and `kisski-h100` partitions.
JupyterHub sessions run on the partitions marked with jupyter in the table above.
Three GPU nodes with 4 × Nvidia A100 80 GB cards each are not directly accessible from the SCC for technical reasons; instead, they are temporarily attached to the NHR cluster as the `scc-a100` partition. SCC users who want to use these nodes should request a project in our HPC Project Portal by emailing hpc-support@gwdg.de. The nodes are accessed via the NHR login nodes and use the NHR software stacks as well as our new VAST filesystem.
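As an illustration, a minimal batch script for a job on `grete:shared` could look like the sketch below; the GPU count, walltime, and program name are placeholders to adapt to your own job.

```bash
#!/bin/bash
#SBATCH --partition=grete:shared   # shared A100 partition from the table above
#SBATCH --gpus=A100:1              # one A100 GPU (see the GPU table further down)
#SBATCH --nodes=1
#SBATCH --time=02:00:00            # must stay within the 48 hr walltime limit

# Placeholder payload: load your own modules and start your own program here.
srun ./my_gpu_program
```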
The NHR nodes are grouped into subclusters with different hardware. The partitions for each NHR subcluster are listed in the table below, where the parts of partition names in square brackets are optional (e.g. `grete-h100[:*]` includes `grete-h100` and `grete-h100:shared`).
| NHR Subcluster | GPUs | Partitions |
|---|---|---|
| Grete Phase 1 | Nvidia V100 | `jupyter:gpu` |
| Grete Phase 2 | Nvidia A100 | `grete[:*]`, `kisski`, `react`, `scc-a100` |
| Grete Phase 3 | Nvidia H100 | `grete-h100[:*]`, `kisski-h100` |
The hardware for the different nodes in each partition is listed in the table below. Note that some partitions are heterogeneous, containing nodes with different hardware, and many nodes are in more than one partition.
| Partition | Nodes | GPU + slices | VRAM each | CPU | RAM per node | Cores |
|---|---|---|---|---|---|---|
| `grete` | 27 | 4 × Nvidia A100 | 40 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 25 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| `grete:shared` | 33 | 4 × Nvidia A100 | 40 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 22 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 5 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 1 013 000 MB | 64 |
| | 2 | 8 × Nvidia A100 | 80 GB | 2 × Zen2 EPYC 7662 | 1 013 620 MB | 128 |
| `grete:interactive` | 3 | 4 × Nvidia A100 (2g.10gb and 3g.20gb) | 10/20 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| `grete:preemptible` | 3 | 4 × Nvidia A100 (2g.10gb and 3g.20gb) | 10/20 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| `grete-h100` | 5 | 4 × Nvidia H100 | 94 GB | 2 × Xeon Platinum 8468 | 1 029 200 MB | 96 |
| `grete-h100:shared` | 5 | 4 × Nvidia H100 | 94 GB | 2 × Xeon Platinum 8468 | 1 029 200 MB | 96 |
| `kisski` | 59 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 4 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 1 013 000 MB | 64 |
| `kisski-h100` | 11 | 4 × Nvidia H100 | 94 GB | 2 × Xeon Platinum 8468 | 1 029 200 MB | 96 |
| `react` | 22 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| `scc-a100` | 2 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 1 013 000 MB | 64 |
| | 1 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| `jupyter:gpu` | 3 | 4 × Nvidia V100 | 32 GB | 2 × Skylake 6148 | 746 000 MB | 40 |
| `gpu` | 2 | 8 × Nvidia V100 | 32 GB | 2 × Cascadelake 6252 | 362 000 MB | 48 |
| | 13 | 4 × Nvidia RTX5000 | 16 GB | 2 × Cascadelake 6242 | 172 000 MB | 32 |
| | 3 | 4 × Nvidia GTX1080 | 8 GB | 2 × Broadwell E5-2650v4 | 128 000 MB | 24 |
| `gpu-int` | 2 | 4 × Nvidia GTX980 | 4 GB | 2 × Broadwell E5-2650v4 | 128 000 MB | 24 |
| `vis` | 3 | 4 × Nvidia GTX980 | 4 GB | 2 × Broadwell E5-2650v4 | 128 000 MB | 24 |
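Because several partitions mix node types, it can be useful to check what Slurm itself reports for a partition. One way to do this from a login node is, for example:

```bash
# Show each node in grete:shared with its core count, memory (MB), and GPU GRES,
# which makes the hardware variants from the table above visible.
sinfo --partition=grete:shared --Node --Format=nodehost,cpus,memory,gres:40
```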
The CPUs and GPUs
For partitions that have heterogeneous hardware, you can pass Slurm options to request the particular hardware you want. For CPUs, specify the desired CPU type with the `-C`/`--constraint` option. For GPUs, specify the name of the GPU with the `-G`/`--gpus` option (or `--gpus-per-task`) and request larger VRAM with a `-C`/`--constraint` option. See Slurm and GPU Usage for more information.
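For example, to land on the 80 GB A100 nodes of the heterogeneous `grete:shared` partition, the GPU name and the VRAM constraint from the tables below can be combined roughly as follows (an interactive sketch; adjust counts and times to your needs):

```bash
# Interactive shell with two 80 GB A100s: the GPU name goes to -G/--gpus,
# the VRAM size to -C/--constraint.
srun --partition=grete:shared --gpus=A100:2 --constraint=80gb \
     --time=01:00:00 --pty /bin/bash
```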
The GPUs, the options to request them, and some of their properties are given in the table below.
| GPU | VRAM | FP32 cores | Tensor cores | -G option | -C option | Compute Cap. |
|---|---|---|---|---|---|---|
| Nvidia A100 | 40 GB | 6912 | 432 | A100 | | 80 |
| | 80 GB | 6912 | 432 | A100 | 80gb | 80 |
| Nvidia H100 | 94 GB | 8448 | 528 | H100 | 96gb | 90 |
| 2g.10gb slice of Nvidia A100 | 10 GB | 1728 | 108 | 2g.10gb | | 80 |
| 3g.20gb slice of Nvidia A100 | 20 GB | 2592 | 162 | 3g.20gb | | 80 |
| Nvidia V100 | 32 GB | 5120 | 640 | V100 | | 70 |
| Nvidia Quadro RTX 5000 | 16 GB | 3072 | 384 | RTX5000 | | 75 |
| Nvidia GeForce GTX 1080 | 8 GB | 2560 | | GTX1080 | | 61 |
| Nvidia GeForce GTX 980 | 4 GB | 2048 | | GTX980 | | 52 |
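For instance, a single MiG slice on the interactive partition can be requested with the slice name from the -G column above (a sketch; only the slice name and count are taken from the table, the walltime is a placeholder):

```bash
# One 3g.20gb slice of an A100 on the interactive partition.
srun --partition=grete:interactive --gpus=3g.20gb:1 --time=00:30:00 --pty /bin/bash
```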
The CPUs, the options to request them, and some of their properties are given in the table below.
| CPU | Cores | -C option | Architecture |
|---|---|---|---|
| AMD Zen3 EPYC 7513 | 32 | zen3 or milan | zen3 |
| AMD Zen2 EPYC 7662 | 64 | zen2 or rome | zen2 |
| Intel Sapphire Rapids Xeon Platinum 8468 | 48 | sapphirerapids | sapphirerapids |
| Intel Cascadelake Xeon Gold 6252 | 24 | cascadelake | cascadelake |
| Intel Cascadelake Xeon Gold 6242 | 16 | cascadelake | cascadelake |
| Intel Skylake Xeon Gold 6148 | 20 | skylake | skylake_avx512 |
| Intel Broadwell Xeon E5-2650 V4 | 12 | broadwell | broadwell |
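As an example, the -C names above can be used to pin a job in a mixed partition to one CPU generation (a sketch using the `zen3` feature from the table):

```bash
# Request an A100 node with Zen3 EPYC CPUs in grete:shared rather than a Zen2 node.
srun --partition=grete:shared --gpus=A100:1 --constraint=zen3 --time=01:00:00 --pty /bin/bash
```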
Hardware Totals
The total nodes, cores, GPUs, RAM, and VRAM for each cluster and sub-cluster are given in the table below.
| Cluster | Sub-cluster | Nodes | GPUs | VRAM (TiB) | Cores | RAM (TiB) |
|---|---|---|---|---|---|---|
| NHR | Grete Phase 1 | 3 | 12 | 0.375 | 120 | 2.1 |
| | Grete Phase 2 | 103 | 420 | 27.1 | 6,720 | 47.6 |
| | Grete Phase 3 | 16 | 64 | 6.0 | 1,536 | 15.7 |
| | TOTAL | 122 | 496 | 33.5 | 8,376 | 65.4 |
| SCC | TOTAL | 21 | 92 | 1.4 | 656 | 3.5 |