GPU Partitions

Nodes in these partitions provide GPUs for parallelizing calculations. See GPU Usage for more details on how to use the GPU partitions, particularly those where GPUs are split into MiG slices.

Partitions

The partitions are listed in the table below without hardware details.

| Cluster | Partition | OS | Shared | Max. walltime | Max. nodes per job | Core-hours per node |
|---------|-----------|----|--------|---------------|--------------------|---------------------|
| NHR | grete | Rocky 8 | | 48 hr | 16 | 600 |
| | grete:shared | Rocky 8 | yes | 48 hr | 1 | 150 per GPU |
| | grete:interactive | Rocky 8 | yes | 48 hr | 1 | 150/47 per GPU/slice |
| | grete:preemptible | Rocky 8 | yes | 48 hr | 1 | 150/47 per GPU/slice |
| | grete-h100 | Rocky 8 | | 48 hr | 16 | 1050 |
| | grete-h100:shared | Rocky 8 | yes | 48 hr | 16 | 262.5 |
| | kisski | Rocky 8 | | 48 hr | 16 | 600 |
| | kisski-h100 | Rocky 8 | | 48 hr | 16 | 1050 |
| | react | Rocky 8 | | 48 hr | 16 | 600 |
| | scc-a100 | Rocky 8 | | 48 hr | 16 | 600 |
| | jupyter:gpu (jupyter) | Rocky 8 | yes | 24 hr | 1 | 150/47 per GPU/slice |
| SCC | gpu | Rocky 8 | yes | 48 hr | max | 24 per GPU |
| | gpu-int (jupyter) | Rocky 8 | yes | 48 hr | max | 5 per GPU |
| | vis | Rocky 8 | yes | 48 hr | max | 5 per GPU |
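
To make the table concrete, the sketch below shows roughly what a batch script targeting the shared A100 partition could look like. The account name and resource sizes are placeholders; see GPU Usage and Slurm for the authoritative examples.

```bash
#!/bin/bash
#SBATCH --partition=grete:shared   # shared partition, so a single GPU can be requested
#SBATCH --gpus=A100:1              # one A100 (GPU names are listed further below)
#SBATCH --cpus-per-task=16         # CPU cores for this task
#SBATCH --time=12:00:00            # must stay within the 48 hr walltime limit
#SBATCH --account=your_project     # placeholder: replace with your project account

# Print the assigned GPU before starting the real work
nvidia-smi
```
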
Warning

The partitions you are allowed to use may be restricted by your account type. For example, only KISSKI users can use the kisski and kisski-h100 partitions.

Info

JupyterHub sessions run on the partitions marked with jupyter in the table above.

Info

Three GPU nodes featuring 4 × A100 80 GB cards each are not directly accessible from the SCC for technical reasons and are instead temporarily connected to the NHR cluster (scc-a100 partition). SCC users who want to use these nodes should request a project in our HPC Project Portal by emailing hpc-support@gwdg.de. The nodes are reached via the NHR login nodes and use the NHR software stacks as well as our new VAST filesystem.

The NHR nodes are grouped into subclusters of different hardware. The partitions for each NHR subcluster are listed in the table below, where the parts of partition names in square brackets are optional (e.g. grete-h100[:*] includes both grete-h100 and grete-h100:shared).

| NHR Subcluster | GPUs | Partitions |
|----------------|------|------------|
| Grete Phase 1 | Nvidia V100 | jupyter:gpu |
| Grete Phase 2 | Nvidia A100 | grete[:*], kisski, react, scc-a100 |
| Grete Phase 3 | Nvidia H100 | grete-h100[:*], kisski-h100 |

The hardware for the different nodes in each partition is listed in the table below. Note that some partitions are heterogeneous, containing nodes with different hardware, and many nodes belong to more than one partition.

| Partition | Nodes | GPU + slices | VRAM each | CPU | RAM per node | Cores |
|-----------|-------|--------------|-----------|-----|--------------|-------|
| grete | 27 | 4 × Nvidia A100 | 40 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 25 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| grete:shared | 33 | 4 × Nvidia A100 | 40 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 22 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 5 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 1 013 000 MB | 64 |
| | 2 | 8 × Nvidia A100 | 80 GB | 2 × Zen2 EPYC 7662 | 1 013 620 MB | 128 |
| grete:interactive | 3 | 4 × Nvidia A100 (2g.10gb and 3g.20gb slices) | 10/20 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| grete:preemptible | 3 | 4 × Nvidia A100 (2g.10gb and 3g.20gb slices) | 10/20 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| grete-h100 | 5 | 4 × Nvidia H100 | 94 GB | 2 × Xeon Platinum 8468 | 1 029 200 MB | 96 |
| grete-h100:shared | 5 | 4 × Nvidia H100 | 94 GB | 2 × Xeon Platinum 8468 | 1 029 200 MB | 96 |
| kisski | 59 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| | 4 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 1 013 000 MB | 64 |
| kisski-h100 | 11 | 4 × Nvidia H100 | 94 GB | 2 × Xeon Platinum 8468 | 1 029 200 MB | 96 |
| react | 22 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| scc-a100 | 2 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 1 013 000 MB | 64 |
| | 1 | 4 × Nvidia A100 | 80 GB | 2 × Zen3 EPYC 7513 | 497 810 MB | 64 |
| jupyter:gpu | 3 | 4 × Nvidia V100 | 32 GB | 2 × Skylake 6148 | 746 000 MB | 40 |
| gpu | 2 | 8 × Nvidia V100 | 32 GB | 2 × Cascadelake 6252 | 362 000 MB | 48 |
| | 13 | 4 × Nvidia RTX5000 | 16 GB | 2 × Cascadelake 6242 | 172 000 MB | 32 |
| | 3 | 4 × Nvidia GTX1080 | 8 GB | 2 × Broadwell E5-2650v4 | 128 000 MB | 24 |
| gpu-int | 2 | 4 × Nvidia GTX980 | 4 GB | 2 × Broadwell E5-2650v4 | 128 000 MB | 24 |
| vis | 3 | 4 × Nvidia GTX980 | 4 GB | 2 × Broadwell E5-2650v4 | 128 000 MB | 24 |

The CPUs and GPUs

For partitions that have heterogeneous hardware, you can give Slurm options to request the particular hardware you want. For CPUs, pass a -C/--constraint option specifying the kind of CPU. For GPUs, specify the GPU name with the -G/--gpus option (or --gpus-per-task) and request the larger-VRAM variants with an additional -C/--constraint option. See Slurm and GPU Usage for more information.
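
As a hedged illustration of the two option types (the feature names come from the tables below; adjust the resources to your needs):

```bash
# GPU by name plus larger VRAM: an 80 GB A100 rather than a 40 GB one
srun --partition=grete:shared --gpus=A100:1 --constraint=80gb --pty /bin/bash

# Both kinds of constraint at once (Slurm combines features with '&')
srun --partition=grete:shared --gpus=A100:1 --constraint="80gb&milan" --pty /bin/bash
```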

The GPUs, the options to request them, and some of their properties are given in the table below.

| GPU | VRAM | FP32 cores | Tensor cores | -G option | -C option | Compute Cap. |
|-----|------|------------|--------------|-----------|-----------|--------------|
| Nvidia A100 | 40 GB | 6912 | 432 | A100 | | 80 |
| | 80 GB | 6912 | 432 | A100 | 80gb | 80 |
| Nvidia H100 | 94 GB | 8448 | 528 | H100 | 96gb | 90 |
| 2g.10gb slice of Nvidia A100 | 10 GB | 1728 | 108 | 2g.10gb | | 80 |
| 3g.20gb slice of Nvidia A100 | 20 GB | 2592 | 162 | 3g.20gb | | 80 |
| Nvidia V100 | 32 GB | 5120 | 640 | V100 | | 70 |
| Nvidia Quadro RTX 5000 | 16 GB | 3072 | 384 | RTX5000 | | 75 |
| Nvidia GeForce GTX 1080 | 8 GB | 2560 | | GTX1080 | | 61 |
| Nvidia GeForce GTX 980 | 4 GB | 2048 | | GTX980 | | 52 |
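
For the MiG slices, the slice name from the -G option column is used in place of a full GPU name. A hedged sketch (access to the partition depends on the account restrictions above):

```bash
# One 3g.20gb slice of an A100 for an interactive session
srun --partition=grete:interactive --gpus=3g.20gb:1 --cpus-per-task=8 \
     --time=04:00:00 --pty /bin/bash
```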

The CPUs, the options to request them, and some of their properties are given in the table below.

| CPU | Cores | -C option | Architecture |
|-----|-------|-----------|--------------|
| AMD Zen3 EPYC 7513 | 32 | zen3 or milan | zen3 |
| AMD Zen2 EPYC 7662 | 64 | zen2 or rome | zen2 |
| Intel Sapphire Rapids Xeon Platinum 8468 | 48 | sapphirerapids | sapphirerapids |
| Intel Cascadelake Xeon Gold 6252 | 24 | cascadelake | cascadelake |
| Intel Cascadelake Xeon Gold 6242 | 16 | cascadelake | cascadelake |
| Intel Skylake Xeon Gold 6148 | 20 | skylake | skylake_avx512 |
| Intel Broadwell Xeon E5-2650 V4 | 12 | broadwell | broadwell |
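
As a final sketch, a CPU constraint can also be used to pick a specific node flavour inside a heterogeneous partition, e.g. the Zen2 (Rome) nodes of grete:shared, which are the only 8-GPU, 128-core nodes. Here job.sh is a placeholder batch script:

```bash
# -C zen2 (or rome) selects the Zen2 nodes of the partition
sbatch --partition=grete:shared --constraint=zen2 --gpus=A100:2 --time=24:00:00 job.sh
```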

Hardware Totals

The total nodes, cores, GPUs, RAM, and VRAM for each cluster and sub-cluster are given in the table below.

| Cluster | Sub-cluster | Nodes | GPUs | VRAM (TiB) | Cores | RAM (TiB) |
|---------|-------------|-------|------|------------|-------|-----------|
| NHR | Grete Phase 1 | 3 | 12 | 0.375 | 120 | 2.1 |
| | Grete Phase 2 | 103 | 420 | 27.1 | 6,720 | 47.6 |
| | Grete Phase 3 | 16 | 64 | 6.0 | 1,536 | 15.7 |
| | TOTAL | 122 | 496 | 33.5 | 8,376 | 65.4 |
| SCC | TOTAL | 21 | 92 | 1.4 | 656 | 3.5 |