AlmaLinux 10 Testing Phase

Our upcoming hardware generation (planned to go online in the second half of 2026) will launch running AlmaLinux 10. Once it has been launched into production, the rest of the HPC compute and login nodes will also be upgraded to the new operating system.

This means we will be upgrading from the RHEL 8 compatible Rocky 8 straight to AlmaLinux 10, skipping Rocky/Alma 9, to reduce the number of potentially breaking changes affecting the users overall. We encourage all users to test their workloads to ensure a smooth transition and to benefit from the new hardware generation as soon as it is put into operation. In addition, we are grateful to receive feedback - both success stories and any obstacles encountered - via the ticket system (please include “AlmaLinux 10” or just “Alma 10” in the subject of your ticket).

Test Nodes

We have prepared test nodes that are already running AlmaLinux 10. At first, these nodes are open only to power users, who can test their codes and provide technical feedback; later we will widen the testing phase to all users. We will send a separate announcement when the test nodes open for regular users as well, so that everyone has the chance to make sure their problems (missing essential software, MPI problems, etc.) are addressed before the new operating system is rolled out cluster-wide. The test nodes have their own partitions:

Partition	User groups	Hardware
`standard96:el10`	NHR, SCC	10 Emmy Phase 2 nodes with SSDs
`medium96s:el10`	NHR, SCC	4 Emmy Phase 3 nodes
`grete:el10`	KISSKI, NHR, SCC	2 Grete nodes, each having 4 x A100 80 GiB VRAM

NHR and KISSKI jobs on these partitions will be “free of charge” (will be accounted as 0 core-hours) during the test phase.

These nodes are shared so multiple users can try them out at once. When the time to upgrade the rest of the system draws closer, more test nodes will be made available.
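As a minimal sketch of what a first test job might look like, the batch script below targets one of the test partitions from the table and reports which operating system and system compiler it finds. The job name, the time limit, and the helper function `os_id_version` are illustrative assumptions, and your project may require further options (for example an account).

```shell
#!/bin/bash
#SBATCH --partition=standard96:el10    # one of the test partitions listed above
#SBATCH --ntasks=1
#SBATCH --time=00:10:00                # assumed short limit for a smoke test
#SBATCH --job-name=alma10-smoke-test   # hypothetical job name

# Print "ID VERSION_ID" from an os-release style file (default: /etc/os-release).
# The file is sourced in a subshell so its variables do not leak into the job.
os_id_version() {
    ( . "${1:-/etc/os-release}" && printf '%s %s\n' "$ID" "$VERSION_ID" )
}

# Report the operating system of the node the job landed on.
if [ -r /etc/os-release ]; then
    os_id_version
fi

# Report the system compiler, if one is installed.
if command -v gcc >/dev/null 2>&1; then
    gcc --version | head -n 1
fi
```

Submitted with `sbatch`, the job output on a test node should name AlmaLinux with a 10.x version. The other partitions from the table can be tested by swapping the partition name.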

The Emmy P3 login node glogin13.hpc.gwdg.de is running the new operating system already. It has been removed from the alias glogin-p3.hpc.gwdg.de for the duration of the testing phase.

Please note that the nodes will likely be rebooted quite often, as new software or features are added, and their configurations are fixed and/or adjusted.

Software Changes

Software highlights of the AlmaLinux 10 upgrade:

System GCC 8 ⇒ 14
glibc 2.28 ⇒ 2.39
System Python 3.6 ⇒ 3.12
Linux kernel (with heavy backports) 4.18 ⇒ 6.12
OS packages compiled for x86-64-v1 ⇒ x86-64-v3 (AVX2 baseline)

A new software revision is available on the test nodes using the module system. Some of the highlights:

Updates to many packages
Main software compiled with
- GCC 13.4 (GPU nodes)
- GCC 15.2 (CPU nodes)
- Intel oneAPI Compilers 2025.3.2 (Intel CPU nodes)
- AMD Optimizing Compilers (AOCC) 5.1.0 (AMD CPU nodes)
MPI
- Intel oneAPI
  - Update 2021.14.0 ⇒ 2021.17.2
- OpenMPI
  - Update 4.1.7 ⇒ 4.1.8 as well as 5.0.10
  - New OPX provider
    - Required by the upcoming hardware generation
    - Better latency and bandwidth for small transfers
    - Slight degradation of bandwidth for large transfers
Python 3.13.13
CUDA 12.9.2 and 13.2.1
Numerical Libraries
- Intel MKL 2025.3.1
- AMD BLIS, libflame, FFTW, ScaLAPACK 5.3
- OpenBLAS 0.3.33

FUSE and Mount Namespaces

On the login nodes, each user has their own private mount namespace shared between all their sessions. On the compute nodes, each Slurm job runs in its own private mount namespace. Each of these private mount namespaces starts with its own initially empty /tmp, /var/tmp, and /dev/shm, visible only to that user or job. FUSE is now set up to allow FUSE mounts onto any directory you have write permission to, without first having to enter a user namespace using unshare. The nodes come with several packages for mounting various things via FUSE:

ratarmount for mounting many kinds of archives and filesystem images (tar, zip, 7z, SquashFS, FAT, ext4, etc.) as well as bind-mounting directories
bindfs for bind-mounting directories to other directories
erofsfuse for erofs
fuse2fs for ext2/3/4
fuse-overlayfs for OverlayFS
squashfuse for SquashFS
sshfs
s3fs
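As a minimal sketch of how such a mount can be used, the bash function below mounts a SquashFS image with squashfuse on a fresh directory under /tmp, runs a command inside it, and unmounts again with fusermount -u. The function name `with_squashfs`, the mount point template, and the image path in the usage comment are hypothetical.

```shell
# Mount a SquashFS image, run a command inside it, and unmount afterwards.
# Usage: with_squashfs IMAGE COMMAND [ARGS...]
with_squashfs() {
    local image mnt status
    image=$1
    shift

    # Use a fresh mount point under /tmp, which is private to the user or job.
    mnt=$(mktemp -d /tmp/sqfs.XXXXXX) || return 1

    # If the mount fails, remove the empty directory and give up.
    squashfuse "$image" "$mnt" || { rmdir "$mnt"; return 1; }

    # Run the command in a subshell so the caller's working directory is kept.
    ( cd "$mnt" && "$@" )
    status=$?

    # Unmount, then remove the now-empty mount point.
    fusermount -u "$mnt" && rmdir "$mnt"
    return "$status"
}

# Example (hypothetical image path):
# with_squashfs "$HOME/dataset.sqfs" ls -l
```

Because the function unmounts as soon as the command returns, no mount is left behind to keep a login session open.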

There are three important caveats to keep in mind when using FUSE:

Note

Do not mount onto directories that live on an NFS-mounted network filesystem (all HOME directories, and the PROJECT directories of NHR and KISSKI), because such mounts cannot be unmounted cleanly with fusermount -u due to root_squash on the NFS.
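One way to guard against this, sketched here under the assumption that GNU stat is available on the node, is to check the filesystem type of the intended mount point before mounting. The function names `is_nfs_fstype` and `check_mountpoint` are hypothetical.

```shell
# Succeed if the given filesystem type name denotes NFS.
is_nfs_fstype() {
    case "$1" in
        nfs|nfs4) return 0 ;;
        *)        return 1 ;;
    esac
}

# Return 0 if DIR is a safe place for a FUSE mount, 1 if it is on NFS,
# and 2 if the filesystem type could not be determined.
check_mountpoint() {
    local fstype
    # "stat -f -c %T" prints the type of the filesystem that holds the path.
    fstype=$(stat -f -c %T -- "$1") || return 2
    if is_nfs_fstype "$fstype"; then
        printf 'refusing to mount on %s: filesystem type is %s\n' "$1" "$fstype" >&2
        return 1
    fi
    return 0
}

# Example (hypothetical path):
# check_mountpoint /tmp/foo/bar && echo "ok to mount here"
```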

Note

On the login nodes, FUSE mounts that are still active keep your login session alive. This persistence can be very useful, as it lets you continue using the mount in other sessions, but it means you must unmount manually before the session can close properly. Each login node has a strict limit on the number of login sessions per user (currently 32), so be careful not to use up your whole limit, which would prevent you from logging in again.
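To find mounts you may have forgotten, one option is to scan the mount table for FUSE entries that belong to you. The sketch below assumes the usual Linux /proc/self/mounts layout, in which FUSE filesystems have a type of `fuse` or `fuse.<name>` and carry a `user_id=` mount option. The function name `list_fuse_mounts` is hypothetical.

```shell
# Print the mount points of all FUSE mounts owned by a given numeric UID.
# Usage: list_fuse_mounts [MOUNT_TABLE] [UID]
# Defaults: /proc/self/mounts and the UID of the calling user.
list_fuse_mounts() {
    local table uid
    table=${1:-/proc/self/mounts}
    uid=${2:-$(id -u)}
    # Fields: device, mount point, type, options, ...
    # Match type "fuse" or "fuse.<name>" (but not "fusectl") and an exact user_id.
    # Note: spaces in mount points appear as \040 in this table.
    awk -v uid="$uid" '
        $3 ~ /^fuse(\.|$)/ && $4 ~ ("(^|,)user_id=" uid "(,|$)") { print $2 }
    ' "$table"
}

# Example: unmount everything the function reports.
# list_fuse_mounts | while IFS= read -r m; do fusermount -u "$m"; done
```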

Warning

Job termination will hard close FUSE mounts in the job’s namespace, giving no chance for a clean unmount.

For read-only mounts, this doesn’t matter. But for read-write mounts, this could lead to corruption of the filesystem you mounted. Make sure to include the unmount commands at the end of your jobs! You can use #SBATCH --signal=... and a signal handler/trap to handle unmounting a specified time before the job would terminate due to reaching its walltime limit.

An example job script set up to receive SIGUSR1 (signal 10) 300 seconds before running out of time would be:

...
#SBATCH --signal=B:10@300

...

# Do FUSE mounts at /tmp/foo/bar and /tmp/this/that
...

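# Bash syntax is: trap 'COMMANDS' SIGNAL
# The handler stops the computation, waits for it to exit, then unmounts.
trap 'pkill -15 -f my-computation.sh ; wait ; fusermount -u /tmp/foo/bar ; fusermount -u /tmp/this/that' SIGUSR1

# Run the computation in the background and wait for it: bash delays trap
# handlers until a foreground command returns, but interrupts the wait builtin.
./my-computation.sh [ARGS] &
wait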

fusermount -u /tmp/foo/bar
fusermount -u /tmp/this/that