Lustre

There are two separate Lustre filesystems, one at each of our data center locations, to provide maximum local performance.

While the data centers are physically close to each other, connections between the buildings have higher latency and lower throughput.

The best performance is achieved by using the Lustre filesystem located in the same building as your compute nodes, particularly for IOPS.

The Lustre filesystems are optimized for high input/output bandwidth from many nodes simultaneously and for a moderate number of large files, i.e. hot data that is actively used by compute jobs.

Lustre MDC

Located in the mobile data center and connected to Emmy Phase 2. It is replacing the old scratch-emmy / lustre-emmy system and will be available later in 2025. The total capacity is 1.6 PiB.

Lustre RZG

Located in the RZGö and connected to Grete and Emmy Phase 3. The total capacity is 509 TiB.

Performance

The best performance can be reached with sequential IO of large files aligned to the full-stripe size of the underlying RAID6 (1 MiB).
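As an illustration, a sequential write can be issued in 1 MiB blocks so that it lines up with the RAID6 full-stripe size. The file path and size below are only placeholders.

# write a 4 GiB test file sequentially in 1 MiB blocks (path and count are placeholders)
dd if=/dev/zero of=/path/to/lustre/testfile bs=1M count=4096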

If you access a large file (1+ GiB) from multiple nodes in parallel, please consider setting the striping of the file with the Lustre command lfs setstripe. Use a sensible stripe count (recommended: up to 32) and a stripe size that is a multiple of the RAID6 full-stripe size (1 MiB) and matches the IO sizes of your job.

This can be done for a specific file or for a whole directory. However, the settings apply only to newly created files, so applying a new striping to an existing file requires a file copy.

An example of setting the stripe size and count is given below (run man lfs-setstripe for more information about the command).

lfs setstripe --stripe-size 1M --stripe-count 16 PATH
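Striping can also be set on a directory, in which case new files created inside it inherit the settings, and an existing file can be re-striped by copying it into such a directory. The directory and file names below are only placeholders.

# set a default striping on a directory; new files created in it inherit these settings
lfs setstripe --stripe-size 1M --stripe-count 16 /path/to/dir

# check the current striping of a file or directory
lfs getstripe /path/to/dir

# re-stripe an existing file by copying it into the striped directory
cp bigfile /path/to/dir/bigfile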