Too many open files

When using srun --propagate

A process started with “srun” using the “--propagate” option fails with “Too many open files”. This has been happening since the Slurm upgrade to version 21.

Slurm version 21 will run the compute process with a hard open file limit (RLIMIT_NOFILE) of only 4096. See also https://github.com/SchedMD/slurm/commit/18b2f4fff3f8fd5773ab14ec631bbd5f2995fa6e

Solution

Add NOFILE to --propagate. See also man 1 srun.

Example:

$ srun --propagate=STACK,NOFILE ...

instead of

$ srun --propagate=STACK ...
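
To check which open file limit a process actually gets, the soft and hard values of RLIMIT_NOFILE can be printed with the shell builtin ulimit. The following is only a minimal sketch; the function name show_nofile_limits is made up for this illustration and is not part of Slurm.

```shell
# Print the soft and hard open file limits (RLIMIT_NOFILE) of the current shell.
# show_nofile_limits is a hypothetical helper name used only for this example.
show_nofile_limits() {
    # ulimit -Sn reports the soft limit, ulimit -Hn the hard limit.
    printf 'soft=%s hard=%s\n' "$(ulimit -Sn)" "$(ulimit -Hn)"
}

show_nofile_limits
```

Running the same two ulimit calls once in the submitting shell and once inside a job step (for example via srun --propagate=STACK,NOFILE sh -c 'ulimit -Sn; ulimit -Hn') should show whether the limits of the submitting shell were propagated or whether the step is still capped at 4096.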