Floating point exception with Intel MPI 2019.x using one task per node
Problem
When Intel MPI 2019 (impi/2019.x) is used to start MPI jobs that run one task per node, the jobs abort immediately with a floating point exception:
srun: error: gcnXXXX: task 0: Floating point exception
srun: error: gcnYYYY: task 1: Floating point exception
[...]
Solution
This is caused by a problem in Intel’s Hydra process manager. As a workaround, set the environment variable I_MPI_HYDRA_TOPOLIB to ipl (the internal default is hwloc) in the job script before the MPI program is started:
#!/bin/bash
#SBATCH [...]
export I_MPI_HYDRA_TOPOLIB=ipl
[...]
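
If it is unclear whether the workaround is active in a given job, a small check can be placed directly after the export. The following is only a sketch: the check and its messages are illustrative additions and not part of the workaround itself. It assumes that the MPI tasks inherit the environment of the job script, which is the usual default.

```shell
#!/bin/bash
# Workaround from the Solution section: select the ipl topology library
# instead of the internal default (hwloc).
export I_MPI_HYDRA_TOPOLIB=ipl

# Illustrative check (an assumption, not part of the original workaround):
# stop early if the variable does not hold the expected value, so the job
# does not run into the floating point exception again.
if [ "${I_MPI_HYDRA_TOPOLIB:-}" = "ipl" ]; then
    echo "I_MPI_HYDRA_TOPOLIB is set to ipl"
else
    echo "I_MPI_HYDRA_TOPOLIB is not set to ipl" >&2
    exit 1
fi
```

The message appears in the job's standard output file, which makes it easy to confirm afterwards that the setting was applied.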