Hybrid MPI + OpenMP
Often, Message Passing Interface (MPI) and OpenMP (Open Multi-Processing) are used together in hybrid jobs, with MPI parallelizing between nodes and OpenMP within each node. This means codes must be compiled with both and carefully launched via Slurm so that the number of tasks and the number of cores per task are set correctly.
Example Code
Here is an example code that uses both:
#include <stdio.h>
#include <omp.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    int nthreads, tid;

    // Fork a team of threads giving them their own copies of variables
    #pragma omp parallel private(nthreads, tid)
    {
        // Obtain thread number
        tid = omp_get_thread_num();
        printf("Hello World from thread = %d, processor %s, rank %d out of %d processors\n",
               tid, processor_name, world_rank, world_size);

        // Only primary thread does this
        if (tid == 0)
        {
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }
    } // All threads join primary thread

    // Finalize the MPI environment.
    MPI_Finalize();
}
Compilation
To compile it, load the compiler and MPI modules and then combine the MPI compiler wrapper (see MPI) with the OpenMP compiler option (see OpenMP). For the example above, you would do one of the following, depending on the toolchain:
GCC + OpenMPI:
module load gcc
module load openmpi
mpicc -fopenmp -o hybrid_hello_world.bin hybrid_hello_world.c

Intel oneAPI compilers + Intel MPI:
module load intel-oneapi-compilers
module load intel-oneapi-mpi
mpiicx -qopenmp -o hybrid_hello_world.bin hybrid_hello_world.c

Intel oneAPI compilers + OpenMPI:
module load intel-oneapi-compilers
module load openmpi
mpicc -qopenmp -o hybrid_hello_world.bin hybrid_hello_world.c
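Before submitting a batch job, it can be useful to smoke-test the binary interactively. A minimal sketch, assuming an interactive allocation on a compute node with the same modules loaded as for compilation; the rank and thread counts here are arbitrary:

# 4 OpenMP threads per MPI rank, 2 ranks in total
export OMP_NUM_THREADS=4
mpirun -np 2 ./hybrid_hello_world.bin

Each rank then prints one Hello World line per thread plus a single thread-count line from its primary thread.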
Batch Job
When submitting the batch job, you have to decide how many separate MPI processes (tasks) you want to run per node and how many cores each task gets (usually the number of cores on the node divided by the number of tasks per node). The best way to do this is to explicitly set

-N <nodes>
for the number of nodes

--tasks-per-node=<tasks-per-node>
for the number of separate MPI processes you want on each node

-c <cores-per-task>
if you want to specify the number of cores per task (if you leave it out, Slurm will divide the cores evenly among the tasks)

and then run the code in the jobscript with mpirun, which will receive all the required information from Slurm (we do not recommend using srun).
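If you do set -c explicitly, Slurm additionally exports SLURM_CPUS_PER_TASK inside the job, and the OpenMP thread count can be taken directly from it. A minimal sketch of the relevant jobscript lines, assuming the same 2-tasks-per-node layout as in the full examples below; the value 48 is only illustrative, and whether -c counts physical cores or hyperthreads depends on the cluster's Slurm configuration:

#SBATCH --nodes=2
#SBATCH --tasks-per-node=2
#SBATCH -c 48
# one OpenMP thread per CPU assigned to each task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun ./hybrid_hello_world.bin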
To run the example above on two nodes, with 2 tasks per node and each task using all of its share of the physical cores (but not the hyperthreads), one would use one of the following job scripts, depending on the toolchain. In them, OMP_NUM_THREADS is computed by dividing the number of logical CPUs on a node (SLURM_CPUS_ON_NODE) by the number of tasks per node and then by 2, so that the hyperthreads stay unused.
GCC + OpenMPI:
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=2
#SBATCH --partition=standard96:test
module load gcc
module load openmpi
export OMP_NUM_THREADS=$(( $SLURM_CPUS_ON_NODE / $SLURM_NTASKS_PER_NODE / 2 ))
mpirun ./hybrid_hello_world.bin
Intel oneAPI compilers + Intel MPI:
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=2
#SBATCH --partition=standard96:test
module load intel-oneapi-compilers
module load intel-oneapi-mpi
export OMP_NUM_THREADS=$(( $SLURM_CPUS_ON_NODE / $SLURM_NTASKS_PER_NODE / 2 ))
mpirun ./hybrid_hello_world.bin
Intel oneAPI compilers + OpenMPI:
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=2
#SBATCH --partition=standard96:test
module load intel-oneapi-compilers
module load openmpi
export OMP_NUM_THREADS=$(( $SLURM_CPUS_ON_NODE / $SLURM_NTASKS_PER_NODE / 2 ))
mpirun ./hybrid_hello_world.bin
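To submit a job, save the variant that matches your toolchain to a file and hand it to sbatch (the file name here is just a placeholder):

sbatch hybrid_job.sh

By default, Slurm writes the program output to slurm-<jobid>.out in the submission directory.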