CP2K

Description

CP2K is a package for atomistic simulations of solid state, liquid, molecular, and biological systems, offering a wide range of computational methods based on the mixed Gaussian and plane waves approaches.

More information about CP2K and its documentation can be found at https://www.cp2k.org/.

Availability

CP2K is freely available for all users under the GNU General Public License (GPL).

Modules

CP2K is an MPI-parallel application. Launch it with srun or mpirun as shown in the example job scripts below.

The available versions are listed below with their modulefile, required modules, supported features, the targeted hardware (CPU / GPU), and their availability on Lise / Emmy (✅ = yes, ❌ = no).

CP2K version: 2022.2
Modulefile: cp2k/2022.2
Requirement: intel/2021.2 (Lise), intel/2022.2 (Emmy)
Support: libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori and libbqb
CPU / GPU: ✅ / ❌
Lise / Emmy: ✅ / ✅

CP2K version: 2023.1
Modulefile: cp2k/2023.1
Requirement: intel/2021.2 (Lise), intel/2022.2 (Emmy)
Support (Lise): libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori and libbqb
Support (Emmy): libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl and sirius
CPU / GPU: ✅ / ❌
Lise / Emmy: ✅ / ✅

CP2K version: 2023.1
Modulefile: cp2k/2023.1
Requirement: openmpi/gcc.11/4.1.4, cuda/11.8
Support: libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc
CPU / GPU: ❌ / ✅
Lise / Emmy: ✅ / ❌

CP2K version: 2023.2
Modulefile: cp2k/2023.2
Requirement: intel/2021.2, impi/2021.7.1
Support: libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori and libbqb
CPU / GPU: ✅ / ❌
Lise / Emmy: ✅ / ❌

CP2K version: 2023.2
Modulefile: cp2k/2023.2
Requirement: openmpi/gcc.11/4.1.4, cuda/11.8
Support: libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc
CPU / GPU: ❌ / ✅
Lise / Emmy: ✅ / ❌
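
As a quick check (a minimal sketch; the module names are taken from the table above, and the --version flag of the cp2k.psmp binary is assumed), you can load one of the builds and confirm which executable it provides:

# Load the CPU build of CP2K 2023.2 on Lise and verify the executable
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
which cp2k.psmp
cp2k.psmp --version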

Remark: CP2K needs special attention when running on GPUs.

  1. Check whether a considerable acceleration is expected for your problem. For example, a performance degradation has been reported for the following test cases: https://www.cp2k.org/performance:piz-daint-h2o-64, https://www.cp2k.org/performance:piz-daint-h2o-64-ri-mp2, https://www.cp2k.org/performance:piz-daint-lih-hfx, https://www.cp2k.org/performance:piz-daint-fayalite-fist

  2. GPU pinning is required (see the example job script below). Don't forget to make the script that takes care of the GPU pinning executable; in the example, this is done with: chmod +x gpu_bind.sh. A quick way to verify the resulting rank-to-GPU mapping is sketched after this list.
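
As a sanity check (a minimal sketch reusing the gpu_bind.sh wrapper and the pinning options from the GPU job script below; the echoed variables are set by Open MPI and by the wrapper), you can print the GPU assigned to each MPI rank before launching the actual calculation:

# Print the rank-to-GPU mapping instead of running cp2k.psmp
mpirun --bind-to core --map-by numa:PE=${SLURM_CPUS_PER_TASK} ./gpu_bind.sh \
    bash -c 'echo "rank $OMPI_COMM_WORLD_RANK -> CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'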

Using CP2K as a library

Starting with version 2023.2, CP2K has been compiled with the option that allows it to be used as a library: libcp2k.a can be found in $CP2K_LIB_DIR, the header libcp2k.h is located in $CP2K_HEADER_DIR, and the module files (.mod), which Fortran users may need, are in $CP2K_MOD_DIR.

For more details, please refer to the CP2K documentation.
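
As a quick orientation (a minimal sketch; my_prog.c is a hypothetical placeholder, and the exact set of additional libraries needed at link time depends on how the module was built), the following shows where the library artifacts live and the kind of compile/link flags a C code would typically use:

# Inspect the library artifacts exposed by the module (the paths are set by the modulefile)
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
ls "$CP2K_LIB_DIR"/libcp2k.a
ls "$CP2K_HEADER_DIR"/libcp2k.h
ls "$CP2K_MOD_DIR" | head

# Hypothetical compile/link line for a C program using libcp2k; Fortran codes would use
# -I"$CP2K_MOD_DIR" instead of the header directory, and further CP2K dependencies may
# have to be added to the link line.
mpiicc -I"$CP2K_HEADER_DIR" my_prog.c -L"$CP2K_LIB_DIR" -lcp2k -o my_prog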

Example job scripts

Job script for CP2K 2023.2 on CPU (Lise, Intel MPI), launched with srun:

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
 
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
 
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
srun cp2k.psmp input > output

Job script for CP2K 2023.2 on CPU (Lise, Intel MPI), launched with mpirun and explicit CPU pinning:

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
 
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} 
 
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
 
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
mpirun cp2k.psmp input > output

Job script for CP2K 2023.2 on GPU (Lise A100 partition, Open MPI and CUDA), launched with mpirun and GPU pinning via gpu_bind.sh:

#!/bin/bash
#SBATCH --partition=gpu-a100 
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --job-name=cp2k
 
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}   
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
module load gcc/11.3.0 openmpi/gcc.11/4.1.4 cuda/11.8 cp2k/2023.2
 
# gpu_bind.sh (see the following script) should be placed inside the same directory where cp2k will be executed
# Don't forget to make gpu_bind.sh executable by running: chmod +x gpu_bind.sh
mpirun --bind-to core --map-by numa:PE=${SLURM_CPUS_PER_TASK} ./gpu_bind.sh cp2k.psmp input > output

Job script for CP2K 2023.1 on CPU (Emmy, Intel MPI), launched with srun:

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
 
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
 
module load intel/2022.2 impi/2021.6 cp2k/2023.1
srun cp2k.psmp input > output

The GPU pinning wrapper gpu_bind.sh:

#!/bin/bash
# Map each MPI rank to one GPU using the local rank provided by Open MPI,
# then execute the command passed as arguments (here: cp2k.psmp input).
export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
"$@"

Depending on the problem size, the code may stop with a segmentation fault caused by an insufficient stack size or by threads exceeding their stack space. To avoid this, we recommend adding the following lines to the job script:

export OMP_STACKSIZE=512M
ulimit -s unlimited
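
For example, in the first CPU job script above these lines would go right before the launch command (a sketch reusing that script's modules and srun line; 512M is a starting value):

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_STACKSIZE=512M
ulimit -s unlimited

module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
srun cp2k.psmp input > output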