Apptainer (formerly Singularity)

Description

Apptainer (formerly Singularity) is a free, cross-platform and open-source computer program that performs operating-system-level virtualization, also known as containerization. One of the main uses of Apptainer is to bring containers and reproducibility to scientific computing and the high-performance computing (HPC) world.

The need for reproducibility requires the ability to move applications from system to system, which containers make possible.

Using Apptainer containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms.

To learn more about Apptainer itself please consult the Apptainer documentation.

Module

Load the modulefile

$ module load apptainer

This provides access to the apptainer executable, which can be used to download, build and run containers.
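
A quick way to check that the module is loaded correctly is to print the version:

apptainer --version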

Building Apptainer images and running them as containers

On NHR you can build Apptainer images directly on the login nodes.

For the SCC you should use a compute node instead, for example by starting an interactive job:

$ srun --partition int --pty bash
$ module load apptainer

If you have written a container definition foo.def, you can create an Apptainer image foo.sif (SIF, the Singularity Image Format) in the following way:

$ module load apptainer
$ apptainer build foo.sif foo.def

For writing container definitions, see the official documentation.
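
As a minimal illustration (the base image and the installed package are arbitrary choices, not a recommendation), a definition file could look like this:

Bootstrap: docker
From: ubuntu:24.04

%post
    apt-get update && apt-get install -y --no-install-recommends python3
    apt-get clean

%runscript
    python3 --version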

Example Jobscripts

Here are example job scripts for running a local Apptainer image (.sif).

Running Apptainer container

CPU job (NHR):

#!/bin/bash
#SBATCH -p medium40
#SBATCH -N 1
#SBATCH -t 60:00

module load apptainer

apptainer run --bind /mnt,/local,/user,/projects,$HOME,$WORK,$TMPDIR,$PROJECT $HOME/foo.sif

GPU job (NHR, Grete):

#!/bin/bash
#SBATCH -p grete:shared
#SBATCH -N 1
#SBATCH -G 1
#SBATCH -t 60:00

module load cuda
module load apptainer

apptainer run --nv --bind /mnt,/local,/user,/projects,$HOME,$WORK,$TMPDIR,$PROJECT $HOME/foo.sif

CPU job (SCC):

#!/bin/bash
#SBATCH -p medium
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 1:00:00

module load apptainer

apptainer run --bind /mnt,/local,/user,/projects,/home,/scratch,/scratch-scc,$HOME $HOME/foo.sif
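
Assuming one of these scripts has been saved as, for example, apptainer-job.sh (the name is arbitrary), it is submitted like any other batch job:

sbatch apptainer-job.sh
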
Note

If you have installed or loaded extra software before running the container, it can conflict with the containerized software. This can for example happen if $PERL5LIB is set after installing your own Perl modules.

In that case you can try to add the --cleanenv option to the apptainer command line, or in extreme cases run the container without bind-mounting your home directory (--no-home) and with the current working directory (cd [...]) set to a clean directory.

An even better permanent solution would be to make sure that your software has to be loaded explicitly, e.g. clean up ~/.bashrc and remove any conda initialization routines, only set $PERL5LIB explicitly via a script you can source, and only install Python packages in virtual environments. (This is strongly recommended in general, even if you don’t work with Apptainer.)
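
For example, using the foo.sif image from above (the directory name clean-dir is only illustrative):

module load apptainer
apptainer run --cleanenv foo.sif

# in extreme cases: run without the home directory, from a clean working directory
mkdir -p clean-dir && cd clean-dir
apptainer run --cleanenv --no-home foo.sif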


Examples

Several examples of Apptainer use cases will be shown below.

Jupyter with Apptainer

As an advanced example, you can pull and deploy the Apptainer image containing Jupyter.

  1. Create a New Directory
    Create a new folder in your $HOME directory and navigate to this directory.

  2. Pull the Container Image
    Pull a container image from a public registry such as DockerHub or quay.io. Here we will use the public image quay.io/jupyter/minimal-notebook. If pulling from quay.io is slow, you can alternatively build the container locally or pull it from DockerHub instead.

    To pull the image, use the following command:

    apptainer pull jupyter.sif docker://quay.io/jupyter/minimal-notebook

    Don’t forget to run module load apptainer beforehand.

  3. Submit the Job
    Once the jupyter.sif image is ready, you can submit the corresponding job to interact with the container. Start an interactive job (for more information, see the documentation on Interactive Jobs), start the container, and then start a Jupyter instance inside of it. Take note of the node your interactive job has been allocated on (or use the command hostname); we will need this in the next step.

    srun -p jupyter --pty -n 1 -c 8 bash
    srun: job 10592890 queued and waiting for resources
    srun: job 10592890 has been allocated resources
    u12345@ggpu02 ~ $ <- here you can see the allocated node
    module load apptainer

    Now, for a Jupyter container (here we are using one of the containers also used for Jupyter-HPC, but you can point to your own container):

    apptainer exec --bind /mnt,/sw,/user,/user_datastore_map,/projects /sw/viz/jupyterhub-nhr/jupyter-containers/jupyter.sif jupyter notebook --no-browser

    For an RStudio container:

    apptainer exec --bind /mnt,/sw,/user,/user_datastore_map,/projects /sw/viz/jupyterhub-nhr/jupyter-containers/rstudio.sif jupyter-server

    (the bind paths might not be necessary or might be different depending on what storage you need access to).

    Both of these will produce a long output, but the relevant part is a line that includes the address of the spawned Jupyter or RStudio server process. It will look something like the following (we will need it in the next step, so make a note of it!):

    http://localhost:8888/rstudio?token=c85db148d85777f7d0c2e2df876351ff19872c593027b4a2

    The port used will usually be 8888; make a note of it if it’s a different value.

  4. Access the Jupyter Notebook
    In order to access the notebook you need to port-forward the port of the Jupyter server process from the compute node to your local workstation. Open another shell on your local workstation and run the following SSH command:

    ssh -NL 8888:localhost:8888 -o ServerAliveInterval=60 -i YOUR_PRIVATE_KEY -J LOG_IN_NODE -l YOUR_HPC_USER HOSTNAME

    Replace HOSTNAME with the name of the allocated node (the value returned by hostname earlier), YOUR_PRIVATE_KEY with the path to the private SSH key you use to access the HPC, YOUR_HPC_USER with your username on the HPC, and LOG_IN_NODE with the domain name of the login node you regularly use. An example of this might look like:

    ssh -NL 8888:localhost:8888 -o ServerAliveInterval=60 -i ~/.ssh/id-rsa -J glogin-p2.hpc.gwdg.de -l u12345 ggpu02

    While your job is running, you can now access the spawned Jupyter server process via http://localhost:8888/ in a browser on your local computer (or paste the full address we obtained in the previous step).

NVIDIA GPU Access Within the Container

See also GPU Usage

Many popular Python applications using GPUs will simply pull in the CUDA/cuDNN runtimes as a dependency, so nothing special is required when building them:

Bootstrap: docker
From: condaforge/miniforge3

%environment
    export TINI_SUBREAPER=true

%post
    conda install --quiet --yes notebook jupyterlab jupyterhub ipyparallel
    pip install torch torchvision torchaudio

Save this as pytorch.def, then build the image via:

module load apptainer
apptainer build pytorch.sif pytorch.def

You can test the image on one of the GPU partitions. The nvidia-smi command should display some information about the allocated GPU or GPU slice. The GPU drivers are bound into the container because of the --nv option.

Running Apptainer container

On NHR (Grete interactive partition):

srun --pty -p grete:interactive -G 1g.10gb bash -c \
    "module load apptainer; apptainer shell --nv pytorch.sif"

nvidia-smi

Note: For a real compute job you should use a non-interactive partition and specify what type of GPU you want. The interactive partition is just used here for quick testing.

On the SCC (jupyter partition):

srun --pty -p jupyter -G 1 bash -c \
    "module load apptainer; apptainer shell --nv pytorch.sif"

nvidia-smi

Note: For a real compute job you should use the scc-gpu partition and specify what type of GPU you want. The interactive jupyter partition is just used here for quick testing.

You can also verify that CUDA works via PyTorch by running the following:

python

import torch
torch.cuda.is_available()
torch.cuda.device_count()

This should return True and a number greater than 0.
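
Equivalently, the same check can be run non-interactively from outside the container:

apptainer exec --nv pytorch.sif python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"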

You can also use this container with our JupyterHPC service, since we installed all the necessary additional packages with our recipe.

Hints for Advanced Users

If you want to build your own CUDA software from source, you can for example start with a container that already has the CUDA and cuDNN runtimes installed on the system level:

Bootstrap: docker
From: nvidia/cuda:12.8.0-cudnn-runtime-ubuntu24.04

%post
    [...]

When running the container don’t forget the --nv option.
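
For example, a quick sanity check could look like this (the image name cuda-app.sif is only a placeholder for whatever you built from the definition above):

module load apptainer
apptainer exec --nv cuda-app.sif nvidia-smi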

It might also be possible to re-use the host's CUDA runtime from the module system. The necessary environment manipulations should be similar to the following:

export PATH=$PATH:$CUDA_MODULE_INSTALL_PREFIX/bin
export LD_LIBRARY_PATH=$CUDA_MODULE_INSTALL_PREFIX/lib64:$CUDNN_MODULE_INSTALL_PREFIX/lib:$LD_LIBRARY_PATH
export CUDA_PATH=$CUDA_MODULE_INSTALL_PREFIX
export CUDA_ROOT=$CUDA_MODULE_INSTALL_PREFIX
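
For those paths to actually exist inside the container, the corresponding installation directories from the host also have to be bind-mounted. A rough sketch, treating $CUDA_MODULE_INSTALL_PREFIX and $CUDNN_MODULE_INSTALL_PREFIX as placeholders for the installation prefixes of the loaded modules (they are not variables the modules necessarily set):

module load cuda
module load apptainer
apptainer shell --nv --bind $CUDA_MODULE_INSTALL_PREFIX,$CUDNN_MODULE_INSTALL_PREFIX pytorch.sif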

Since this method requires bind-mounting (--bind) software from the host system into the container, it will stop the container from working on any other system and is therefore not recommended.


Tools in Containers

Multiple tools, libraries and programs are now provided as Apptainer containers in /sw/container under various categories, such as /sw/container/bioinformatics. These containers allow us to provide tools that might otherwise be difficult to install with our main package-management framework, Spack. They also provide good examples for users who want to set up their own containers.

You can get an overview by executing ls -l /sw/container/bioinformatics.

To use these on the SCC system you can use the following in your job scripts, e.g. to use the latest installed ipyrad version with 16 CPU cores:

module load apptainer

apptainer run --bind /mnt,/local,/user,/projects,/home,/scratch,/scratch-scc,$HOME \
    /sw/container/bioinformatics/ipyrad-latest.sif \
    ipyrad -p params-cool-animal.txt -s 1234 -c 16

You can do this in an interactive session or in a Slurm job script. We do not recommend executing heavy containers on login nodes.
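
If you want to see how one of these containers was built, you can inspect its definition file, e.g. for the ipyrad container used above:

module load apptainer
apptainer inspect --deffile /sw/container/bioinformatics/ipyrad-latest.sif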

Currently we offer containers for:

  • Bioinformatics: agat, cutadapt, deeptools, homer, ipyrad, macs3, meme, minimap2, multiqc, Trinity, and umi_tools.
  • Quantum Simulators: see the corresponding page
  • JupyterHub containers, including RStudio. These can also be accessed through the JupyterHub service, or run directly like any other Apptainer container as explained in this page.

In the future these containers will also appear as pseudo-modules, for discoverability and visibility purposes.

Tip

The \ character at the end of a line (make sure there are no spaces or tabs after it) signifies that the command continues on the next line. This can be used to visually separate the Apptainer-specific part of the command (apptainer … .sif) from the application-specific part (ipyrad … 16) for better readability.


Slurm in Containers

Usually, you would submit a Slurm job and start your Apptainer container within it, as we have seen in the previous examples. Sometimes, however, it's the other way around, and you want to access Slurm (get cluster information or submit jobs) from within the container; the most notable example is probably a Jupyter container which you access interactively. Getting access to Slurm and its commands from within the container is not very difficult and only requires a few steps.

Firstly, your container should be based on Enterprise Linux (e.g. Rocky Linux, AlmaLinux, CentOS, Red Hat, Oracle Linux) in version 8 or 9, as those two are currently the only operating systems for which we provide Slurm. Running Slurm commands in a (for example) Debian-based container following the steps below might work to an extent, but it is quite possible that required library versions differ between the distributions. Therefore, such setups are not supported.

Next, during the container build phase, you’ll need to add the slurm user and its group, with a UID and GID of 450, e.g.

groupadd --system --gid 450 slurm
useradd --system --uid 450 --gid 450 --home-dir /var/lib/slurm slurm

If you plan to use job submission commands like srun, salloc or sbatch, you’ll also need to install the Lua library, as those commands contain a Lua-based preprocessor, e.g.

dnf install -y lua-libs

Lastly, to run Slurm within the container, you’ll need to add the following bindings, which contain the Slurm binaries and the munge socket for authentication:

--bind /var/run/munge,/run/munge,/usr/lib64/libmunge.so.2,/usr/lib64/libmunge.so.2.0.0,/usr/local/slurm,/opt/slurm

With that, you’ll have access to Slurm from within the container, located at /usr/local/slurm/current/install/bin. You can for example add this directory to the PATH variable in your container recipe’s %environment section:

%environment
    export PATH="/usr/local/slurm/current/install/bin:${PATH}"