Getting Started

Assumptions

Secure HPC aims to assume as little trust as possible, in line with modern zero-trust architectures. Nonetheless, two systems have to be trusted by definition:

  1. The HPC System’s Image Server: We assume that the image server, which is part of the HPC system, is secure. It is located in a highly secure area of the HPC system, protected by multiple layers of security, and accessible only to a few essential services and administrators. This secure location helps us trust that the image server is safe from unauthorized access.
  2. The User’s Personal System (Secure Client): We also assume that your personal system, such as a laptop or workstation, is secure. This is crucial because your data begins its journey on your local system before being sent to Secure HPC.
Warning

It is important to understand that the secure client must be a system you fully trust. If your local system is not secure, your data could be compromised before it even reaches the secure workflow of Secure HPC. This is why we emphasize the term secure client: it signifies that your local system must be safeguarded with utmost care to ensure the overall security of your data.

These assumptions are essential: trust in the overall system comes from knowing that both the initial and final stages of the process are protected.


Prerequisites

  • Access to the HPC system: An account on the HPC system is required. If you don’t have one, please refer to our getting an account page.

  • Linux-based client for uploading: Since our client software currently runs only on Linux, a Linux-based operating system (such as Ubuntu, Debian, or Fedora) is required.

  • Basic experience with job submission via Slurm: Users should be familiar with the job submission process and the Slurm workload manager. Please refer to our Slurm documentation for more details.

Application

In order to use Secure HPC, our admins have to provide you with:

  • Secure HPC Allocation: In order to provide a fully isolated node, each Secure HPC job gets its own exclusive node allocation.

  • Access to our HashiCorp Vault server: We will provide you with a token to authenticate against our Vault Key Management Server (KMS); if you have not received one yet, please contact us. The token has to be placed in a specific directory named secret (see the Installation of required Software step in the Installation Guidelines); a sketch of this is shown below.
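
For example, assuming you saved the token we sent you as vault.token, placing it could look like this (<local_uid> is the same placeholder used in the Debug Vault KMS Access section below):

# Hypothetical example: vault.token is the token file you received from us
mkdir -p secret
cp vault.token secret/<local_uid>.token

# Restrict read access to your user
chmod 600 secret/<local_uid>.token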

Still, you can prepare for Secure HPC even before you have access to the machines. There are two preparation steps:

  1. Port your current scripts and data to an Apptainer-based workflow
  2. Install the Secure HPC client on your secure Linux device

If you are still unsure whether Secure HPC fits your use case, see the Use Cases & Requirements section in our docs.

1. Port your current scripts and data to an Apptainer-based workflow

For this, Apptainer has to be installed on your local device. For installation instructions, visit the Apptainer installation guide.

As a basis, you can use this Apptainer definition, which is based on an Ubuntu Docker image (container.def):

Bootstrap: docker
From: ubuntu

%post
    export DEBIAN_FRONTEND=noninteractive
    # Install any packages you need here
    apt-get update && apt-get install -y --no-install-recommends \
        curl \
        wget \
        ca-certificates \
        tree \
        python3 \
        pkg-config

    # Any optional setup, custom libraries, etc.
    echo "Additional stuff... pip, Rscript..."

%runscript
    # This will be ignored, as we overwrite it later in the invocation
    exec echo "Hello World"

This can then be built from the command line via

apptainer build container.sif container.def
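
To verify the build, you can run a quick command inside the freshly built container, for example checking that the Python interpreter installed in the %post step is available:

apptainer exec container.sif python3 --version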

The usual pattern is a largely stateless container (containing mostly the required software packages), with the data (and possibly scripts) provided via a bind mount that is encrypted later in the workflow.

Next, let’s assume you put your data and your Python script mycomputation.py into your ~/mydata folder. You can then mount the data into the previously built container and set the run command as follows:

# Description of all parameters:
# - `--nv` for NVIDIA GPU support, in case you have a local graphics card
#
# - `--bind ~/mydata:/output/` says that the folder ~/mydata on the host system
#   should be available as /output in the container, as a writable mount
#
# - `./container.sif` is the previously built container, based on the
#   `./container.def` description
#
# - `bash -c '...'` overwrites the `%runscript` in the `./container.def`
#   with your custom command, allowing for a more dynamic iteration cycle

apptainer exec \
    --nv \
    --bind ~/mydata:/output/ \
    ./container.sif \
    bash -c 'python3 /output/mycomputation.py > /output/computation.log 2>&1'

Note that, since mycomputation.py runs inside the container, it has to load its data from the /output prefix, as illustrated below.
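
As an illustration, here is a minimal, hypothetical mycomputation.py following this convention; the file names input.csv and result.txt are placeholders. You can create it from the shell like so:

# Hypothetical example script: all file paths use the /output prefix,
# since ~/mydata is mounted there inside the container
cat > ~/mydata/mycomputation.py <<'EOF'
with open('/output/input.csv') as f:
    data = f.read()
with open('/output/result.txt', 'w') as f:
    f.write(f'processed {len(data)} bytes\n')
EOF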

Once that is working, porting it to Secure HPC is a fast and easy process.

2. Install the Secure HPC client on your secure Linux device

Next, you can already install the Secure HPC client on your device, although it won’t be able to submit jobs without the authentication keys provided by us.

The steps are as follows:

  1. Installation of required Software:

    • Git: Version control system for managing code. For installation instructions, visit the Git installation guide.

    • Apptainer (formerly Singularity): Container platform for running applications in isolated environments. For installation instructions, visit the Apptainer installation guide.

    • HashiCorp Vault: Follow the instructions on the official website.

    • Cryptsetup: Install it via your distribution’s package manager:

      • On Debian-based systems (Ubuntu, Mint, etc.):

        sudo apt-get update
        sudo apt-get install cryptsetup
      • On RHEL-based systems (Rocky Linux, Fedora, etc.):

        sudo dnf update
        sudo dnf install cryptsetup

GPG is available by default on most Linux distributions.
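
To confirm that all required tools are installed, you can check their versions:

git --version
apptainer --version
vault --version
cryptsetup --version
gpg --version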

  2. Clone the Secure HPC Git repository: Open a terminal and clone the Secure HPC Git repository into the secure client’s home directory with the following command:

    git clone https://github.com/gwdg/secure-hpc.git
  3. As far as possible, adapt the template to your use case:

  • If you have already containerized your job as described above, replace the container.def content with your Apptainer recipe.

  • In command.sh.template, replace the singularity bash call with your own command.

  • Place your own data in the data directory (which, by default, will be mounted into the container as /output).

  • In secure_sbatch, set USER, LOGIN_NODE, and EXEC_DIR to your preferred values. Note that EXEC_DIR will contain Slurm logs, so it should not be publicly accessible.

  4. Create your local GPG key:

    • Generate GPG Key Pair:

      gpg --full-generate-key

      Follow the prompts to create your key pair.

    • Upload Public Key to Vault: Use the instructions provided by your HPC administrator to upload your public key to Vault; a sketch of exporting the key follows after this list.
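
As a sketch, listing and exporting your public key might look like this; you@example.com is a placeholder for the address you used during key generation, and the actual upload procedure depends on your administrator’s instructions:

# List your keys to find the one you just created
gpg --list-keys

# Export the public key in ASCII-armored form
gpg --armor --export you@example.com > my_pubkey.asc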

Once you have created your key pair and received your Vault token, you’ll be able to start using Secure HPC.

Debug Vault KMS Access

Note: This assumes that you have already received your Vault token.

Tip

Follow these steps if you want to verify that your token is valid:

  1. Set the Vault Server Address: export VAULT_ADDR='https://kms.hpc.gwdg.de:443'

  2. Set the token: export VAULT_TOKEN=$(cat secret/<local_uid>.token)

  3. Check Token Lookup: vault token lookup

    • If the token is valid, you will see output with details about the token (such as its policies and expiration).
    • If the token is invalid, you will see an error message. Please report it to the administrator so we can fix it.