Home

This is the GWDG High Performance Computing (HPC) Documentation, organized by topic both in the navigation bar to the left and below. It applies to the entire Unified HPC System in Göttingen, which includes the Emmy CPU system and the Grete GPU system. The unified system serves many groups of users, including NHR, KISSKI and SCC users.

If you are new to HPC in general, we welcome you to our system and recommend the Start Here section.

In case you are returning to the system or have previous experience with other HPC systems, you might start from the How to use… section.

If you are searching for something specific, feel free to use the search function or browse through the pages.

If you have questions or run into problems, you can create a support ticket by sending an email with your question or problem to the appropriate support address.

Subsections of Home

Start Here

Use the information provided here to get started with using the HPC clusters at GWDG, which are arranged by topic in the sidebar and below.

You are from a specific science domain, have not used HPC before or are switching from another cluster? We have special starting pages for you! Search the list below to find your science domain and check what we offer for you specifically.

If you have questions or run into problems, you can create a support ticket by sending an email with your question or problem to the appropriate email address in the table below:

Email Address | Purpose
hpc-support@gwdg.de | General questions and problems (when in doubt, use this)
nhr-support@gwdg.de | For NHR users
kisski-support@gwdg.de | For KISSKI users
support@gwdg.de | Non-HPC issues (e.g. VPN)

Additionally, if you are on Matrix, you can discuss your issues or ask for help in the Matrix room.


Subsections of Start Here

Getting An Account

There are multiple High Performance Computing (HPC) systems hosted in Göttingen that are available to different groups of users. In order to access any of them, as well as our AI services, you need an Academic Cloud account. Information on how to log in or how to register a new account can be found in the GWDG Docs.

NHR@Göttingen

NHR Application Process

KISSKI

KISSKI application process

Tip

You need an Academic Cloud account to access services like Chat AI. Use the federated login or create a new account. Details are on this page.

Applying for KISSKI resources requires a request for a specific service from the KISSKI Services page. An application can be started by clicking the “book” button. Selected offers are:

Further services, including consulting and development services, can be found in the KISSKI service catalogue. More details about the application for compute resources and Chat-AI can be found in the KISSKI application process.

Scientific Compute Cluster (SCC)

Access to the HPC system is managed via the HPC Project Portal.

It is designed to allow project leaders to manage access to the project’s resources (compute, storage), add or remove members, and more. Each leader of a work group, institute, or faculty at the Georg-August University of Göttingen or the Max Planck Society is eligible to request a long-running, generic project in the HPC Project Portal to grant HPC access to their employees and PhD students. To collaborate with externals or people not employed by your institute, please request dedicated HPC Portal projects for each “real-world” research project. Similarly, for students who would like to use HPC resources for their theses, the supervisor should request an HPC project.

See Applying for HPC Portal projects on the SCC for more information.

Restrictions

Please note that our ability to grant access to the HPC system is subject to export control regulations.

Cluster Overview

The GWDG HPC Cluster is composed of several islands with similar hardware. Projects and user accounts are grouped based on the purpose of the computations and on the association with the groups/institutions that provided funding for parts of the cluster.

Project/Account Groups and Purposes

SCC

The SCC (Scientific Compute Cluster) provides HPC resources for

  • Georg-August University, including UMG
  • Max Planck Society
  • and other related research institutions

NHR (formerly HLRN)

NHR-NORD@Göttingen (NHR for short) is one NHR center in the NHR (Nationales Hochleistungsrechnen) Alliance of Tier 2 HPC centers, which provide HPC resources to all German universities (application required). NHR-NORD@Göttingen was previously part of HLRN (Norddeutscher Verbund für Hoch- und Höchstleistungsrechnen) IV, NHR’s predecessor, which provided HPC resources for universities in Northern Germany.

KISSKI

The KISSKI project provides AI compute resources to critical and sensitive infrastructure.

REACT

The REACT program is an EU-funded initiative to support various economic and social developments; it funded one of our GPU partitions.

Institution/Research-Group Specific

Some institutions and research groups have their own dedicated islands or nodes as part of GWDG’s HPC Hosting service, in addition to being able to use SCC resources. These islands or nodes can include dedicated storage systems, login nodes, and compute partitions.

An example is the CIDBN island (also known as “Sofja”) with its own storage system, dedicated login nodes, and CPU partition.

DLR CARO

The other cluster GWDG operates is DLR CARO, which is for exclusive use by DLR employees. CARO is not connected to the GWDG HPC Cluster in any way. CARO’s documentation is only available on the DLR intranet. If you are a CARO user, you should go to its documentation and ignore the rest of this site.

Islands

The nodes can be grouped into islands that share the same/similar hardware, are more closely networked together, and have access to the same storage systems with similar performance. General CPU node islands are called “Emmy Phase X” where X indicates the hardware generation (1, 2, 3, …). General GPU node islands are called “Grete Phase X” where X indicates the hardware generation (1, 2, 3, …). The islands with a brief summary of their hardware are

Island | CPUs | GPUs | Fabric
Emmy Phase 3 | Intel Sapphire Rapids | - | Omni-Path (100 Gb/s)
Emmy Phase 2 | Intel Cascade Lake | - | Omni-Path (100 Gb/s)
Emmy Phase 1 | Intel Skylake | - | Omni-Path (100 Gb/s)
Grete Phase 3 | Intel Sapphire Rapids | Nvidia H100 | Infiniband (2 × 200 Gb/s)
Grete Phase 2 | AMD Zen 3, AMD Zen 2 | Nvidia A100 | Infiniband (2 × 200 Gb/s)
Grete Phase 1 | Intel Skylake | Nvidia V100 | Infiniband (100 Gb/s)
SCC Legacy (CPU) | Intel Cascade Lake, Intel Skylake | - | Omni-Path (100 Gb/s), none (Ethernet only)
SCC Legacy (GPU) | Intel Cascade Lake | Nvidia V100, Nvidia Quadro RTX 5000 | Omni-Path (2 × 100 Gb/s), Omni-Path (100 Gb/s)
CIDBN | AMD Zen 2 | - | Infiniband (100 Gb/s)
FG | AMD Zen 3 | - | RoCE (25 Gb/s)
SOE | AMD Zen 2 | - | RoCE (25 Gb/s)
Info

See CPU partitions and GPU partitions for the Slurm partitions in each island.

See Logging In for the best login nodes for each island (other login nodes will often work, but may have access to different storage systems and their hardware will be less of a match).

See Cluster Storage Map for the storage systems accessible from each island and their relative performance characteristics.

See Software Stacks for the available and default software stacks for each island.

Legacy SCC users only have access to the SCC Legacy island unless they are also CIDBN, FG, or SOE users in which case they also have access to those islands.

Types of User Accounts

Do you already have an HPC-enabled account, but are unsure what kind of account you have and which login nodes and partitions you should use? Have you heard terms like “legacy SCC/HLRN account” and are unsure of what that means or if it applies to you? Then this page is for you.

User Account Types

Our HPC cluster is made up of several node groups and cluster islands, funded by different institutions (see Cluster Overview for details). As we transition to a new project management system, you may encounter a confusing mix of account types, especially if you have more than one. The most important distinction is between Project Portal accounts and older, so-called legacy accounts. In addition, depending on your institutional affiliation, you are allowed to use different parts of our system, and there are corresponding differences between account types.

Project Portal accounts

If your username looks like u12345 (u and a five-digit number), your user account is a Project Portal account. You should have received an email from “GWDG HPC Project Portal (hpc-project-service@gwdg.de)” when you were added to the respective project. It is important to keep in mind that these are project-specific HPC accounts. They are only valid on the HPC cluster itself and are associated with / owned by an AcademicID. Within the HPC systems, each of your project-specific users (if you are a member of multiple projects) is a separate, full user account with its own data stores, its own quotas and allocation of resources it can use.

The AcademicID username is your primary username, which you use for other services provided by GWDG or your university. It is used to log in to the Project Portal, JupyterHPC, Chat AI, or any other service that does not require SSH access. Your AcademicID username cannot be used to connect to the HPC cluster via SSH, unless it is a full employee account of the University of Göttingen, MPG, or GWDG that has been separately activated as a legacy SCC account (see below).

When you log into the Project Portal, you will see an overview of projects you are a member of:

Example screenshot of the HPC Project Portal showing a list of a users' projects.

Project overview

On each project’s page, you will then see a number of links at the top detailing the tree structure your project is sorted under, starting with Projects.

Example screenshot of a portal project's list of links detailing its path in the project tree.

Project details hierarchy

This prefix tells you your account type:

Project Tree Prefix | Account Type
Projects / Extern / CIDAS | treated like NHR
Projects / Extern / EFRE-REACT GPU-Cluster für Maschinelles Lernen | REACT
Projects / Extern / KISSKI | KISSKI
Projects / Extern / NHR-NORD@Göttingen | NHR
Projects / Extern / Research Units | treated like NHR
Projects / Extern / Wirtschaftlicher Betrieb | treated like KISSKI
Projects / Scientific Compute Cluster (SCC) | SCC

“Legacy” accounts

  • If your username starts with 3 letters that you could not choose, two of which are short for the German federal state of your university (followed by 5 letters/digits that you chose), you are a legacy NHR/HLRN user.
    • Examples: nimjdoe, hbbmustr, bemhans1, mvilotte, …
  • If your username is just your name, either the first letter of your first name followed by your last name, firstname.lastname, or just your last name (possibly followed by a number), you are most likely a legacy SCC user.
    • Your regular GWDG-, Uni Göttingen employee-, UMG- or MPG-account, if it was ever activated for HPC usage, is considered a legacy SCC user account.
    • Examples: If your name is John Doe, possible usernames are john.doe2, jdoe, doe15, …

File/Directory Access from multiple User Accounts

When you have multiple user accounts, each of them is completely separate and by default unable to see or access files and directories belonging to the others. To see what storage locations are assigned to your current user, run show-quota. Environment variables like $HOME, $WORK and the hidden directory/symlink .project in your home directory (which contains more symlinks) also point to directories you can use. See the Data Migration Guide to learn how to copy/move files between your different users or configure directories to be accessible from all of them.
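
For example, a quick way to check which storage locations your current user can use (a minimal sketch; the exact output and paths differ per account):

show-quota              # list the storage locations and quotas of the current user
echo $HOME $WORK        # home and work/scratch directories of the current user
ls -l ~/.project/       # symlinks to project directories (if present)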

Overview

Type of user | Login nodes¹ | Home filesystem | Scratch² | Partitions you can use³
Portal NHR | glogin, glogin-gpu | vast-nhr | lustre-mdc, lustre-grete | medium96s*, standard96(s)*, large96(s)*, huge96(s)*, grete*, jupyter*
Portal SCC | glogin, login-mdc, transfer-scc | vast-standard | scratch-scc | medium, scc-cpu, scc-gpu, jupyter*
KISSKI | glogin-gpu | vast-kisski | vast-kisski | kisski, kisski-h100, grete:interactive, jupyter*
REACT | glogin-gpu | vast-react | vast-react | react, grete:interactive, jupyter*
Legacy NHR/HLRN | glogin, glogin-gpu | vast-nhr | scratch-rzg | medium96s*, standard96(s)*, large96(s)*, huge96(s)*, grete*, jupyter*
Legacy SCC | login-mdc, transfer-scc | StorNext | scratch-scc | medium, jupyter

[1]: See Logging In for more details
[2]: See Storage Systems for more details
[3]: See Compute Partitions for more details

Info

See CPU Partitions and GPU Partitions for more information on the available partitions for your account.

Connecting (SSH)

There are several ways to log in to and use the GWDG clusters. The most common way is using an SSH (Secure SHell) client to open a terminal on the cluster, providing a command line interface. Various other tools, such as many IDEs (VSCode, Emacs, etc.) and file transfer programs (e.g. WinSCP), can connect to the cluster using SSH under the hood and thus require SSH to be set up.

Terminal Access via SSH

The terminal is the traditional and one of the most common methods of working on HPC clusters, providing an interface where one types commands and the output is printed to the screen just like in the image below. Follow the instructions for Installing SSH Clients followed by Generating SSH Keys, Uploading SSH Keys, and Configuring SSH.

Example terminal with a user SSH-ing into gwdu101.gwdg.de, decrypting their SSH key, and running 'ls -1 /scratch'.

Example Terminal

Example terminal with a user SSH-ing into gwdu101.gwdg.de, decrypting their SSH key, and running ls on the /scratch directory.

IDE Access via SSH

Many IDEs can edit files directly on the cluster, run commands on the cluster, and/or provide their own terminal interfaces over SSH. Some are bundled with an SSH client builtin, while others require it to be installed just like for terminal access (see Installing SSH Clients for instructions). Many support using the same config file as the OpenSSH client in order to configure access to the frontend nodes (~/.ssh/config), as well as SSH keys. See Generating SSH Keys, Uploading SSH Keys and Configuring SSH to configure SSH access.

Subsections of Connecting (SSH)

Installing SSH Clients

Instructions for installing the most popular SSH clients on various operating systems.

Linux

On Linux, your desktop environment almost always has a terminal, typically with a name that includes a word like “term”, “terminal”, “console”, or some alternate spelling of them (e.g. Konsole on KDE). The OpenSSH client is usually already installed. To check if it is, pull up a terminal and see what the following command returns:

ssh -V

If it prints something like OpenSSH_9.2p1 [...], it is already installed. Otherwise, use your package manager to install it; it is usually called openssh-client, openssh-clients, or openssh depending on your Linux distribution. The respective command to install it from the terminal is given for several popular distributions:

install ssh client:
Debian/Ubuntu: sudo apt install openssh-client
Fedora:        sudo dnf install openssh-clients
RHEL/CentOS:   sudo yum install openssh-clients
Arch Linux:    sudo pacman -S openssh

Mac

macOS already has a terminal and the OpenSSH client installed, so nothing more has to be done. The builtin terminal program’s name is Terminal. If you need X11 forwarding in your terminal, you will additionally need to install and use XQuartz. If you are looking for a very powerful terminal emulator, check out iTerm2.

Windows

There are 3 popular options, each detailed below. Note that only MobaXterm provides X11 forwarding for running remote applications with a GUI.

OpenSSH (Windows 10 or newer)

The already installed PowerShell (or the classic cmd.exe) provides the terminal. They should be listed in the Start menu. To check if OpenSSH is installed, run

ssh -V

which will print the OpenSSH client’s version if it is present, and fail if it is not installed. If it is not installed, re-run PowerShell as an administrator (right click on its Start menu entry to see this option) and install it with

Add-WindowsCapability -Online -Name OpenSSH.Client~~~~0.0.1.0

Then confirm that it works with

ssh -V

Additional instructions can be found in Microsoft’s documentation.

Please see the SSH Troubleshooting section if you encounter problems.

MobaXterm

MobaXterm is a popular SSH client and terminal combination supporting tabs and X11 forwarding. Go to its website to download and install it.

PuTTY

PuTTY is another popular SSH client and terminal combination. Go to its website to download and install it.

Generating SSH Keys

After making sure your SSH client is installed, the next step is to generate your SSH keys. SSH keys are used to authenticate your client to the cluster frontend nodes instead of a password, and to establish the encrypted session. After generating the keys, you have to upload your public key so you can authenticate to the cluster in the future.

SSH Key Basics

An SSH key has two parts: the public key that you can share, and the private key which you must protect and never share. In all clients other than PuTTY (and those that use it), the two parts have the same filename except that the public key file has the added extension .pub (for public). In PuTTY, the private key has the extension .ppk and the public key can be saved with an arbitrary extension. The best ways to protect the private key are to either encrypt it or to store it on a security key/card with a PIN.

Warning

It is CRITICAL that you protect your SSH private key. It should never leave your computer/device! Anyone with the private key and the means to decrypt it (assuming you encrypted it at all) can impersonate you to the cluster to corrupt or delete your files, use up all your compute time, and/or cause problems in your name.

You should always encrypt your SSH private keys or store them on a security key/card with a PIN. Do not copy them onto USB thumb drives, external hard disks, etc.! Do not send them to other people or upload them anywhere! Remember it violates our Terms of Use to share your account or private SSH key(s) with other persons. If you think someone might have gotten hold of your private key (stolen device, etc.), please immediately delete them from your HPC account! If you need help, do not hesitate to contact our support.

Generate SSH Key

If your client supports it, you should generate an ed25519 key since it is both fast and secure. You can also use ed25519-sk if you have, and want to use, a compatible FIDO2 security key for 2FA (Two-Factor Authentication). Otherwise, you should create a 4096-bit rsa key, which is a universally supported and safe, but larger and slower, fallback. See the FAQ page on keys for more information on these keys. Instructions for generating keys with several clients are given below.

MobaXterm

  1. Open MobaXterm
  2. Click “Start local terminal server”
  3. Generate an SSH key the same way as for an OpenSSH client following the instructions below.

OpenSSH in Terminal (Linux, Mac, Windows PowerShell)

To generate a key with the filename KEYNAME (traditionally, one would choose ~/.ssh/id_NAME where NAME is a convenient name to help keep track of the key), you should run the following in your terminal on your local machine

generate key with OpenSSH:
ed25519 (recommended): ssh-keygen -t ed25519 -f KEYNAME
rsa (fallback):        ssh-keygen -t rsa -b 4096 -f KEYNAME

and provide a passphrase to encrypt the key with. Choose a secure passphrase to encrypt your private key that you can remember but others cannot figure out! Then the file ~/.ssh/id_NAME is your private key and ~/.ssh/id_NAME.pub is your public key. Your terminal should look something like the following:

foo@mylaptop:~> ssh-keygen -t ed25519 -f ~/.ssh/id_test
Generating public/private ed25519 key pair.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/foo/.ssh/id_test
Your public key has been saved in /home/foo/.ssh/id_test.pub
The key fingerprint is:
SHA256:54NGZLI2MQSowPoqviWFDlJ5S5KHtDwcbjPdHHaEtUY foo@mylaptop
The key's randomart image is:
+--[ED25519 256]--+
|. o...++E        |
|.*.B =.+ .       |
|o./ = * =        |
|oo.O . O         |
|oo .. + S .      |
|+ o  . o +       |
| + .    o o      |
|o o    .   .     |
|oo.              |
+----[SHA256]-----+
foo@mylaptop:~>

PuTTY

SSH keys are generated by the program PuTTYgen. To create a key pair, follow these steps:

  1. Open PuTTYgen
  2. Set the key parameters at the bottom to either “EdDSA” and “Ed25519 (255 bits)” for ed25519 (best) or “RSA” and “4096 bits” for rsa (fallback).
  3. Click Generate and follow the instructions to generate the key
  4. Enter a passphrase to encrypt the key. Choose a secure passphrase that you can remember but others cannot figure out!
  5. Click both Save private key and Save public key to save the respective key to disk. Note that PuTTY uses a different storage convention than other clients.
Screenshot of PuTTYgen creating an ed25519 key.

PuTTYgen ed25519

Generating an ed25519 key with PuTTYgen.

Uploading SSH Keys

Now that you have generated your SSH keys, you should upload the public key (NOT the private key) to the authentication system.

Copy SSH Key

First, get your public key into the text form that will be uploaded and copy it. The text you will upload should look something like:

Example public key:
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIEgOP7sQ2YydiyHVjFVCzBcX20lM10U0wPKNtY9sUu8q foo@mylaptop
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDMd+BFyyJ1d8bKznh8FmKhguSLN/6x2B3mK1zxjiw9Rq5KA4dPGNGULLUhwEd1oracXdi/sMbK9ZduCYt7L19r/K2v4OS9T4IGdMYWR1EbIvFkAvgnnZ2Mo0zDpQtDEn/sbJXbgfexc+ymSXAehX6cSn+7Hq8beBX0OFfcIyHPm15VeI0S0sxn5SEjw78lFiNRFlFWphtGflr7OLQUugXBNqailduGWswbpXBp622sdjI3Cj7DHnJh/7GQhAuGXvAedDEcPrRjYXPIY4nQ3KxtevPDwye4+PBO97FwAd02QQXOrenFhXqbAuG0+rEtCWii/ETKB1BYWmXW+atyMdKT18+nW34VqOeNDoQuwXwri1TrqYo4XwjKB33rgAwsYdUVBVIGxc8kyrSsGEiLHKgl18A6OST0ATKXmuhNBt6PjTrU/tV80YAZD2ACGmhyB2plR+p/mVURcv7KuHyclJ/dnk7i/5Iw4PMQBfvALuUho9sIk9EX09P+zvC8TpcTS5gesYJGvocuIF+iVjyQQmUqKquR/5C+mUhtJ2Iqliks3mIkVzzzoEPNWc/Hz50Iu0VeW/YggZhllFHjzeAinE+yl2lpQzD+1bFgruXSiAkiN/QUu3WEI/HDX4g5vRurBKd/wzK201ygOm0wYCthJzQKKM7R5TZYHitXC05omf35Pw== foo@mylaptop

If your SSH key is stored in the same style as OpenSSH (whether made by OpenSSH, MobaXterm, etc.), open your public key file in a plaintext editor such as Kate or Gedit (Linux), TextEdit (Mac), Notepad (Windows), VSCode, Vim, Emacs, etc. From there, copy your public key to the clipboard so you can paste it for uploading.

Warning

Do NOT use a document editor like LibreOffice or Microsoft Office. They can add invisible formatting characters, non-ASCII characters, newlines, etc. which can make the copied key invalid and will prevent you from logging in.
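
Alternatively, you can print the public key in a terminal and copy it from there, or pipe it straight to the clipboard. This is a minimal sketch assuming the default key path ~/.ssh/id_ed25519.pub and that the listed clipboard tools are available on your system:

cat ~/.ssh/id_ed25519.pub                                  # print the public key, then copy it manually
xclip -selection clipboard < ~/.ssh/id_ed25519.pub         # Linux (requires xclip)
pbcopy < ~/.ssh/id_ed25519.pub                             # macOS
Get-Content $env:USERPROFILE\.ssh\id_ed25519.pub | clip    # Windows PowerShell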

If you generated your SSH key with PuTTYgen, open the .ppk file in PuTTYgen by clicking Load. The text to copy is highlighted in the screenshot below

Screenshot of PuTTYgen with the public key text selected.

Public key in PuTTYgen

Selecting the public key text in PuTTYgen to copy it.

Upload Key

How you upload your SSH public key to your account depends on the kind of account you have. To determine your upload method, consult the table below and then go to the respective section:

Account Description | Upload Method
HLRN account | HLRN Upload
NHR accounts created before 2024/Q2 | HLRN Upload
Georg-August University account | Academic Cloud Upload
MPG account | Academic Cloud Upload
Other academic institution serviced directly by GWDG | Academic Cloud Upload
Other institution part of the federated login | Academic Cloud Upload
Project Portal account | Academic Cloud Upload
Other SCC account | Academic Cloud Upload
KISSKI account | Academic Cloud Upload

Academic Cloud Upload

First, go to Academic Cloud, which is shown in the screenshot below:

Screenshot of the Academic Cloud landing page at https://academiccloud.de.

Academic Cloud Landing Page

Location to login to in order to upload SSH key.

Click “Login” and then enter either your primary email address with GWDG (for academic institutions serviced directly by GWDG, like the MPG and Georg-August University, this is your institutional email address), your institution if it is part of the federated login, or the email address you used when you previously created an account on this page, as shown in the screenshot below. Then enter your password and complete any 2FA/MFA you have enabled.

Screenshot of the Academic Cloud login page.

Academic Cloud Login Page

Enter login email address, password, and 2FA/MFA (if used) here to login.

After you have logged in, you will get a screen like the one below. Click on the option menu next to your name at the top-right and select “Profile” to get to your account settings.

Screenshot of the Academic Cloud main page after login, with the user's information blacked out.

Academic Cloud Main Page

Main page after logging in (user’s information blacked out).

In the Account settings, you will see information about your account like your username on the cluster, Academic ID number, email addresses you have registered, etc. as shown in the screenshot below. Click “Security” in the left navigation menu (pointed at by the red arrow).

Screenshot of the Academic Cloud account settings page with user's information blacked out.

Academic Cloud Account Settings

Account settings page (user’s information blacked out), with a red arrow indicating how to get to the security panel.

In the security panel, towards the bottom, there is a box for your SSH Public Keys as shown in the screenshot below. The SSH public keys you have uploaded so far are listed here. Click on the “ADD SSH PUBLIC KEY” button pointed at by the red arrow to add your key. Then, in the window that pops up (center of the screenshot), paste the key you copied earlier and optionally add a suitable comment. Click “ADD” to upload it.

Pop-up window requesting an SSH public key. It includes a text field labeled ssh-ed25519 AAAAC3N... and a smaller field for a comment. Buttons labeled Cancel and Add are present at the bottom of the window. Text below the fields explains that the key should consist of at least three parts: algorithm, key, and comment.
Note

It can take a few minutes until the key is synchronized with the systems. If the login does not work immediately, please try again later.

HLRN Upload

If you have an old HLRN username (for example nibjdoe or beijohnd), you can log in to AcademicID with the email <HLRN username>@hlrn.de. Your account does not have a password set by default; please use the password reset function of AcademicCloud. Also use <HLRN username>@hlrn.de as the address to send the new password to. It will be forwarded to the real email address you used to register the original HLRN account; directly specifying your real email won’t work!

Afterwards you can upload your SSH key just like other AcademicCloud users. Please note that your account has a different username in AcademicCloud, which you will see under AcademicID, but it is not relevant or used anywhere else.

If you want to use the NHR@ZIB systems in Berlin that are not run by GWDG, please visit https://portal.nhr.zib.de/

Deleting lost/stolen SSH keys

To delete an old key that you don’t need any more, have lost or that might have been compromised, start by following the same steps as detailed above. In the last step, at the point where you can add new keys, you will see a list of your existing ones with a trash can icon next to each (for example the login key in the last screenshot). Click it and confirm the deletion in the following dialog.

Configuring SSH

OpenSSH, sftp, rsync, VSCode, …

The OpenSSH configuration file is a plain text file that defines Hosts: short, easy-to-type names with their corresponding full DNS names (or IPs), advanced configuration options, which private key to use, etc. This file is picked up and read by multiple SSH clients and other applications that use the SSH protocol to connect to the cluster. Most clients let you override the defaults in this file for any given login.

Config file location

It is usually located in your home directory or user profile directory under .ssh/config, i.e.

OpenSSH config location
Linux:   ~/.ssh/config  (i.e. /home/$USERNAME/.ssh/config)
macOS:   ~/.ssh/config  (i.e. /Users/$USERNAME/.ssh/config)
Windows: %USERPROFILE%\.ssh\config  (i.e. C:\Users\your_username\.ssh\config)

OpenSSH config file format

After general options that apply to all SSH connections, you can define a line Host my_shorthand, followed by options that will apply to this host only, until another Host xyz line is encountered. See the OpenSSH documentation for a full list of available options.

Simple configuration examples

Here are different blocks of configuration options you can copy-paste into your config file. You can mix and match them; choose the ones you need. Just make sure to replace User xyz with your respective username and, if necessary, the path/filename of your private key.

Emmy / Grete (NHR and SCC)

While “Emmy” was originally the name of our NHR/HLRN CPU cluster, we will sometimes refer to it as a generic term for all our CPU partitions. For example, the scc-cpu partition is integrated with Emmy Phase 3 from a technical standpoint, thus Emmy-p3 is the recommended login for SCC users. The same applies to “Grete”, which can be used as the generic login if you want to use any of our GPU partitions.

Host Emmy
	Hostname glogin.hpc.gwdg.de
	User u12345
	IdentityFile ~/.ssh/id_ed25519

Host Emmy-p2
	Hostname glogin-p2.hpc.gwdg.de
	User u12345
	IdentityFile ~/.ssh/id_ed25519

Host Emmy-p3
	Hostname glogin-p3.hpc.gwdg.de
	User u12345
	IdentityFile ~/.ssh/id_ed25519

Host Grete
	Hostname glogin-gpu.hpc.gwdg.de
	User u12345
	IdentityFile ~/.ssh/id_ed25519

KISSKI

Host KISSKI
	Hostname glogin-gpu.hpc.gwdg.de
	User u56789
	IdentityFile ~/.ssh/id_ed25519

Legacy SCC

The legacy SCC login nodes are not reachable from outside GÖNET. You can use the generic login nodes as jumphosts to avoid needing a VPN if you want to connect from outside.

Host jumphost
	Hostname glogin.hpc.gwdg.de
	User jdoe1
	IdentityFile ~/.ssh/id_ed25519

Host SCC-legacy
	Hostname login-mdc.hpc.gwdg.de
	User jdoe1
	IdentityFile ~/.ssh/id_ed25519
	ProxyJump jumphost
Tip

You can leave out the jumphost block and ProxyJump jumphost if you always connect from within GÖNET (basically the extended campus network in Göttingen) or use VPN.

Advanced configuration examples

Here are more advanced example configurations for all our clusters with convenient shortcuts for advanced users. Adapt them to your needs. To understand this, remember that ssh parses its config top-down, first-come-first-served.

That means, if the same option appears multiple times for a given Host, the first one counts and subsequent values are ignored. But options that were not defined earlier will take their values from later appearances. Note that if Hostname is not specified, the Host value (%h) is used.

All frontend nodes, individually specified

# NHR, SCC Project-Portal and KISSKI users
Host glogin*
	Hostname %h.hpc.gwdg.de
	User u12345
	IdentityFile ~/.ssh/id_ed25519


# Use the main login nodes as jumphosts to restricted login nodes
Host jumphost
	Hostname glogin.hpc.gwdg.de
	User jdoe1
	IdentityFile ~/.ssh/id_ed25519

# legacy SCC login nodes
Host gwdu101 gwdu102 login-mdc
	Hostname %h.hpc.gwdg.de
	User jdoe1
	IdentityFile ~/.ssh/id_ed25519
	ProxyJump jumphost

# CIDBN
Host login-dbn*
	Hostname %h.hpc.gwdg.de
	User u23456
	IdentityFile ~/.ssh/id_ed25519
	ProxyJump jumphost

# Cramer/Soeding
Host ngs01
	Hostname ngs01.hpc.gwdg.de
	User u34567
	IdentityFile ~/.ssh/id_ed25519
	ProxyJump jumphost

Complex example using config file tricks

Host myNHRproject1
	User u10123
Host myNHRproject2
	User u10456
Host NHR myNHRproject1 myNHRproject2
	Hostname glogin.hpc.gwdg.de
	IdentityFile ~/.ssh/id_ed25519
Host kisski
	Hostname glogin-gpu.hpc.gwdg.de
	User u10789
	IdentityFile ~/.ssh/id_ed25519
Host glogin* glogin-gpu glogin-p3
	Hostname %h.hpc.gwdg.de
	User nimjdoe
	IdentityFile ~/.ssh/id_ed25519

Host jumphost
	Hostname glogin.hpc.gwdg.de
Host SCC-legacy
	Hostname login-mdc.hpc.gwdg.de
Host SCC-legacy gwdu101 gwdu102 jumphost
	Hostname %h.gwdg.de
	User jdoe1
	IdentityFile ~/.ssh/id_ed25519
Host SCC-legacy gwdu101 gwdu102
	ProxyJump jumphost

Logging In

Here, you’ll find DNS names for our login nodes, sorted by cluster, their hostkey fingerprints and some examples showing how you can connect from the command line.

The Login Nodes

Names and Aliases

The proper DNS names of all login nodes grouped by cluster island and their aliases are provided in the table below. General CPU node islands are called “Emmy Phase X” where X indicates the hardware generation (1, 2, 3, …). General GPU node islands are called “Grete Phase X” where X indicates the hardware generation (1, 2, 3, …). Other islands exist for specific institutions/groups or historical reasons (e.g. SCC Legacy). It is best to use the login nodes for the island you are working with. Other login nodes will often work (assuming you have access), but may have access to different storage systems and their hardware will be less of a match. For square brackets with a number range, substitute any number in the range.

Island | Login node(s) | Aliases
Emmy Phase 2 | glogin[4-8].hpc.gwdg.de | glogin-p2.hpc.gwdg.de, glogin.hpc.gwdg.de, glogin.hlrn.de (deprecated), glogin[4-8].hlrn.de (deprecated)
Emmy Phase 3 | glogin[11-13].hpc.gwdg.de | glogin-p3.hpc.gwdg.de
Grete (all phases) | glogin[9-10].hpc.gwdg.de | glogin-gpu.hpc.gwdg.de
SCC Legacy | gwdu[101-102].hpc.gwdg.de | login-mdc.hpc.gwdg.de, login-mdc[1-2].hpc.gwdg.de, gwdu[101-102].gwdg.de (deprecated)
CIDBN (restricted) | login-dbn[01-02].hpc.gwdg.de | login-dbn.hpc.gwdg.de
FG/SOE (restricted) | ngs01.hpc.gwdg.de | -
Note

The login nodes marked as restricted in the table above are restricted to specific research institutions/groups.

Aliases marked as deprecated will eventually disappear and stop working.

Warning

Legacy SCC users do not have HOME directories on the Emmy and Grete phases and thus should not login to those nodes (using them as a jumphost is fine).

Non-SCC users do not have HOME directories on the SCC Legacy island and thus should not login to those nodes.

See Types of User Accounts if you are unsure what kind of account you have.

Warning

The SCC Legacy login nodes are not reachable directly from outside GÖNET if you are not using VPN. You may have to use the NHR login nodes as jumphosts with the -J command line switch or the ProxyJump option.

Info

Use the login nodes for the island of compute nodes you intend to use in order to get a closer match of the CPU architecture and to have the same default software stack as their respective compute nodes. Compilers, by default, will either compile for the most generic version of the host’s architecture (poor performance) or for the exact CPU the host is running (which could then crash and/or have sub-optimal performance on a compute node with a different CPU architecture). You can compile software on the compute nodes themselves (in a job) or on the login nodes.
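
As an illustration (not an official build recipe), when compiling on an Emmy Phase 3 login node for the Sapphire Rapids compute nodes, you could pin the target architecture explicitly instead of relying on the compiler default; -march=sapphirerapids needs a reasonably recent GCC (11 or newer), and myprog.c is a placeholder source file:

# Targets the CPU of the host you compile on; fine if you compile on the island you run on
gcc -O2 -march=native -o myprog myprog.c
# Explicitly targets Sapphire Rapids (e.g. Emmy Phase 3 / Grete Phase 3 nodes)
gcc -O2 -march=sapphirerapids -o myprog myprog.c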

Info

See CPU Partitions and GPU Partitions for the available partitions in each island for each kind of account.

See Software Stacks for the available and default software stacks for each island.

See Cluster Storage Map for the storage systems accessible from each island and their relative performance characteristics.

SSH key fingerprints

When first connecting, SSH will warn you that the host’s key is unknown and ask you to make sure that it is correct, in order to avoid man-in-the-middle attacks. The following table contains all SHA-256 fingerprints of the host keys of the login nodes and jumphosts, arranged by key algorithm:

Hostkey fingerprints
Key algorithm | Node(s) | SHA-256 fingerprint
ed25519 | gwdu[101-102].hpc.gwdg.de, login-dbn[01-02].hpc.gwdg.de, glogin[1-13].hpc.gwdg.de, ngs01.hpc.gwdg.de | SHA256:PPK0aO2QZ/k4duUx18Pp5AOKG/gFEBHgw/bl8vg9oJk
rsa | gwdu[101-102].hpc.gwdg.de, login-dbn[01-02].hpc.gwdg.de, glogin[1-13].hpc.gwdg.de, ngs01.hpc.gwdg.de | SHA256:EJyZLROEobVuCm2hSeEhcAIEB80PbZ85U4u4XNnvM4k
Note

The SSH key fingerprints are not the public keys themselves, but a checksum of them. Fingerprints are a compact, fixed-size way to verify SSH keys, as the keys themselves can be large enough to be unwieldy. The lines in ~/.ssh/known_hosts on your client machine store the SSH public keys of each known server whose key has been accepted, rather than the fingerprints. Do not copy and paste SSH key fingerprints into ~/.ssh/known_hosts.

Note

When you connect to a login node for the first time or when its SSH key changes, your SSH client will show you the new fingerprint (and the old one if it changed). If it is your first time connecting to the node, it will ask you to check the fingerprint and accept it if it’s correct. If it has changed, it will tell you that you need to delete the old key in ~/.ssh/known_hosts. With OpenSSH’s client, it will print the line number of the line to be deleted and often a command you can use to remove the offending line/s more easily. You can also use ssh-keygen -R to remove outdated hostkeys.

For example: ssh-keygen -R glogin.hpc.gwdg.de
(You might have to repeat the command with any other combination of aliases and hostnames you have used in the past, i.e. glogin9.hpc.gwdg.de, glogin.hlrn.de, etc.)

After that, you’ll be able to confirm the hostkey on your next login with the fingerprints from the table above.

Example Logins with OpenSSH

Logging into an Emmy Phase 3 login node

With the .ssh/config file setup as in the Simple configuration examples, you can run just ssh Emmy-p3. These are the recommended login nodes for SCC users and NHR users who use the Sapphire Rapids nodes (e.g. the standard96s partition). As an SCC user, your terminal session could look like

jdoe1@laptop:~> ssh Emmy-p3
Enter passphrase for key '/home/jdoe1/.ssh/id_ed25519':
Loading software stack: gwdg-lmod
Found project directory, setting $PROJECT_DIR to '/projects/scc/GWDG/GWDG_AGC/scc_agc_test_accounts/dir.project'
 __          ________ _      _____ ____  __  __ ______   _______ ____
 \ \        / /  ____| |    / ____/ __ \|  \/  |  ____| |__   __/ __ \
  \ \  /\  / /| |__  | |   | |   | |  | | \  / | |__       | | | |  | |
   \ \/  \/ / |  __| | |   | |   | |  | | |\/| |  __|      | | | |  | |
    \  /\  /  | |____| |___| |___| |__| | |  | | |____     | | | |__| |
  ___\/__\/   |______|______\_____\____/|_|__|_|______|    |_|  \____/
 |__   __| |  | |  ____|  / ____|/ ____/ ____|
    | |  | |__| | |__    | (___ | |   | |
    | |  |  __  |  __|    \___ \| |   | |
    | |  | |  | | |____   ____) | |___| |____
    |_|  |_|  |_|______| |_____/ \_____\_____|

 Documentation:  https://docs.hpc.gwdg.de
 Support:        hpc-support@gwdg.de

PARTITION    NODES (BUSY/IDLE)     LOGIN NODES
medium             95 /    0     login-mdc.hpc.gwdg.de
scc-cpu            49 /    0     glogin-p3.hpc.gwdg.de
Slurm load last updated 11.8 seconds ago
Your current login node is part of glogin-p3
u12345@glogin11 ~ $

If you are not using a configured Host entry (not recommended) or want to know how to connect “manually” (useful for troubleshooting), you can run:

ssh u12345@glogin-p3.hpc.gwdg.de -i ~/.ssh/id_ed25519

Make sure to replace u12345 with your actual username and ~/.ssh/id_ed25519 with the path to your private key (if you have not used the default).

Logging into a Grete (GPU) login node

With the .ssh/config file setup as in the Simple configuration examples, you can run just ssh Grete. As an NHR user, your terminal session could look like

jdoe1@laptop:~> ssh Grete
Enter passphrase for key '/home/jdoe1/.ssh/id_ed25519':
Loading software stack: gwdg-lmod
Found scratch directory, setting $WORK to '/scratch/usr/nimjdoe'
Found temporary files directory, setting $TMPDIR to '/scratch/tmp/nimjdoe'
 __          ________ _      _____ ____  __  __ ______   _______ ____
 \ \        / /  ____| |    / ____/ __ \|  \/  |  ____| |__   __/ __ \
  \ \  /\  / /| |__  | |   | |   | |  | | \  / | |__       | | | |  | |
   \ \/  \/ / |  __| | |   | |   | |  | | |\/| |  __|      | | | |  | |
    \  /\  /  | |____| |___| |___| |__| | |  | | |____     | | | |__| |
  _  \/ _\/  _|______|______\_____\____/|_|  |_|______|____|_|__\____/
 | \ | | |  | |  __ \     ____    / ____\ \        / /  __ \ / ____|
 |  \| | |__| | |__) |   / __ \  | |  __ \ \  /\  / /| |  | | |  __
 | . ` |  __  |  _  /   / / _` | | | |_ | \ \/  \/ / | |  | | | |_ |
 | |\  | |  | | | \ \  | | (_| | | |__| |  \  /\  /  | |__| | |__| |
 |_| \_|_|  |_|_|  \_\  \ \__,_|  \_____|   \/  \/   |_____/ \_____|
                         \____/

 Documentation  https://docs.hpc.gwdg.de   Support nhr-support@gwdg.de
PARTITION    NODES (BUSY/IDLE)     LOGIN NODES
grete:shared       53 /    9     glogin-gpu.hpc.gwdg.de
grete-h100          4 /    0     glogin-gpu.hpc.gwdg.de
Your current login node is part of glogin-gpu
[nimjdoe@glogin9 ~]$

If you are not using a configured Host entry (not recommended) or want to know how to connect “manually” (useful for troubleshooting), you can run:

ssh u12345@glogin-gpu.hpc.gwdg.de -i ~/.ssh/id_ed25519

Make sure to replace u12345 with your actual username and ~/.ssh/id_ed25519 with the path to your private key (if you have not used the default).

Logging into a legacy SCC login node

With the .ssh/config file set up as in the Simple configuration examples, you can just run ssh SCC-legacy. Your terminal session could look something like

jdoe1@laptop:~> ssh SCC
Enter passphrase for key '/home/jdoe1/.ssh/id_ed25519':
Last login: Tue May 20 16:27:44 2025 from 10.45.112.11
Loading software stack: gwdg-lmod
 __          ________ _      _____ ____  __  __ ______   _______ ____
 \ \        / /  ____| |    / ____/ __ \|  \/  |  ____| |__   __/ __ \
  \ \  /\  / /| |__  | |   | |   | |  | | \  / | |__       | | | |  | |
   \ \/  \/ / |  __| | |   | |   | |  | | |\/| |  __|      | | | |  | |
    \  /\  /  | |____| |___| |___| |__| | |  | | |____     | | | |__| |
  ___\/__\/   |______|______\_____\____/|_|__|_|______|    |_|  \____/
 |__   __| |  | |  ____|  / ____|/ ____/ ____|
    | |  | |__| | |__    | (___ | |   | |
    | |  |  __  |  __|    \___ \| |   | |
    | |  | |  | | |____   ____) | |___| |____
    |_|  |_|  |_|______| |_____/ \_____\_____|

 Documentation:  https://docs.hpc.gwdg.de
 Support:        hpc-support@gwdg.de

PARTITION    NODES (BUSY/IDLE)     LOGIN NODES
medium             94 /    0     login-mdc.hpc.gwdg.de
scc-cpu           236 /   79     glogin-p3.hpc.gwdg.de
scc-gpu            21 /    5     glogin-gpu.hpc.gwdg.de
Your current login node is part of login-mdc
[scc_project] u12345@gwdu101 ~ $

When using a jumphost, you may have to enter the passphrase for your private key twice (once for the jumphost and once for the actual login node).

If you are not using a configured Host entry (not recommended) or want to know how to connect “manually” (useful for troubleshooting), here is how you would do so (using a jumphost):

ssh jdoe1@login-mdc.hpc.gwdg.de -i .ssh/id_ed25519 -J jdoe1@glogin.hpc.gwdg.de

Or, if you are using an older SSH client:

ssh jdoe1@login-mdc.hpc.gwdg.de -i .ssh/id_ed25519 -o ProxyCommand="ssh -i .ssh/id_ed25519 -W %h:%p jdoe1@glogin.hpc.gwdg.de"

Make sure to replace jdoe1 with your actual username.

Logging into a specific node

This is important if you need to

  • Reconnect to a tmux or screen session
  • Reconnect to a session started by an IDE over SSH
  • Use a dedicated login node for your research group
  • Use a login node with particular hardware

With the .ssh/config file setup as in the Advanced configuration examples for all nodes configured individually, you would run just ssh NODE where NODE is the name of the node or a suitable alias.
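For example, with the advanced configuration above (the usernames and key paths there are placeholders), reconnecting to a particular node could look like this:

ssh glogin11        # a specific Emmy Phase 3 login node
ssh gwdu101         # a specific legacy SCC login node, reached via the jumphost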

Subsections of SSH Troubleshooting

"Corrupted MAC" errors on Windows

The official SSH component of Windows has a bug in the implementation of the umac-128-etm@openssh.com MAC algorithm and prefers this algorithm over others that are not bugged. When connecting to the login nodes, the bug trips the corrupted MAC detection and the error Corrupted MAC on input is reported.

To avoid the issue, add -m hmac-sha2-256-etm@openssh.com to your ssh command line, use a better SSH client, or change your configuration as shown below.

Reduce the priority of the buggy MAC algorithm

Another way is to override the MAC used or change the priority list to de-prioritize (or remove) the buggy MAC. The easiest way to do this persistently is to change your .ssh/config file, which is at %USERPROFILE%\.ssh\config on Windows (usually something like C:\Users\YOUR_USERNAME\.ssh\config). Go to the Host entries for the login nodes and add the option MACs hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,umac-128-etm@openssh.com to them, which will change the priority order to use non-buggy algorithms first but still retain the problematic one in case you upgrade OpenSSH later.

You can also add this to a Host * block to enable it for all hosts. Since the buggy implementation is not disabled, just de-prioritized, it should not cause connection problems with any other servers.
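
A minimal sketch of such a block (narrow the host pattern if you only want it for the login nodes):

Host *
	MACs hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,umac-128-etm@openssh.com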

See the Configuring SSH page for more information on the config file, particularly if you have not made one yet.

It is also possible to override the default MAC when running SSH on the command line with the -m MAC1[,...,MACN] option. Examples would be -m hmac-sha2-256-etm@openssh.com to pick just one, or -m hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,umac-128-etm@openssh.com to specify a list to use in order of priority.

Complete example: ssh -m hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com -i C:\Users\local_username\.ssh\id_ed25519 username@glogin.hpc.gwdg.de

Error: "permission denied"

There are many possible causes for “permission denied” errors, which have to be eliminated one by one. These are covered in the sections below. If the error still persists at the end, you will need to start a support ticket. If possible, first collect verbose output from your SSH client (use the -v option for OpenSSH) and include it in the support ticket. See the Start Here page for where to send your support ticket.
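
One way to capture that output with OpenSSH (a sketch; the host alias and log file name are just examples) is to redirect the debug messages, which go to stderr, into a file:

ssh -vvv Emmy-p3 2> ssh-debug.log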

Check that Your Username Is Correct

It is critical that your client passes the right username. It can be specified in your .ssh/config file for clients that support it, like OpenSSH (see the Configuring SSH page for more information). Other clients, like PuTTY, need it specified in their own configuration formats/dialogs. You can also specify it manually in most command line clients (including OpenSSH) when specifying the host; the common format is USERNAME@HOST.

The question then is which username is the correct one to use. That depends on how your user account works; the subsections below cover each case.

Member of a Project from the Project Portal

Projects in the new Project Portal give a project-specific username to each member. You must use the project-specific username for the particular project you are working on, which was sent to you by an email notification when you were added to the project and can also be looked up in the Project Portal. These usernames have the form uXXXXX where each X is a digit. They use the same SSH key as your Academic Cloud/ID account (see Uploading SSH Keys).

Legacy NHR/HLRN Users (before Project Portal)

Use the username you received when you applied for your NHR/HLRN account. Note that if your project has migrated to the Project Portal and your legacy account has been removed from it in favor of a new project-specific username, you must use that new username instead.

SCC Users before Project Portal

You need to log into the Academic Cloud and get your username. Follow the first few steps on the Upload SSH Key -> Academic Cloud Upload page and get the username listed under account information.

Check that SSH is Using the Right SSH Key

First, check that the SSH key you are giving to your SSH client is the same one you uploaded for your account. The public key should match what you uploaded. See Upload SSH Key for more information. In particular, if you used a document editor for copying your SSH key before pasting it in, you might have to re-upload it but use a plain text editor to do the copying (document editors add formatting characters sometimes).

Make sure the right key is referenced in your .ssh/config file if your client uses it (OpenSSH and many others, but not PuTTY). See Configuring SSH for more information. You can also tell the OpenSSH client which key to use on the command line with the -i KEY argument where KEY is the path to your key.

SCC Users: Check that you can login to a jumphost

Whether you can do this or not is a very useful diagnostic. The jumphosts do not require HPC access to be enabled and are not part of the HPC cluster (different problems, hopefully). For them, even if you have a project-specific username, you should use the primary username you would find by following the first few steps on the Upload SSH Key -> Academic Cloud Upload page and checking the username listed under account information. If you have multiple Academic Cloud accounts, you need to use the one for which you have requested HPC access!

SCC Users before Project Portal Must Request Access

If you are an SCC user who has not been added to a project in the new Project Portal, you must request HPC access be enabled for your account if you have not previously done so. Start a support ticket by sending an email to hpc-support@gwdg.de. Don’t forget to include your username in the email. See Account Activation for more information.

SCC Users using the jumphosts

With the jumphosts, it is critical that you set up your client to use them as jumphosts. SSH-ing manually into a jumphost and then trying to SSH into a login node will not work, as your private key is not on the jumphost itself (and shouldn’t be). See Configuring SSH and Logging In for more information on the jumphost configuration.


Warning: unprotected key file

The login fails with a message like the following:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0777 for '.ssh/id_rsa' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
jdoe1@glogin10.hpc.gwdg.de: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,hostbased).

This indicates that the file permissions of your private key file are too open. It has nothing to do with the passphrase set for the SSH key. The solution is to make the private key readable only by your own user with: chmod 600 <path/to/your-key>

You can verify the result by either directly trying to log in again, or use ls -l <path/to/your-key>:

jdoe1@laptop:~> ls -l ~/.ssh/id_rsa
-rw------- 1 jdoe1 jdoe1 1766 Jun 13  2031 /home/jdoe1/.ssh/id_rsa

The Unix file permissions -rw------- show that the file is only readable and writable for its owner, the user jdoe1.

Windows Subsystem for Linux (WSL)

If you tried changing the file permissions and you still cannot connect, please verify with ls -l <path/to/your-key> that your file permissions are correct. If you use WSL and you are working with Windows data (i.e. your working directory is somewhere in the Windows file system, not the WSL file system), it may not be possible to change the permissions.

In that case, please copy the key to the WSL file system with:

mkdir -p ~/.ssh/
cp <path/to/your-key> ~/.ssh/
chmod 600 ~/.ssh/<your-key>
ls -l ~/.ssh/<your-key>

The last command is to verify the permissions of the key.

Subsections of SSH FAQ

Is password authentication supported?

No, it is not because password based SSH authentication is less secure than public key based authentication.

With password based authentication, your machine must send the password to the SSH server. This means that if an attacker compromised the login node, they could capture the passwords of anyone attempting to log in, which would be VERY BAD!

With public key based authentication, the HPC login nodes only ever get your public key, while the private key stays secret on your machine and is never transferred. This means that if an attacker compromised the login node, they would get NEITHER your password nor your SSH private key. Such an attack is still bad of course, but at least the attacker can’t log into people’s GWDG email accounts, wipe their OwnCloud shares, etc.

What 2FA methods are available?

Currently, we have limited support for 2FA with SSH.

All login nodes of the NHR cluster (glogin[1-13].hpc.gwdg.de) support ed25519-sk and ecdsa-sk keys, which use a compatible FIDO2 (hardware) security key for 2FA (not all FIDO2 security keys support this). In the future, all SCC login nodes will support them. If you want to use them, we strongly recommend the non-resident/non-discoverable variant of ed25519-sk (the default for ssh-keygen -t ed25519-sk). Resident/discoverable variants require different care to secure. See the FAQ page on keys for more information.
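
Generating such a key looks like the following sketch (requires a reasonably recent OpenSSH, roughly 8.2 or newer, and a plugged-in compatible FIDO2 security key; the file name is just an example):

ssh-keygen -t ed25519-sk -f ~/.ssh/id_ed25519_sk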

Additional 2FA methods for SSH authentication may come in the future.

Why are only ed25519, ed25519-sk, and 4096-bit rsa keys recommended?

Why Are These Good Keys

ed25519 keys (and thus also ed25519-sk keys) are fast, considered secure at the present time (new research could of course change this, like all cryptographic methods), and small. Not all clients support ed25519 keys yet, so we recommend 4096-bit rsa keys as a fallback on clients that don’t support ed25519 keys. 4096-bit rsa keys are universally supported and considered safe at the present time (this size is thought to remain safe for quite a while). Their big downsides are that they are slower, the keys are larger, and implementations have to work hard to avoid certain pitfalls. See Generating SSH Keys for information on generating SSH keys.

The problems with other key types and smaller sizes of rsa keys are described in the sections below.

Problems with dsa Keys

They have been considered no longer secure for a long time. So much so in fact that OpenSSH deprecated them in 2015.

Problems with ecdsa Keys

While the login nodes still support ECDSA (ecdsa and ecdsa-sk) keys for those who have already made them, we cannot recommend generating new ones. They are harder to implement without bugs than the better ed25519 and ed25519-sk keys, and the provenance of their special parameters is in question (unverified claims of the use of a hardware random number generator vs. the “nothing up my sleeve” parameters of ed25519). New login nodes will not have ecdsa host keys, and the existing ecdsa host keys on the login nodes will be phased out.

Problems with rsa Keys Smaller than 4096 Bits

rsa keys smaller than 2048 bits are no longer considered safe. 2048-bit keys are considered safe for some years still and 3072-bit keys for longer yet, but who will REALLY remember to rotate their rsa keys to a larger size in a few years? Therefore, we recommend 4096 bits for new rsa keys. That size should remain safe for quite a while.

Considerations on ed25519-sk and ecdsa-sk Keys

The ed25519-sk and ecdsa-sk keys are special ed25519 and ecdsa keys that use a compatible FIDO2 security key as 2FA (Two-Factor Authentication). This kind of 2FA is a very powerful security feature to protect against the SSH key being stolen (login requires something you know and something you have).

All NHR login nodes support them, but not all SCC login nodes do (support on SCC login nodes is coming soon). If using one of them, we recommend the ed25519-sk kind when possible due to the issues with ecdsa discussed above, though sadly some older security keys only support ecdsa-sk.

We strongly recommend the non-resident/non-discoverable variants of the keys (default for ssh-keygen). Resident/discoverable variants require different care to secure.

Using AI Services

Access to our services

There are several different entry points to our AI services. In general, access is given through our web frontend, though you will need an Academic ID. You can find instructions on how to create an Academic ID on the KISSKI website.

However, your institution must conclude a data processing contract. An explanation of the various entry points and the contract documents can be found here.

Documentation resources

You will find detailed information on all our AI services in this documentation.

The most searched topics are:

Our Community Strategy

You may not find every feature that large tech companies offer in their comparable solutions. However, we work to compensate for this with our community model. We rely on collaboration and exchange with our users to continuously improve and adapt our services to your needs. If something is missing or you need a specific function, please let us know. Your feedback is important to us and helps us improve our services.

We bundle, sort, and prioritize the requirements of our community to ensure that we use our resources effectively and meet the most important needs of our users. Your participation and feedback are therefore crucial for the further development of our services.

Introduction of AI Services in your Institution

For a successful introduction of AI services in your institution, we recommend that you appoint a contact person or even a project team to accompany the introduction and act as an interface to our team and community. This contact person or team should handle the following tasks:

  • Coordination of the introduction of AI services
  • Communication with our team
  • Identification of requirements and needs
  • Accompaniment of training and workshops

We recommend that your contact person or project team join our Chat AI community to exchange with other users and benefit from our experiences. Our community channels are:

Open Source and participation

Our Chat-AI service is Open Source. On GitHub, you can participate in the further development of the services, report bugs, suggest features, or clone the project to experiment with it yourself.

Special requirements

If you have special requirements for our AI services, please contact us. We are open to individual use cases and can often offer customized solutions.

Support

If you have problems, our support team is happy to help you. Open a ticket via email. Tickets are forwarded directly to the responsible experts, so you will receive the best answers to your specific questions.

Thank you for your interest in our Chat-AI services! We look forward to your feedback and your participation in our community.

Subsections of Using AI Services

Institutional Access to AI Services

Description

In this section, you will find explanations on how to access the AI services in three main steps:

  1. selecting the subject matter of the contract
  2. selecting the deployment
  3. limiting access

Note that you can always discuss customization and alternative options with us.

Contract Selection

  • Access to the Open Weight Models hosted by GWDG in Göttingen.
  • Orthogonal to this offer is access to external models, which you can use but have to pay for. Depending on your choice, a different set of contracts and legal requirements must be fulfilled.

Contracts for Open Weight models hosted by GWDG

Under Contract Documents Internal Models, you will find the necessary documents for using the Open Weight Models hosted by us in Göttingen. Please follow the steps in the instructions and forward the documents to the relevant departments at your institution for review and signature.

In order to use the AI-Basis services, the following documents must be completed, signed and returned:

  • Performance agreement
  • A corresponding data processing contract (AVV)
  • Nomination of the data protection officer
  • Form for organizational data

You can download the remaining attachments to your contract documents.

Contracts for optional access to external models

If you want to opt in to access to the external models, you will find the corresponding documents under Contract Documents Internal and External Models. The documents differ in order to additionally allow the use of external models; access to the internal models is included.

Entry point from DFN

If you wish to access our AI services via our partnership with the DFN, you must also sign the corresponding contracts between you, the GWDG and the DFN. Please contact verkauf@gwdg.de.

Entry point from KI:Connect

The default approach for KI:Connect is to use open weight models via the API endpoint. Legally, there is no need to conclude an AVV with us, but it is useful in case someone accidentally enters personal data covered by the GDPR. Institutions that use access to external models via API as part of our partnership with KI:Connect require an additional API key. Please contact kisski-support@gwdg.de for this.

Deployment

Generally, it is your choice how to deploy our services:

  • Deployment 1) Using the existing Web interface for end users (such as https://chat-ai.academiccloud.de/)
  • Deployment 2) Integrating AI APIs into services you deploy locally (such as your own chat interface)

Deployment 1: Using existing web frontends

To use the AI services, it is generally possible to use our web interface. An Academic ID is required.

Deployment 2: Integrating AI APIs into YOUR services

With an Academic ID, you can also request an API key via the KISSKI website (click on “Book” on the page). You can use the API key in another web frontend, a plugin, or your own code.
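As an illustration only (the endpoint URL, model name, and request format below are assumptions based on a typical OpenAI-compatible API; please check the API documentation you receive with your key):

# Hypothetical request against an OpenAI-compatible chat completions endpoint
curl -s https://chat-ai.academiccloud.de/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama-3.1-8b-instruct", "messages": [{"role": "user", "content": "Hello!"}]}'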

Limiting Access

Restricting Models

We provide a large assortment of Large Language Models (see Available Models). Available models are regularly upgraded as newer, more capable ones are released. Not all of these may be approved for use by your institution. In order to restrict the use to certain models, the authorized person of your institution must inform us in writing of the selection of models.

Restricting Model Access for specific groups

You may specify different user groups within your institution by assigning functional affiliations (e.g. employee@university.de, student@university.de, researcher@university.de) and instructing us which models are available for each user group. In order to gain access to the IdM objects (accounts, groups, shared mailboxes and resources) of your institute or institution in the GWDG identity management system, it is necessary for us to set up special authorizations for the IdM portal for you. Access to the special authorizations forms can be found here.

Limiting external model access by quotas

With your order confirmation you can impose a financial ceiling on the usage of the external models. Total consumption is then capped at the corresponding monetary value of this limit.

NHR Application Process

Tip

You are a new user and just want access to the NHR resources for testing your software, learning how to use the systems, or preparing a project proposal? We recommend simply proceeding to the project application system JARDS and requesting an NHR Starter project (select NHR@Göttingen as the preferred site). Once your application reaches us, we will set up access and you will be notified about the next steps.

To become a user on the compute systems of NHR-NORD@Göttingen, you need to join an existing project or apply for a project (must be a member of a German university (Hochschule)). There are three ways to do so:

  1. Be added to an existing project.
  2. Apply for a test account with limited compute time (technically, you would be added to one of the test projects). To be eligible, you must be a postdoctoral member of a public German university (Hochschule) or have the consent of a person eligible for full project applications (see below).
  3. Apply for a full project. You must be a member of a German university to be eligible.

Be Added to an Existing Project

One of the project’s PIs or delegates must add you to their project. For NHR/HLRN projects not yet imported into the new Project Portal, they must use the NHR/HLRN Management Portal. For projects created in or imported into the new Project Portal, they can log into the portal to add you. For open projects on the Project Portal, you can even click the project’s Join button to send them a request.

Apply for a Test Account

Test accounts only have limited compute time and storage. By default, the limit is 75 000 CPU hours per quarter. You can request an increase of your limit up to a maximum of 300 000 CPU hours per quarter. A test account’s storage space is limited to its HOME directory (small quota) and workspaces (which have limited lifetimes). Test accounts are primarily suitable for testing the cluster and your workloads/codes, but can also be used for small compute projects with only a single user. If you need more storage space but less than 300k CPUh per quarter, or you have a group of people who would like to test our cluster while working on a common topic, please apply for a test project. For everything else (e.g. more compute and/or large amounts of dedicated storage), you need to apply for a full compute project. If you are unsure, please contact us for a consultation at nhr-support@gwdg.de.

The steps to get a test account are:

  1. Read Apply for a User Account.
  2. Log in to the HPC Project Portal (see Project Portal for more information). In most cases, your home institution will be detected automatically based on your email address, so federated login is possible. Otherwise, you can register a new account.
  3. Request access to HPC resources by sending an email to nhr-support@gwdg.de. Please indicate if you are applying for a full project in parallel.

Once your request has been verified, you will be added to one of the generic testing projects (for each German state) and receive an email providing information and instructions on how to use your account. A test account expires after 9 months, but can be extended upon request. To extend a test account send an email to nhr-support@gwdg.de and include your user id (e.g. u123456).

Apply for a User Account

To apply for a user account / Nutzungskennung for the NHR please consider the following information, then proceed with the steps of the Application Process. The following limitations apply when using NHR resources:

  1. The NHR regulations limit the access to NHR resources to scientists at German universities (see NHR Alliance, Computing Time / NHR-Verein, Rechnernutzung).
  2. Due to export control, the NHR systems are not accessible for people originating from certain countries.

Holding a user account you can

  • perform test simulations to prepare a proposal for a compute project and
  • use the resources, such as compute time and storage quota, which are granted to projects (cf. Apply for a Compute Project).

Compute Time of a User Account

By holding a user account you are able to work on the compute systems at NHR-NORD@Göttingen, and to expend compute time.

  • The user account is attached to a personal account holding a limited amount of computing time, see Accounting for details.
  • With the user account you have access to the accounts of all compute projects of which the user account is a member.

Apply for a Full Project

Full projects provide the following extra features over test accounts:

  • More compute time
  • Dedicated project storage space (see Storage Systems for more information)
  • Can add additional users to the project with shared access to the project’s storage and compute time via the Project Portal

To apply for NHR compute resources please

  1. Read Apply for a User Account and check the restrictions
  2. Apply for a compute project

Subsections of NHR Application Process

Apply for a Compute Project

Tip

You are a new user and just want access to the NHR resources for testing your software, learning how to use the systems, or preparing a project proposal? We recommend simply proceeding to the project application system JARDS and requesting an NHR Starter project (select NHR@Göttingen as the preferred site). Once your application reaches us, we will set up access and you will be notified about the next steps.

Application Types

You can apply for two different types of projects, a Test Project or a Compute Project, with the following properties.

| Application | Compute Time Grant | Compute time per quarter [coreh] | Review process | Call | Link to application |
|---|---|---|---|---|---|
| Test Project | User account and enough compute time for smaller projects and/or preparation | <= 300k | Only a user account and a request to nhr-support@gwdg.de are required, cf. Application Process. N.B.: NHR Starter application preferred | rolling | User Account |
| Compute Project | Normal | 300k - 5M | Scientific board of NHR@ZIB, NHR@Göttingen or Whitelist (simplified review by Scientific board, DFG/BMBF/NHR/GCS/EU grant required) | quarterly | JARDS NHR Normal |
| Compute Project | Large-scale (“Großprojekt”) | >= 5 Mcoreh (CPU) / >= 6.25 kGPUh (GPU) | Scientific board + NHR panel (“Nutzungsausschuss”) | quarterly | JARDS NHR Large |

NHR Starter vs Test projects

Please refer to the notice above on how to apply for an NHR Starter project. You will get access under the same conditions as with a Test Project. However, the additional metadata, such as your field of research, a short project description and the required software, will help us to better understand your requirements.

Application Process

Project proposals for a Compute Project have to be submitted via the portal JARDS.

  • Our Guide to filling out offers assistance in going through JARDS.
  • One significant point in JARDS is to upload the main proposal document. To prepare this main proposal document, we suggest using the Proposal Format which is used for NHR@ZIB and NHR-NORD@Göttingen, since both centers collaborate on the technical and scientific evaluation of compute time projects.
  • If you plan to use the NHR systems at both NHR centers (NHR@ZIB and NHR-Nord@Göttingen), it is recommended to submit two analogous proposals via JARDS. The technical review will then be done by each center and the scientific review is done jointly. Usually, the two proposals will then only have to differ in the amount of required resources and how they are mapped to the work packages. If you submit a pair of proposals like this, please make sure to state this under Remarks, including the mutual project title, unless the projects are named identically anyway.

Deadlines and schedule

Compute project proposals can be submitted via JARDS at any time. The review process takes place in the quarter corresponding to the submission deadline: January 15 (23:59:59 CET), April 15 (23:59:59 CEST), July 15 (23:59:59 CEST), and October 15 (23:59:59 CEST). These dates apply to NHR-Nord@Göttingen and extend the default NHR deadlines (first day of each quarter).

Approved projects begin on the first day of the quarter following the review phase and are granted a duration of 12 months.

Proposal Format

Please submit the complete project proposal for a compute project at the two NHR centers, NHR@ZIB and NHR@Göttingen, via JARDS. A project proposal contains four parts which are described in the sections below.

  1. Meta Data
  2. Main Proposal
  3. Public Abstract
  4. Signed Proposal Summary

After submission, the proposal is reviewed by the external Scientific Board. Each reviewer obtains access to

  • the project proposal,
  • the history of previous projects,
  • contact data of the consultant, and
  • aggregated statistics on the usage of computing time on the NHR systems.

Meta Data

During the course of online submission, the project proposer needs to provide some metadata. This includes information about

  • the principal investigator (scientist holding a PhD),
  • the person in charge of submitting the proposal,
  • project partners,
  • other evaluations and funding for the project,
  • usage of other compute resources,
  • project lifetime,
  • software requirements (please check availability with your consultant before),
  • requirements in units of core hours (at least 1200 kcoreh per year, see Accounting), and
  • storage requirements.

Main Proposal

The main proposal

  • needs to be uploaded as a prepared document in PDF format,
  • is written in English,
  • should be written in LaTeX based on the proposal_template_v1.5.0g.zip. This template includes all relevant aspects like project description, computational facets and resource estimation. The LaTeX template includes both proposal types: initial and follow-up.

We recommend using the following layout:

The LaTeX template / PDF samples internally differentiate between a normal and a short version in case of whitelist status. For more details, look inside the template/samples.

Whitelisting

Whitelist status is granted by the scientific review board under two conditions:

  • First, you apply for a “Normal project” with less than 5M core-hours per quarter (see Apply for a Compute Project).
  • Secondly, you are part of an active DFG/BMBF/NHR/GCS or EU project, which explicitly describes HPC resource requirements. As evidence, you need to upload the corresponding project proposal and its review report to the online portal.

Please avoid:

  • Imprecise or incomprehensible estimation of the requested computational resources (especially core-hours); for example, missing arguments justifying a certain number of N runs instead of a smaller number. One page is our recommended minimum.
  • Missing proof that the software to be used is suitable and efficient for parallel execution (and parallel I/O) on our current HPC system architecture. Recycling a scalability demo by a third party is meaningless without showing that your planned production run is fully comparable to it (algorithm selection within the software, I/O pattern, machine architecture, problem size per core).
  • An unclear overall aim and/or motivation behind the project.
  • An applicant lacking HPC/Unix skills without an experienced co-applicant.
  • Not indicating that the applicant’s local resources are insufficient.
  • Not mentioning the NHR in relevant publications.
  • Copying and pasting previous/parallel proposals; instead, refer to these.

In case of questions, please contact your consultant.

Public Abstract

All current compute projects that have been successfully reviewed by the Scientific Board are listed on the project list. Each proposal for a compute project needs to include a public abstract in PDF format based on the public abstract template (English/German). It should be generally understandable.

The abstract is written in English or German and should be about 2 pages long. If you have no project ID yet (in the case of an initial proposal), simply keep the default: “abn12345”.

Signed Proposal Summary

By the end of the online submission process, a summary is generated by JARDS, which has to be signed and then either sent to the Office of the Scientific Board or re-uploaded to JARDS. This also indicates that your application has been submitted successfully.

JARDS - Guide to filling out

JARDS-Guide-1: Please start your application by choosing NHR Normal if you want to apply for less than 20 million core-hours/year. If you plan to apply for more than 20 million core-hours, choose NHR Large (if you apply only for GPU-hours, choose NHR Large for more than 25 000 GPU-hours/year).

Please note: for “NHR Large” projects, the “Whitelist” option is no longer applicable.

JARDS-Guide-2: Continue by choosing NHR@Göttingen.

JARDS-Guide-3: On this page (E-Mail Callback), please enter your e-mail address. You will receive a mail with a link to start your application. This e-mail address is used for identification, so please always use your official e-mail address.

JARDS-Guide-4: Start the actual application by pressing “New NHR Project Application”, even if you apply for an extension (“extension” will only be used in the future when extending an existing JARDS application; in the current phase there are only “New NHR Project Applications”). If your project is already running at NHR@Göttingen, please provide your project ID in the “Remarks” section at the end and also refer to it in the PDF. You may use the simplified template for a follow-up proposal; in that case, please additionally upload the original application as supporting material.

JARDS-Guide-5: If you are both the Principal Investigator and the Person of Contact (PI and PC), you can check “Apply as both, PI and PC”; otherwise, please provide the data of both persons.

JARDS-Guide-6: Complete the data.

JARDS-Guide-7: The first field on this page depends on whether you are submitting an Initial Proposal (for a new project) or a Follow-Up Proposal (to apply for more resources and an extended runtime for an already existing project).

For an Initial Proposal, do NOT enter a Project ID (it will be set automatically). For a Follow-Up Proposal, please enter the Project ID, which is shown, for example, in the project management portal.

The “compute period” is usually 01.10 – 30.09 (1 year). You may apply for a shorter period (in full quarters).

JARDS-Guide-8: Select your DFG classification and state whether your project is a collaboration. Please provide all other relevant projects of your group (that is, projects with the same PI and in the same area).

JARDS-Guide-10: Please enter the requested computing time in million core-hours (CPU): the total first, then divided into quarters. Some questions about software etc. follow.

JARDS-Guide-11: Please answer the questions about your usage of Artificial Intelligence (AI); if you do not use AI, you do not need to answer them. If you want to apply for GPU computing time, please enter the values in GPU-hours, that is, hours of a single GPU (not a full node).

JARDS-Guide-13: For GPU usage, some questions about the environment also have to be answered.

JARDS-Guide-14

JARDS-Guide-15: Please enter your planned storage requests for the file systems Home, Work and Perm (this does NOT mean that this storage will be reserved for you). Bigger storage requests should be described in more detail.

JARDS-Guide-16: These questions should be answered if you have “unusual” I/O requests.

JARDS-Guide-17: You can upload only 3 PDF files.

  1. The “actual application”, created with the given template.
  2. A public abstract (for our web server), written to be easy to understand.
  3. All other documents, e.g. other reviews, publications etc., combined into a single PDF. You might use free tools like PDF-Shuffler to do so. Then, please upload the combined PDF as “Supporting material”.

JARDS-Guide-18: Please provide your NHR@Göttingen project ID here if you have one, e.g. “bem00027”. You might submit any other remarks here.

JARDS-Guide-19: Finally, you have to sign your application. After pressing “Finalize”, a summary to sign will be created for you. You can upload it on JARDS or send it by e-mail to nhr-support@gwdg.de. Of course, you can also send it by post to:
Wissenschaftlicher Ausschuss der NHR@Göttingen - Geschäftsstelle -
c/o GWDG, Burckhardtweg 4, 37077 Göttingen

KISSKI Application Process

The KISSKI project combines many offerings. All of them are listed in the service catalogue on the KISSKI webpage. To become a user of the resources from KISSKI, you have to fill out the form linked behind the “book” button on the service page. You require an Academic Cloud account to register for the service, which is explained in the FAQ.

Smaller projects requesting resources below 25 000 GPU-hours can be granted almost immediately, while larger projects require a more in-depth check of the proposal. A project can start at any time; if you need access right away, you can request immediate access.

A request for consulting takes longer to process because one of our employees must be available. Regardless of the starting time, personnel resources of up to 1 week can be granted for each project, as stated in the FAQ.

Compute resources

In addition to personal and work information regarding your group, the following information is required:

  • A short, meaningful project acronym as well as a short project description
  • Project runtime
  • Time window for the computations
  • Required hardware resources (in GPU-hours)
  • Required Storage in terms of number of files and total file size

Project and compute run time

A project can run for at most one year and may start at any time. Additionally, an extension of the project can be applied for without a new application. For an extension, only an abstract about the work already done, an abstract about the work to be performed in the future, and the extension duration are required. An extension is possible once, again up to a maximum of one year.

The start and end of the compute time can differ from the project run time. We also offer storage, meaning it is possible to apply for a year-long project that only requires computations in the last months of the project. This can be useful, for example, if the project starts with data collection and only the last phase of the project is the computational evaluation. For now, this feature is not implemented in the HPC Project Portal, but the form is already set up for this option.

Required hardware resources

The required hardware can be requested in terms of GPU-hours; the limit is 50 000 GPU-hours per project.

These numbers are calculated by multiplying the number of GPUs by the duration they run for. Our other resources use core-hours for CPUs, and an example of the calculations can be found in our documentation. Keep in mind that one node contains 4 GPUs and all of them count toward the total for that node.

| Comment | Nodes | Run time | Run time hours | GPU-hours |
|---|---|---|---|---|
| A small single node test run for a few days a week | 1 | 24 hours * 4 days | 96 | 384 |
| A small single node run over one full week | 1 | 24 hours * 7 days | 168 | 672 |
| A four node run for a week | 4 | 24 hours * 7 days | 168 | 2 688 |
| A four node run for a month | 4 | 24 hours * 30 days | 720 | 11 520 |
| A four node run for four months | 4 | 24 hours * 120 days | 2880 | 46 080 |
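The GPU-hour values in the table follow from multiplying the number of nodes, the 4 GPUs per node, and the runtime in hours, for example (a plain shell sketch):

# GPU-hours for a four node run (4 GPUs per node) over 30 days
nodes=4; gpus_per_node=4; hours=$((24 * 30))
echo $((nodes * gpus_per_node * hours))   # -> 11520 GPU-hours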

Required storage

For storage, we require information about the number of files you would like to store and what the total size (in GB) will be. The number of files is important because we have special storage solutions for the KISSKI project: one solution performs better for large files, while the other is better for smaller files. We can also offer workflows to optimise the storage depending on your needs.

An estimate for total storage needs is required because our storage capacity is limited and extremely large requests with many TB or even PB need to be processed separately. If you are planning a project that requires very large amounts of storage, please use our support and contact us before applying for the project.

Chat-AI

For the Chat service, we additionally require you to state your desired type of access. You can choose between general use, API access, fine-tuning, and custom LLM hosting. Please fill in the text box and let us know your specific requirements.

Accounting Core-h

Info

This accounting page is currently only relevant for NHR users, because they follow the NHR-wide rules. SCC users cannot use the sbalance script yet, but it will be added once accounting is activated on that cluster.

The NHR centers follow the NHR-wide regulations and account the resource usage in the unit core hours. Using the batch system will result in the consumption of the core hour balance. Every node in each of the partitions has a specific charge rate. The current rates can be found in the table for the CPU partitions and also for the GPU partitions.

Using one node consumes the resources of the entire node and is accounted as a full node. A shared node can be used by multiple users, and therefore only the allocated number of cores is accounted. Using a GPU node only takes the GPUs into account, meaning that allocating and using a GPU node is only charged the core hours for the GPUs, not the CPUs.

Usage of the storage system is not accounted.

Job Charge

The charge for a batch job on the NHR systems is the number of core hours and is calculated from the number of nodes reserved for the job, the wallclock time used by the job, and the charge rate for the job nodes. For a batch job with

  • num nodes,
  • running with a wallclock time of t hours, and
  • on a partition with a charge rate charge_p

the job charge charge_j yields

charge_j = num * t * charge_p

Info

A job on 10 nodes running for 3 hours on partition huge96 (= 192 core hour charge rate per node) yields a job charge of 5760 core hours.

Batch jobs running in the partition large96:shared access only a subset of the cores on a node. For a reservation of cores, the number of nodes in the charge formula is the corresponding node fraction.

Info

A job using 48 cores on partition large96:shared (96 cores per node, 192 core hour charge rate per node) has a reservation for num = 48/96 = 0.5 nodes. Assuming a wallclock time of 3 hours yields a job charge of 288 core hours.
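As a plain shell sketch, both Info examples above follow directly from the formula:

# charge_j = num * t * charge_p
awk 'BEGIN { print 10 * 3 * 192 }'      # 10 full nodes on huge96       -> 5760 core hours
awk 'BEGIN { print 48/96 * 3 * 192 }'   # 48 of 96 shared cores (0.5)   -> 288 core hours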

Checking the balance

We provide a script called sbalance. It prints your current balance and provides some additional useful information. It also differentiates between personal and project accounts.

usage: sbalance [-h] [-s] [-l] [-n] [--me] [--assoc] [-u USER] [-a ACCOUNT]
                [--ascii] [--no-color]

Shows the remaining core hours for a user-account association from the
SlurmDB.

optional arguments:
  -h, --help            show this help message and exit
  -s, --short           Only print the remaining core hours, nothing else.
  -l, --limit           Print the current limit instead of remaining core
                        hours. (Requires the -s/--short flag)
  -n, --no-help         Don't print the info texts.
  --me                  Prints the remaining core hours for the current user
                        (same as running "sbalance" without any flag).
  --assoc               Print the remaining core hours for the user-account
                        associations, instead of for the accounts.
  -u USER, --user USER  Specify the username to query.
  -a ACCOUNT, --account ACCOUNT
                        Specify the account to query.
  --ascii               Force use of only ASCII characters (no unicode).
  --no-color            Disable color even if stdout would support it.

The output can look like this:

sbalance output variations
Example output of the sbalance command with color and unicode. See the ascii tab for an alternative view.

Example of the sbalance command with color and unicode

Personal contingent for account           :
  Used 0.00% (0 core hours / 71.56 kilo-core hours)
    [----------------------------------------------------------------------------]
  Your personal contingent will be renewed on the 01.07.2024
  You can also apply for a project to gain more core hours. Use the project-
  portal (https://hpcproject.gwdg.de) to do so.

Project accounts: 
  You are currently in no projects
  If you want to be added to a project, please ask your supervisor to add you to
  one in the project portal (https://hpcproject.gwdg.de). You can also use the
  project portal to submit a project application yourself.

Account Types

Accounting is done for two different types of accounts. The NHR centers distinguish between personal accounts and project accounts. Personal accounts always have a small recurring amount of core hours, while a project account holds many more core hours that can be distributed among many users.

Personal Account

At the beginning of each quarter, each account is granted 75 000 core hours. In reasonable and exceptional cases, the grant can be extended to 300 000 core hours per quarter. This can be particularly useful if you need more time for estimating the core hour consumption for a project proposal. In order to increase your core hours, please contact the NHR Support. At the end of each quarter, all remaining core hours in the account are reset.

Project Account

NHR Projects

A compute project holds a bank account for the project. This project account contains a compute capacity in core hours. At the beginning of each quarter, the account is credited with the number of core hours following the funding decision for the given compute project. A project account holds at least 4 x 300 000 core hours per year. Unused core hours are transferred to the subsequent quarter, but only once.

In case of problems with the compute capacity in core hours of your project account, please contact the NHR Support. This might concern the

  • application for additional core hours,
  • movement of core hours between quarters.

KISSKI projects

KISSKI projects are accounted in GPU hours. Without going into detail, the default limit for a project is 25 000 GPU hours, which is about 3 750 000 core-hours. The maximum that can be requested with the submission is 50 000 GPU hours. Using the table in GPU partitions, it is possible to calculate the associated core-hours for your project. An example of such a calculation can be found here.

Select the Account in Your Batch Job

Batch jobs are submitted by a user account to the compute system. For each job the user chooses the account (personal or project) that will be charged by the job.

  • At the beginning of the lifetime of the user account the default account is the personal account.
  • The user controls the account for a job using the Slurm option --account at submit time.
Info

To charge the account myaccount add the following line to the job script.
#SBATCH --account=myaccount
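The account can also be selected on the command line at submission time, for example:

sbatch --account=myaccount jobscript.sh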

After job script submission, the batch system checks the account for coverage and authorizes the job for scheduling. Otherwise the job is rejected; please note the error message:

Info

You can check the account of a job that has run out of core hours:

> squeue
... myaccount ... AccountOutOfNPL ...

For HPC Project Portal accounts this is neither necessary nor useful; they will always use the account of the associated project.

Terminology

The unit used is core hours, i.e. the product of the number of cores and the number of hours used. It can be expressed as:

  • core hours
  • coreh
  • kcoreh = kilo core hours = 1000 core hours
  • Mcoreh = mega core hours = 1 000 000 core hours

Using the SI prefixes is useful as shorthand, and they are used in some printouts.

Science Domains

Under the science domains heading, we combine domain-specific information regarding software, training, services, and other helpful links. To this end, each domain is managed by a colleague with a background in the respective domain. The goal is to support users of a particular field and to collaborate at the intersection of HPC and a science domain. We are always happy to support your research and start research projects together.

This documentation focuses on highlighting technical tips and tricks for our HPC system that support different science domains. A more general description of the different domains is provided on our community webpage.

Below, you can find and select a science domain that you are interested in. Feel free to contact us if something needs to be added for a specific domain.

Subsections of Science Domains

AI

Description

Artificial intelligence has become an important tool for many scientific domains. Various problems are tackled using methods from both classical machine learning and modern deep learning.

Access

You can directly request access to our compute resources, either from NHR or the KISSKI project, which is dedicated to this work. All required information can be found under getting an account.

You can also use the Chat AI service directly via KISSKI.

Applications

Currently available modules

Support

Request an API key for the SAIA service from KISSKI.

More

Links to PR website

Last modified: 2024-10-07 16:06:53

Chemistry

Description

Computation in chemistry can be done with various software packages, many of which are installed in our software stack.

Access

You can get access to our NHR resources by following the NHR application process.

Applications

All the applications that can be used for chemistry are listed below; an example of loading one via the module system follows the list.

Currently available modules

  • CP2K — A package for atomistic simulations of solid state, liquid, molecular, and biological systems offering a wide range of computational methods with the mixed Gaussian and plane waves approaches.
  • exciting — A full-potential all-electron code, employing linearized augmented planewaves (LAPW) plus local orbitals (lo) as basis set.
  • Gaussian — a computational chemistry application provided by Gaussian Inc https://gaussian.com/.
  • GPAW — a density functional theory Python code based on the projector-augmented wave method.
  • GROMACS — a versatile package to perform molecular dynamics for systems with hundreds to millions of particles.
  • NAMD — a parallel, object-oriented molecular dynamics code designed for high-performance simulations of large biomolecular systems using force fields.
  • Octopus — a software package for density-functional theory (DFT), and time-dependent density functional theory (TDDFT)
  • Quantum ESPRESSO — an integrated suite of codes for electronic structure calculations and materials modeling at the nanoscale, based on DFT, plane waves, and pseudopotentials.
  • RELION — REgularised LIkelihood OptimisatioN is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy(cryo-EM)
  • TURBOMOLE — A computational chemistry program that implements various quantum chemistry (ab initio) methods. It was initially developed at the University of Karlsruhe. (See “module avail turbomole” on the SMP cluster.)
  • VASP — a first-principles code for electronic structure calculations and molecular dynamics simulations in materials science and engineering.
  • Wannier90 — A program that calculates maximally-localised Wannier functions.
  • ASE – The Atomic Simulation Environment – a set of tools and Python modules for setting up, manipulating, running, visualizing, and analyzing atomistic simulations.
  • LAMMPS – A parallel, classical potential molecular dynamics code for solid-state materials, soft matter and coarse-grained or mesoscopic systems.
  • NWChem – A general computational chemistry code with capabilities from classical molecular dynamics to highly correlated electronic structure methods, designed to run on massively parallel machines.
  • MOLPRO – A comprehensive system of ab initio programs for advanced molecular electronic structure calculations.
  • PLUMED – A tool for trajectory analysis and plugin for molecular dynamics codes for free energy calculations in molecular systems
  • CPMD – A plane wave / pseudopotential implementation of DFT, designed for massively parallel ab-initio molecular dynamics.
  • BandUP – Band unfolding for plane wave based electronic structure calculations.
  • libcurl — curl - a tool for transferring data from or to a server
  • libz — A Massively Spiffy Yet Delicately Unobtrusive Compression Library
  • nocache — nocache - minimize caching effects in lustre filesystems
  • texlive – LaTeX distribution, typesetting system
  • git – A fast, scalable, distributed revision control system

Support

Contact us using nhr-support@gwdg.de.

More

Last modified: 2025-06-09 08:24:50

Data Science

Description

Data science is a broad field, and we offer a wide variety of software to accommodate as many of our users' needs as possible.

Access

You can get access to our NHR resources by following the NHR application process.

Applications

Currently available modules

  • AEC library — Adaptive Entropy Coding library
  • CDO — The Climate Data Operators
  • ECCODES — ECMWF application programming interface
  • HDF5 Libraries and Binaries — HDF5 - hierarchical data format
  • libtiff — A software package containing a library for reading and writing Tag Image File Format (TIFF) files, and a small collection of tools for simple manipulations of TIFF images
  • NCO — The NetCdf Operators
  • netCDF — Network Common Data Form
  • Octave — A high-level language, primarily intended for numerical computations
  • pigz — A parallel implementation of gzip for modern multi-processor, multi-core machine
  • PROJ — Cartographic Projections Library
  • R — R - statistical computing and graphics
  • Szip — Szip, fast and lossless compression of scientific data
  • UDUNITS2 — Unidata UDUNITS2 Package, Conversion and manipulation of units
  • Boost – Boost C++ libraries
  • CGAL – The Computational Geometry Algorithms Library
  • nocache — nocache - minimize caching effects in lustre filesystems
  • texlive – LaTeX distribution, typesetting system
  • git – A fast, scalable, distributed revision control system

Support

Contact us using nhr-support@gwdg.de.

More

Last modified: 2024-10-07 16:06:53

Engineering

Description

Many engineering challenges involve solving multiple equations with varying numbers of constants and physical properties. To make this manageable, many software packages exist, and we offer support for many of them.

Access

You can get access to our NHR resources by following the NHR application process.

Applications

Currently available modules

  • Ansys Suite — The full Ansys Academic Multiphysics Campus Solution is available, e.g. Mechanical, CFX, Fluent, LS-Dyna, Electronic, SCADE (but not Lumerical, GRANTA).
  • Foam-extend — The Open Source CFD Toolbox
  • OpenFOAM — An object-oriented Computational Fluid Dynamics(CFD) toolkit
  • STAR-CCM+ — A Package for Computational Fluid Dynamics Simulations
  • ParaView — An interactive data analysis and visualisation tool with 3D rendering capability
  • git – A fast, scalable, distributed revision control system

Support

Contact us using nhr-support@gwdg.de.

More

Last modified: 2025-03-11 10:53:55

Forest

Forests play a crucial role in our ecosystem and are, therefore, being researched on a large scale. Modern approaches are based on big data (e.g., analyzing satellite imagery or single-tree-based simulations). These approaches require significant computing resources, which we can provide with our HPC systems.

Our general motivation and topics of interest can be found on our community page. This page describes tools, data, and computing resources that we believe can be useful based on our experience. If you work in forest science and have ideas or requests, please do not hesitate to contact us via our ticket system or the responsible person listed on the community page.

Forest workloads often fit the workflow of a typical HPC job. Therefore, it is recommended that you start with our general documentation on how to use the cluster.

Following are some of our services and tools highlighted, which we think are useful to consider when doing forest research on our HPC system.

JupyterHub

JupyterHub offers an interactive way of using our HPC resources that most researchers use. Besides the typical usage of Jupyter notebooks, you can start an RStudio server or a virtual desktop. Furthermore, you can set up your own container with the specific software that fulfills your requirements, which also makes it easy to share your setup with project partners. Workloads can be run with both CPU and GPU support.

Metashape

Orthomosaic Generation with Metashape is a workflow we developed for the project ForestCare.

SynForest

SynForest is a tool that generates realistic large-scale point clouds of forests by simulating the lidar scanning process from a stationary or moving platform and capturing the shape and structure of realistic tree models.

Data Pools

Data pools are a simple way to share your data on our HPC system with others. Further, data pools offer access to large data sets that are useful for different user groups over a longer period of time.

Workshops

The GWDG Academy offers workshops on different topics relevant to forest science on HPC. Most of them handle technical topics without examples from the field of forest science. The following are some workshops that include forest science workflows.

Deep learning with GPU cores

As different AI methods are applied in forest science, the workshop Deep learning with GPU cores provides a starting point for your AI workload on our HPC system. The workshop covers tree species classification of segmented point clouds with PointNet. In the section Deep Learning Containers in HPC, a container is set up that provides all required software for the workflow and can be used interactively on JupyterHub or via Slurm on our HPC cluster.

Performance of AI Workloads

Based on the workshop “Deep learning with GPU cores,” we introduce different tools and approaches to optimizing job performance. This can help reduce your waiting time and the energy consumption of your analysis.

Forests in HPC

Forests in HPC introduces different options for using the HPC system for forest applications. Here, we focus on lidar data analysis with R (lidR) and the same deep learning workflow that is used in the workshop Deep learning with GPU cores.

Life Science and Bioinformatics

Description

Life science benefits greatly from the application of computational resources and tools, from large-scale genomic analysis and complex molecular simulations, to sophisticated statistical analysis and image post-processing. We provide access to many bioinformatics and life science tools, the compute power to utilize them, as well as the technical expertise to help researchers access our resources and services.

Access

You can request access to these services via SCC, NHR or KISSKI as explained here.

Applications

The software stack available in our compute clusters is comprehensive, and consequently rather complex. You can find (many) more details on the software stacks page and in the many subpages of the list of modules. On this page, we will briefly mention how you can install your own programs if need be, and also highlight the more relevant domain programs (some of them with their own, more in-depth pages).

Installing your own programs

Users do not have root permissions, which can make installing your own software a bit more complicated than on your own computer. Still, there are a number of possibilities to achieve your own software stack:

  • Compiling from source: This is usually the hardest approach, but should work in most cases. You will have to take care of many dependencies on your own. Most Linux-based programs will follow a configure, make, make install loop, so if you have done this once for a program you can expect a similar procedure for future programs. The only note here is that at the configure step, you will have to define an installation directory that is somewhere in your user space.
  • pip: For Python packages, pip is usually the way to go. Make sure the pip you are using matches the Python interpreter you are actually using (which python and which pip should point to similar locations). To make absolutely sure, you can also use python -m pip install instead of raw pip. Sometimes adding the --user option is necessary to tell pip to install packages to a local folder (see the sketch after this list). Finally, there can sometimes be incompatibilities between packages when using pip, and Python’s module system can be very complex, so we recommend using conda (see next point) or its variants.
  • conda: conda is a package and environment manager, particularly popular in the Bioinformatics domain, that makes handling complex environments and dependency stacks easy and transferable. We have a dedicated page with more information about python package management. Do be aware we discourage using conda init when setting up the package manager for the first time, please check our documentation for more details on how to properly use this tool.
  • spack: Spack is another package manager, similar to Conda. It can be more complex to use, but at the same time much more powerful. We provide information on how users can make use of Spack for installing their own software in its dedicated page.
  • Containers: Containers are a good option for particularly problematic programs with complex requirements, are very transferable, and are increasingly provided as an option directly from developers. We offer access to Apptainer on our cluster, and some of our tools are provided as containers already.
  • Jupyter-Hub: Our Jupyter-Hub offers the possibility to run your own custom containers in interactive mode, which can also be reused as containers for batch jobs. This includes Jupyter and Python containers, RStudio (you can also install extra modules on top of the base Python and RStudio containers without having to modify the container itself), and even containers for graphical applications on the HPC-Desktops.
  • Cluster-wide installation: In some cases, we can install software for the whole cluster in our module system, depending on how useful and widely used we believe it would be. Contact us at our usual support addresses if you require this.
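As an illustration of the pip route mentioned above, a minimal sketch (the module name and the package are only examples; check module avail for the actual Python modules):

# Load a Python module first (exact module name may differ on the cluster)
module load python

# Install a package into your user space (~/.local by default)
python -m pip install --user biopython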

Genomics

  • ABySS: De-novo, parallel, paired-end sequence assembler that is designed for short reads.
  • bedtools: Tools for a wide range of genomics analysis tasks.
  • BLAST-plus: Basic Local Alignment Search Tool.
  • Bowtie: Ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers.
  • bwa: Burrow-Wheeler Aligner for pairwise alignment between DNA sequences.
  • DIAMOND: Sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.
  • GATK: Genome Analysis Toolkit Variant Discovery in High-Throughput Sequencing Data.
  • HISAT2: fast and sensitive alignment program for mapping next-generation sequencing reads (whole-genome, transcriptome, and exome sequencing data) against the general human population (as well as against a single reference genome).
  • IGV: The Integrative Genomics Viewer is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
  • IQ-TREE: Efficient software for phylogenomic inference.
  • JELLYFISH: A tool for fast, memory-efficient counting of k-mers in DNA.
  • Kraken2: System for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.
  • MaSuRCA: Whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches.
  • MetaHipMer (MHM): De-novo metagenome short-read assembler.
  • MUSCLE: Widely-used software for making multiple alignments of biological sequences.
  • OpenBLAS: An optimized BLAS library.
  • RepeatMasker: Screen DNA sequences for interspersed repeats and low complexity DNA sequences.
  • RepeatModeler: De-novo repeat family identification and modeling package.
  • revbayes: Bayesian phylogenetic inference using probabilistic graphical models and an interpreted language.
  • Salmon: Tool for quantifying the expression of transcripts using RNA-seq data.
  • samtools: Provides various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.

Additionally, we offer some tools as containers (which as such might not appear explicitly in the module system), among them: agat, cutadapt, deeptools, homer, ipyrad, macs3, meme, minimap2, multiqc, Trinity, umi_tools. See the tools in containers page for more information.

Molecular Simulations

  • Classic molecular simulations: LAMMPS, GROMACS, NAMD.
  • Various ab-initio codes: Gaussian, Turbomole, Quantum Espresso, CP2K, CPMD, Psi4, VASP (own license required).
  • Alphafold and other protein folding programs: See for example the Protein-AI service.

Imaging

  • Freesurfer: an open source software suite for processing and analyzing brain MRI images.
  • RELION: (REgularised LIkelihood OptimisatioN, pronounce rely-on) Empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).

Workflow Tools

  • Snakemake.
  • Nextflow.

Other

Support

For ways to contact us, see support.

More

Last modified: 2025-09-11 09:51:57

Mathematics

Description

Numerical mathematics can be done with many multi-purpose languages for which we offer support. In addition, we also offer many software packages with more specialized use cases.

Access

You can get access to our NHR resources by following the NHR application process.

Applications

Currently available modules

  • BLAS (Basic Linear Algebra Subprograms) — Routines that provide standard building blocks for performing basic vector and matrix operations and for the development of high-quality linear algebra software.
  • FFTW3 — A C-subroutine library for computing discrete Fourier transforms in one or more dimensions, of arbitrary input size, and of both real and complex data.
  • Git – A fast, scalable, distributed revision control system.
  • GSL (GNU Scientific Library) — A numerical library for C and C++ programmers. The library provides various mathematical routines such as random number generators, special functions, and least-squares fitting.
  • Matlab – A universal interactive numerical application system with advanced graphical user interface.
  • METIS – A set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill-reducing orderings for sparse matrices.
  • MUMPS (MUltifrontal Massively Parallel sparse direct Solver) — A package for solving systems of linear equation based on a multifrontal approach.
  • NFFT (Nonequispaced Fast Fourier Transform) — A C subroutine library for computing the Nonequispaced Discrete Fourier Transform (NDFT) and its generalisations in one or more dimensions, of arbitrary input size, and of complex data.
  • Octave – A high-level language, primarily intended for numerical computations.
  • ParMETIS – An MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, and meshes, and for computing fill-reducing orderings of sparse matrices.
  • PETSc – A Portable, Extensible Toolkit for Scientific Computation: widely used parallel numerical software library for partial differential equations and sparse matrix computations.
  • R – A language and environment for statistical computing and graphics that provides various statistical and graphical techniques: linear and nonlinear modeling, statistical tests, time series analysis, classification, clustering, etc.
  • ScaLAPACK — A library of high-performance linear algebra routines for parallel distributed memory machines. It solves dense and banded linear systems, least squares problems, eigenvalue problems, and singular value problems.
  • Scotch — A software package and libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning.

Support

Contact us using nhr-support@gwdg.de.

More

Last modified: 2025-07-11 12:17:11

Physics

Description

Numerics for computational physics can be done with many multi-purpose languages for which we offer support. In addition, we also offer software packages with more specialized use cases.

Access

You can get access to our NHR resources by following the NHR application process.

Applications

Currently available modules

  • Ansys Suite - The full Ansys academic Multiphysics Campus Solution is available, e.g. Mechanical, CFX, Fluent, LS-Dyna, Electronic, SCADE (but not Lumerical, GRANTA).
  • BLAS (Basic Linear Algebra Subprograms) — Routines that provide standard building blocks for performing basic vector and matrix operations and for the development of high-quality linear algebra software.
  • FFTW3 — A C-subroutine library for computing discrete Fourier transforms in one or more dimensions, of arbitrary input size, and of both real and complex data.
  • Foam-extend — The Open Source CFD Toolbox
  • Git – A fast, scalable, distributed revision control system.
  • GSL (GNU Scientific Library) — A numerical library for C and C++ programmers. The library provides various mathematical routines such as random number generators, special functions, and least-squares fitting.
  • Matlab – A universal interactive numerical application system with advanced graphical user interface.
  • METIS – A set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill-reducing orderings for sparse matrices.
  • MUMPS (MUltifrontal Massively Parallel sparse direct Solver) — A package for solving systems of linear equation based on a multifrontal approach.
  • NFFT (Nonequispaced Fast Fourier Transform) — A C subroutine library for computing the Nonequispaced Discrete Fourier Transform (NDFT) and its generalisations in one or more dimensions, of arbitrary input size, and of complex data.
  • Octave – A high-level language, primarily intended for numerical computations.
  • OpenFOAM — An object-oriented Computational Fluid Dynamics(CFD) toolkit
  • ParaView — An interactive data analysis and visualisation tool with 3D rendering capability
  • ParMETIS – An MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, and meshes, and for computing fill-reducing orderings of sparse matrices.
  • PETSc – A Portable, Extensible Toolkit for Scientific Computation: widely used parallel numerical software library for partial differential equations and sparse matrix computations.
  • R – A language and environment for statistical computing and graphics that provides various statistical and graphical techniques: linear and nonlinear modeling, statistical tests, time series analysis, classification, clustering, etc.
  • ScaLAPACK — A library of high-performance linear algebra routines for parallel distributed memory machines. It solves dense and banded linear systems, least squares problems, eigenvalue problems, and singular value problems.
  • Scotch — A software package and libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning.

Support

Contact our Support addresses.

Last modified: 2025-09-11 09:51:57

Quantum Computing

Description

Quantum computing is a cutting-edge field that leverages the principles of quantum mechanics to process information in fundamentally new ways. Unlike classical computers, which use bits (0s and 1s) for computation, quantum computers use qubits, which can exist in a superposition of states. This enables quantum computers to perform certain types of calculations exponentially faster than classical machines. Quantum computing has the potential to revolutionize multiple industries by solving problems that are difficult or impossible for classical computers. Currently, we do not have our own quantum computer and only provide access to the quantum simulators via containers.

Access

You can request access to this service via SCC or NHR resources as explained here.

Applications

Current available simulators

Support

Contact us using nhr-support@gwdg.de.

More

Last modified: 2025-06-06 08:25:31

How to use...

In this section, users can find basic information and ‘How-to’ guides for the GWDG HPC systems.

We recommend starting with Slurm, the batch system that gives you access to the HPC resources in various compute partitions. If you are building your own software from source code, you will find the description of pre-installed compilers and examples useful. Usually, the software itself is installed via Spack already and made available in the form of modules. To get the best performance, an informed choice about the most suitable storage system is essential. You will also find instructions on how to handle data transfers and use terminal multiplexers for long-running interactive sessions. Last but not least, the section about SSH explains how to configure your own client to access the HPC systems remotely.

Services

Some of our services and software packages have dedicated how-to pages:

Subsections of How to use...

Slurm

This page contains all important information about the batch system Slurm, which you will need in order to run software. It does not explain every feature Slurm has to offer. For that, please consult the official documentation and the man pages.

Submission of jobs mainly happens via the sbatch command using a jobscript, but interactive jobs and node allocations are also possible using srun or salloc. Resource selection (e.g. number of nodes or cores) is handled via command parameters, or may be specified in the job script.

Partitions

To match your job requirements to the hardware you can choose among various partitions. Each partition has its own job queue. All available partitions and their corresponding walltime, core number, memory, CPU/GPU types are listed in Compute node partitions.

Parameters

| Parameter | SBATCH flag | Comment |
|---|---|---|
| # nodes | -N <minNodes[,maxNodes]> | Minimum and maximum number of nodes the job should be executed on. If only one number is specified, it is used as the precise node count. |
| # tasks | -n <tasks> | The number of tasks for this job. The default is one task per node. |
| # tasks per node | --ntasks-per-node=<ntasks> | Number of tasks per node. If -n and --ntasks-per-node are specified, this option specifies the maximum number of tasks per node. Different defaults between mpirun and srun. |
| partition | -p <name> | Specifies in which partition the job should run. Multiple partitions can be specified in a comma-separated list. Example: standard96; check Compute Partitions. |
| # CPUs per task | -c <cpus per task> | The number of CPUs per task. The default is one CPU per task. |
| Wall time limit | -t hh:mm:ss | Maximum runtime of the job. If this time is exceeded, the job is killed. Acceptable formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds" (example: 1-12:00:00 will request 1 day and 12 hours). |
| Memory per node | --mem=<size[units]> | Required memory per node. The unit can be one of K, M, G or T; the default is M. If a process exceeds the limit, it will be killed. |
| Memory per CPU | --mem-per-cpu=<size[units]> | Required memory per task instead of per node. --mem and --mem-per-cpu are mutually exclusive. |
| Memory per GPU | --mem-per-gpu=<size[units]> | Required memory per GPU instead of per node. --mem and --mem-per-gpu are mutually exclusive. |
| Mail | --mail-type=ALL | See the sbatch manpage for different types. |
| Project/Account | -A <project> | Specify the project for NPL accounting. This option is mandatory for users who have access to special hardware and want to use the general partitions. |
| Output File | -o <file> | Store the job output in file (otherwise written to slurm-<jobid>). %J in the filename stands for the jobid. |
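As an illustration (the project name, job script and output file name are placeholders), a job requesting two nodes in the medium partition for twelve hours could be submitted like this:

# hypothetical submission: 2 nodes, 12 h walltime, e-mail notifications, output to myjob-<jobid>.out
sbatch -p medium -N 2 -t 12:00:00 -A myproject --mail-type=ALL -o myjob-%J.out jobscript.sh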

Job Scripts

A job script can be any script that contains special instructions for Slurm at the top. The most commonly used form is a shell script, such as bash or plain sh, but other scripting languages (e.g. Python, Perl, R) are also possible.

#!/bin/bash
 
#SBATCH -p medium
#SBATCH -N 16
#SBATCH -t 06:00:00
 
module load openmpi
srun mybinary

Job scripts have to start with a shebang line (i.e. #!/bin/bash), followed by the #SBATCH options. These #SBATCH comments have to be at the top, as Slurm stops scanning for them after the first non-comment, non-whitespace line (e.g. an echo, a variable declaration, or the module load in this example).

Important Slurm Commands

The commands normally used for job control and management are

  • Job submission:
    sbatch <jobscript>
    srun <arguments> <command>

  • Job status of a specific job:
    squeue -j <jobID> for queues/running jobs
    scontrol show job <jobID> for full job information (even after the job finished).

  • Job cancellation:
    scancel <jobID>
    scancel -i --me cancel all your jobs (--me) but ask for every job (-i)
    scancel -9 sends SIGKILL instead of SIGTERM

  • Job overview:
    squeue --me to show all your jobs. Some useful options are: -u <user>, -p <partition>, -j <jobID>. For example, squeue -p standard96 will show all jobs currently running or queued in the standard96 partition.

  • Estimated job start time:
    squeue --start -j <jobID>

  • Workload overview of the whole system: sinfo (esp. sinfo -p <partition> --format="%25C %A") or squeue -l

Job Walltime

It is recommended to always specify a walltime limit for your jobs using the -t or --time parameters. If you don’t set a limit, a default value that is different for each partition is chosen. You can display the default and maximum time limit for a given partition by running:

$ scontrol show partition standard96
[...]
   DefaultTime=2-00:00:00 DisableRootJobs=NO ExclusiveUser=NO ExclusiveTopo=NO GraceTime=300 Hidden=NO
   MaxNodes=256 MaxTime=2-00:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED MaxCPUsPerSocket=UNLIMITED

In this example, the default as well as maximum time is 2 days. As a rule of thumb, the shorter your job’s requested runtime, the easier it is to schedule and the less waiting time you will have in the queue (if the partition does not have idle nodes). As it can be difficult or impossible to predict how long a given workload will actually take to compute, you should always add a bit of a buffer so your job does not end up being killed prematurely, but it is beneficial to request a walltime close to the actual time your jobs will take to complete. For more information and how to increase priority for short jobs or get access to longer walltimes, see Job runtimes and QoS.

Using the Shared Nodes

We provide various partitions in shared mode, so that multiple smaller jobs can run on a single node at the same time. You can request a number of CPUs, GPUs and memory and should take care that you don’t block other users by reserving too much of one resource. For example, when you need all or most of the memory one node offers, but just a few CPU cores, the other cores become effectively unusable for other people. In those cases, please either use an exclusive (non-shared) partition, or request all resources a single node offers (and of course if possible, try to utilize all of them).

The maximum walltime on the shared partitions is 2 days.

This is an example for a job script using 10 cores. As this is not an MPI job, srun/mpirun is not needed.

#!/bin/bash
#SBATCH -p large96:shared
#SBATCH -t 1-0 #one day
#SBATCH -n 10
#SBATCH -N 1

module load python
python postprocessing.py

This job’s memory usage should not exceed 10 * 4096 MiB = 40 GiB. 4096 is the default memory per CPU for the large96:shared partition; you can see this value (which differs between partitions) by running:

$ scontrol show partition large96:shared
[...]
   DefMemPerCPU=4096 MaxMemPerNode=UNLIMITED

Advanced Options

Slurm offers a lot of options for job allocation, process placement, job dependencies and arrays and much more. We cannot exhaustively cover all topics here. As mentioned at the top of the page, please consult the official documentation and the man pages for an in depth description of all parameters.

Job Arrays

Job arrays are the preferred way to submit many similar jobs, for instance, if you need to run the same program on several input files, or run it repeatedly with different settings or parameters. The behavior of your applications inside these jobs can be tied to Slurm environment variables, e.g. to tell the program which part of the array they should process. More information can be found here.

Internet Access within Jobs

It’s not recommended to use an internet connection on the compute nodes, but it is possible if required. Access can be enabled by specifying -C inet or --constraint=inet in your Slurm command line or in a batch script.

Interactive job:

srun --pty -p standard96s:test -N 1 -c 1 -C inet /bin/bash
curl www.gwdg.de

Batch script:

#!/bin/bash
#SBATCH -p standard96s:test
#SBATCH -N 1
#SBATCH -c 1
#SBATCH --constraint=inet

curl www.gwdg.de

Subsections of Slurm

Multiple programs and multiple data

Using multiple programs on different data within a single job takes a bit of setup, as you need to tell the MPI starter exactly what to run and where to run it.

Jobscript

Example script hello.slurm for a code with two binaries

  • one OpenMP binary hello_omp.bin running on 1 node, 2 MPI tasks per node and 4 OpenMP threads per task,
  • one MPI binary hello_mpi.bin running on 2 nodes, 4 MPI tasks per node.
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --nodes=3
#SBATCH --partition=medium:test
 
module load impi
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=4
 
scontrol show hostnames $SLURM_JOB_NODELIST | awk '{if(NR==1) {print $0":2"} else {print $0":4"}}' > machines.txt
mpirun -machine machines.txt -n 2 ./hello_omp.bin : -n 8 ./hello_mpi.bin
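For the three-node allocation above, the machines.txt generated by the awk line would look roughly like the following (the hostnames are only illustrative): the first node carries the 2 tasks of the OpenMP binary and the remaining nodes carry 4 MPI tasks each.

gcn1001:2
gcn1002:4
gcn1003:4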

GPU Usage

Step-by-Step guide

If you would like an overview guide on this topic, we have a YouTube playlist to set up and run an example deep learning workflow. You can follow along with our step-by-step guide. Note: the content of this guide is outdated.

Partitions and Hardware

There are multiple partitions with GPU nodes available. You will only have access to some of them, depending on what type of user account you have. An overview of all nodes with GPUs can be displayed by running the following command from a frontend node:

sinfo -o "%25N  %5c  %10m  %32f  %10G %18P " | grep gpu

For more details, see GPU Partitions.

Which Frontend to Use

Of the different login nodes available, two (glogin9 and glogin10; it is recommended to use the DNS alias glogin-gpu.hpc.gwdg.de) are dedicated to GPU workloads. These are the closest match to the hardware of our GPU nodes, so it is strongly recommended to use them when writing and submitting GPU jobs and especially when compiling software. While it is technically possible to do so from the other login nodes as well, it is not recommended and may cause problems. For example, a lot of GPU-specific software modules are not even available on other login (or compute) nodes.

Getting Access

Nodes can be accessed using the respective partitions. On the shared partitions (see GPU Partitions), you have to specify how many GPUs you need with the -G x option, where x is the number of GPUs you want to access. If you do not use MPI, please use the -c #cores parameter to select the needed CPUs. Note that on non-shared partitions, your jobs will use nodes exclusively, as opposed to sharing them with other jobs. Even if you request fewer than e.g. 4 GPUs, you will still be billed for all GPUs in your reserved nodes! Note that on the partitions where the GPUs are split into slices via MIG, -G x requests x slices. To explicitly request an 80 GB GPU, please add the option --constraint=80gb to your job script.

Example

The following command gives you access to 32 cores and two A100 GPUs:

srun -p grete:shared --pty -n 1 -c 32 -G A100:2 bash

If you want to run multiple concurrent programs, each using one GPU, here is an example:

#!/bin/bash
#SBATCH -p grete
#SBATCH -t 12:00:00
#SBATCH -N 1

srun --exact -n1 -c 16 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 16 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 16 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 16 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
wait

More explanation for the above example can be found here.

Requesting GPUs

(Figure: diagram of a Grete GPU node)

Each Grete node consists of 4 A100 GPUs with either 40 GiB or 80 GiB of GPU memory (VRAM), or 8 A100 GPUs with 80 GiB of VRAM (only in grete:shared). In these partitions, you can only request whole GPUs.

  • On non-shared partitions (e.g. grete, grete-h100, …) you automatically block the whole node, getting all 4 GPUs on each node. This is suited best for very large jobs. There are no nodes with 8 GPUs in this partition.
  • Other GPU partitions are shared, you can choose how many GPUs you need. Requesting less than four (or eight) GPUs means more than one job can run on a node simultaneously (particularly useful for job arrays).

It is possible to request more or less system memory than the default. If you wanted 20 GiB of RAM, you could use the additional Slurm argument --mem=20G. We recommend not explicitly requesting memory or CPU cores at all; in most cases Slurm will assign an appropriate amount, proportional to the number of GPUs you reserved.

Warning

Make sure to be fair and to not use more than your proportional share of system memory or CPU cores!

For example, if you only need 1 GPU, you should only use up to a fourth of the usable memory (see below for details). See the tables on this page for the hardware specifications of nodes in each partition. If you request too much, there are not enough resources left to schedule other jobs, even though there are unused GPUs in the node (unless someone goes out of their way to explicitly request less than the default share of CPU cores or memory).

It is important to note that you can’t just divide the value in the RAM per node column by 4! A few GiB of memory are reserved by the hardware for bookkeeping, error correction and such, so of 512 GiB, the operating system (OS) usually only sees around 503 GiB, for example. The OS itself needs some memory to function correctly, so Slurm usually reserves around 20 GiB for that. This means that the maximum amount of memory actually usable for jobs is only ~480 GiB for a 512 GiB RAM node.

The example script below requests two A100 GPUs:

#!/bin/bash

#SBATCH --job-name=train-nn-gpu
#SBATCH -t 05:00:00                  # Estimated time, adapt to your needs
#SBATCH --mail-type=all              # Send mail when job begins and ends

#SBATCH -p grete:shared              # The partition
#SBATCH -G A100:2                    # Request 2 GPUs

module load miniforge3
module load gcc
module load cuda
source activate dl-gpu               # Replace with the name of your miniforge/conda environment

# Print out some info.
echo "Submitting job with sbatch from directory: ${SLURM_SUBMIT_DIR}"
echo "Home directory: ${HOME}"
echo "Working directory: $PWD"
echo "Current node: ${SLURM_NODELIST}"

# For debugging purposes.
python --version
python -m torch.utils.collect_env
nvcc -V

# Run the script:
python -u train.py

Interactive Usage and GPU Slices on grete:interactive

Whole A100 GPUs are powerful, but you might not need all the power. For instance, debugging a script until it runs might require a GPU, but not the full power of an A100. To this end, the grete:interactive partition is provided. The idea is that you can come into the office, log into the cluster, and use this for development and testing. When you are confident your script works, you should submit larger workloads as a “hands-off” batch job, making use of one or more full GPUs (multiples of 4 for grete).

The interactive partition contains A100 GPUs that are split into slices with the NVIDIA Multi-Instance-GPU (MIG) technology. Each GPU slice consists of a computation part (a subset of the streaming multiprocessors) and some memory.

(Figure: diagram of a Grete node with MIG slices)

An A100, for example, can be split into 7 compute units comprising 14 streaming multiprocessors (SMs) each. When using MIG, 10 of the 108 SMs are used for management and are not available for compute. For example, we currently split the 80 GiB A100s into six 1g.10gb slices and one 1g.20gb slice, which means that each slice has one compute unit (“1g”) and 10 GiB or 20 GiB of GPU memory (VRAM). For an interactive node that has 4 such A100 GPUs, this means there are in total 28 slices per node.

The configuration of MIG slices might be subject to change, depending on the load of the cluster and requirements reported to us by our users. Use scontrol show node <nodename> to see the type and number of slices a node currently offers. MIG slices are configured by an administrator and cannot be changed by job scripts; you have to pick one of the sizes on offer.

Instead of requesting a whole GPU with e.g. -G A100:n, as you would in the other GPU partitions, you request MIG slices in the same format Nvidia uses, i.e. -G 1g.10gb:n. The :n for the count of slices at the end is optional, if you don’t specify it, you will get one such slice. At the moment, that is also the maximum allowed for grete:interactive. The following interactive Slurm example requests a 1g.10gb slice:

srun --pty -p grete:interactive -G 1g.10gb /bin/bash

Monitoring

Once you submitted a job, you can check its status and where it runs with

squeue --me

In Your Script

Many packages that use GPUs provide a way to check the resources that are available. For example, if you do deep learning using PyTorch, you can always check that the correct number of resources are available with torch.cuda.is_available().

Note that in PyTorch, the command torch.cuda.get_device_name() does not work with Nvidia MIG-split GPUs and will give you an error.
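As a minimal sketch (assuming a Python environment with PyTorch is already loaded), you can print what your job actually sees directly from the job script or an interactive session:

# list the GPUs or MIG slices allocated to this job
nvidia-smi
# check what PyTorch can see (device name lookup may fail on MIG slices, see above)
python -c 'import torch; print(torch.cuda.is_available(), torch.cuda.device_count())'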

Using nvitop

To monitor your resource usage on the node itself, you can use nvitop.

  1. Check where your job runs with squeue --me
  2. Log into the node. For instance, if your job runs on ggpu146, use ssh ggpu146
  3. On the node, run
    module load py-nvitop
    nvitop

(Figure: example nvitop output)

In this example output, you can see your

  • GPU compute usage/utilization (top) as UTL
  • GPU memory usage (top) as MEM
  • your CPU usage (bottom) with your abbreviated user name.

Software and Libraries

Cuda Libraries

To load CUDA, do

module load gcc/VERSION cuda/VERSION

If you don’t specify the VERSION, the defaults (12.6.2 for cuda) will be used.

Note

Due to the hierarchical nature of the module system, you can only load the cuda module after the corresponding version of the compiler, in this case gcc, has been loaded. Use module spider cuda to see all available versions, followed by module spider cuda/VERSION to see which gcc version has to be loaded before a given cuda version.
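A typical sequence therefore looks like the following sketch; the gcc version is intentionally left as a placeholder and should be taken from the module spider output on the system:

module spider cuda                    # list all available cuda versions
module spider cuda/12.6.2             # shows which gcc module has to be loaded first
module load gcc/VERSION cuda/12.6.2   # replace VERSION with the gcc version reported above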

Nvidia HPC SDK (Nvidia compiler)

The full Nvidia HPC SDK 23.3 is installed and usable via the modules nvhpc/23.3, nvhpc-byo-compiler/23.3, nvhpc-hpcx/23.3 and nvhpc-nompi/23.3.

This SDK includes the Nvidia compiler and the HPC-X OpenMPI and several base libraries for CUDA based GPU acceleration.

Using MPI and other communication libraries

We have several CUDA enabled OpenMPI versions available. We have the HPC-X OpenMPI in the module nvhpc-hpcx/23.3, the regular OpenMPI 3.1.5 from the Nvidia HPC SDK in nvhpc/23.3 and the OpenMPI included in the Nvidia/Mellanox OFED stack, provided by the module openmpi-mofed/4.1.5a1.

Additionally, the libraries NCCL 12.0, NVSHMEM 12.0 and OpenMPI 4.0.5 are available as part of the Nvidia HPC SDK in /sw/compiler/nvidia/hpc_sdk/Linux_x86_64/23.3/comm_libs/.

Singularity/Apptainer Containers

Apptainer (see Apptainer (formerly Singularity)) supports “injecting” NVIDIA drivers plus the essential libraries and executables into running containers. When you run SIF containers, pass the --nv option. An example running the container FOO.sif would be:

#!/bin/bash
#SBATCH -p grete:shared              # the partition
#SBATCH -G A100:1                    # For requesting 1 GPU.
#SBATCH -c 4                         # Requesting 4 CPU cores.

module load apptainer/VERSION
module load gcc/GCCVERSION
module load cuda/CUDAVERSION

apptainer exec --nv --bind /scratch FOO.sif

where VERSION is the desired version of Apptainer, GCCVERSION the necessary prerequisite gcc version for your chosen version of cuda and CUDAVERSION is the specific version of CUDA desired.

For more information, see the Apptainer GPU documentation.

Using Conda/Mamba/Miniforge

Conda (replaced on our clusters by miniforge3) and Mamba are package managers that can make installing and working with various GPU-related packages a lot easier. See Python for information on how to set them up.
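As a rough sketch (environment name, Python version and packages are placeholders; see the Python page for the authoritative instructions), setting up and activating an environment could look like:

module load miniforge3
conda create -n dl-gpu python=3.11    # hypothetical environment name and version
source activate dl-gpu
pip install torch                     # install whatever GPU packages you need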

Training Resources (Step-by-Step)

We have regular courses on GPU usage and deep learning workflows. Our materials are online for self-studying.

Interactive Jobs

What is an interactive job? Why use interactive jobs?

An interactive job requests resources from a partition and immediately opens a session on the assigned nodes so you can work interactively. This is usually done on specially designated interactive or test partitions, which have low to no wait times (but also usually low maximum resource allocations and short maximum sessions), so your session can start immediately. This can also be attempted on normal partitions, but of course you would have to be present at the terminal when your job actually starts.

There are multiple use cases for interactive jobs:

  • Performing trial runs of a program that should not be done on a log-in node. Remember login nodes are in principle just for logging in!
  • Testing a new setup, for example a new Conda configuration or a Snakemake workflow, in a realistic node environment. This prevents you from wasting time in a proper partition with waiting times, just for your job to fail due to the wrong packages being loaded.
  • Testing a new submission script or SLURM configuration.
  • Running heavy installation or compilation jobs or Apptainer container builds.
  • Running small jobs that don’t have large resource requirements, thus reducing waiting time.
  • Doing quick benchmarks to determine the best partition to run further computations in (e.g. is the code just as fast on Emmy Phase 2 nodes as on Emmy Phase 3 nodes?)
  • Testing resource allocation, which can sometimes be tricky particularly for GPUs. Start up a GPU interactive job, and test if your system can see the number and type of GPUs you expected with nvidia-smi. Same for other resources such as CPU count and RAM memory allocation. Do remember interactive partitions usually have low resource maximums or use older hardware, so this testing is not perfect!
  • Running the rare interactive-only tools and programs.
Tip

The Jupyter-HPC service is also provided for full graphical interactive JupyterHub, RStudio, IDE, and Desktop sessions.

How to start an interactive job

To start a (proper) interactive job:

srun -p jupyter --pty -n 1 -c 16 bash

This will block your terminal while the job starts, which should be within a few minutes. If for some reason this is taking too long or it returns a message that the request is denied (due to using the wrong partition or exceeding resource allocations for example), you can break the request with Ctrl-c.

In the above command:

  • -p starts the job in the designated partition
  • --pty runs in pseudo-terminal mode (critical for interactive shells)
  • then follow the usual Slurm resource allocation options
  • finally, bash is the command to run, which starts an interactive shell session
Tip

Don’t forget to specify the time limit with -t LIMIT if the job will be short so that it is more likely to start earlier, where LIMIT can be in the form of MINUTES, HOURS:MINUTES:SECONDS, DAYS-HOURS, etc. (see the srun man page for all available formats). This is especially important on partitions not specialized for interactive and test jobs. If your time limit is less than or equal to 2 hours, you can also add --qos=2h to use the 2 hour QOS to further reduce the likely wait time.
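For example, a short interactive test run could be requested like this (partition, resources and time limit are only illustrative):

srun -p standard96:test --qos=2h -t 00:30:00 --pty -n 1 -c 16 bash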

Info

If you want to run a GUI application in the interactive job, you will almost certainly need X11 forwarding. To do that, you must have SSH-ed into the login node with X11 forwarding and add the --x11 option to the srun command for starting the interactive job. This then forwards X11 from the interactive job all the way to your machine via the login node.

Though, in some cases, it might make more sense to use the Jupyter-HPC service instead.
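A minimal sketch of such an X11-forwarded session (the GUI program name is a placeholder):

# on your own machine: log in with X11 forwarding enabled
ssh -X <username>@glogin-gpu.hpc.gwdg.de
# on the login node: forward X11 into the interactive job
srun -p jupyter --x11 --pty -n 1 -c 4 bash
# inside the job: start the GUI application
./my-gui-program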

You should see something like the following after your command, notice how the command line prompt (if you haven’t played around with this) changes after the job starts up and logs you into the node:

u12345@glogin5 ~ $ srun -p standard96:test --pty -n 1 -c 16 bash
srun: job 6892631 queued and waiting for resources
srun: job 6892631 has been allocated resources
u12345@gcn2020 ~ $ 

To stop an interactive session and return to the login node:

exit

Which partitions are interactive?

Any partition can be used interactively if it is empty enough, but some are specialized for it with shorter wait times and thus better suited for interactive jobs. These interactive partitions can change as new partitions are added or retired. Check the list of partitions for the most current information. Partitions whose names match the following are specialized for shorter wait times:

  • *:interactive
  • *:test which have shorter maximum job times
  • jupyter* which are shared with the Jupyter-HPC service and are overprovisioned (e.g. your job may share cores with other jobs)

You can also just look for other partitions with nodes in the idle state (or mixed nodes if doing a job that doesn’t require a full node on a shared partition) with sinfo -p PARTITION. For example, if we check the scc-gpu partition:

[scc_agc_test_accounts] u12283@glogin6 ~ $ sinfo -p scc-gpu
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
scc-gpu      up 2-00:00:00      1  inval ggpu194
scc-gpu      up 2-00:00:00      3   mix- ggpu[135,138,237]
scc-gpu      up 2-00:00:00      1   plnd ggpu145
scc-gpu      up 2-00:00:00      1  down* ggpu150
scc-gpu      up 2-00:00:00      1   comp ggpu199
scc-gpu      up 2-00:00:00      1  drain ggpu152
scc-gpu      up 2-00:00:00      1   resv ggpu140
scc-gpu      up 2-00:00:00     11    mix ggpu[139,141,147-149,153-155,195-196,212]
scc-gpu      up 2-00:00:00      6  alloc ggpu[136,142-144,146,211]
scc-gpu      up 2-00:00:00      4   idle ggpu[151,156,197-198]

we can see that there are 4 idle nodes and 11 mixed nodes. This means that an interactive job using a single node should start rather quickly, particularly if it only requires part of a node since then one of the mixed nodes might be able to run it too.

Pseudo-interactive jobs

If you have a job currently running on a given node, you can actually SSH into that node. This can be useful in some cases to debug and check on your program and workflows. For example, you can check on the live GPU load with nvidia-smi or monitor the CPU processes and the host memory allocation with btop. Some of these checks are easier and more informative when performed live rather than using after-job reports such as the job output files or sacct.

u12345@glogin5 ~ $ squeue --me
  JOBID    PARTITION         NAME     USER  ACCOUNT     STATE       TIME  NODES NODELIST(REASON)
6892631   standard96         bash   u12345  myaccount   RUNNING     11:33     1 gcn2020
u12345@glogin5 ~ $ ssh gcn2020
u12345@gcn2020 ~ $

If you try this on a node where you don’t currently have a job running, it will fail, since its resources have not been allocated to your user!

GPU Interactive Jobs

See: GPU Usage.

Interactive Jobs with Internet Access

See: Internet access within jobs.

Job arrays

Job arrays are the preferred way to submit many similar jobs, for instance, if you need to run the same program on several input files, or run it repeatedly with different settings or parameters. This type of parallelism is usually called “embarrassingly parallel” or trivially parallel jobs.

Arrays are created with the -a start-finish sbatch parameter. E.g. sbatch -a 0-19 will create 20 jobs indexed from 0 to 19. There are different ways to index the arrays, which are described below.

Job Array Indexing, Stepsize and more

Slurm supports a number of ways to set up the indexing in job arrays.

  • Range: -a 0-5
  • Multiple values: -a 1,5,12
  • Step size: -a 0-5:2 (same as -a 0,2,4)
  • Combined: -a 0-5:2,20 (same as -a 0,2,4,20)

Additionally, you can limit the number of simultaneously running jobs by appending %<limit> to the array specification:

  • -a 0-11%4 only four jobs at once
  • -a 0-11%1 run all jobs sequentially
  • -a 0-5:2,20%2 everything combined. Run IDs 0,2,4,20, but only two at a time.

You can read everything on array indexing in the sbatch man page.

Slurm Array Environment Variables

The behavior of your applications inside array jobs can be tied to Slurm environment variables, e.g. to tell the program which part of the array they should process. These variables will have different values for each job in the array. Probably the most commonly used of these environment variables is $SLURM_ARRAY_TASK_ID, which holds the index of the current job in the array. Other useful variables are:

SLURM_ARRAY_TASK_COUNT: Total number of tasks in a job array.
SLURM_ARRAY_TASK_MAX: Job array’s maximum ID (index) number.
SLURM_ARRAY_TASK_MIN: Job array’s minimum ID (index) number.
SLURM_ARRAY_TASK_STEP: Job array’s index step size.
SLURM_ARRAY_JOB_ID: Job array’s master job ID number.
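A tiny sketch that only prints these variables can help to see how an array expands (the script name printvars.sh is a placeholder; submit it e.g. with sbatch -a 0-5:2 printvars.sh):

#!/bin/bash
#SBATCH -p medium
#SBATCH -t 05:00

echo "Array job $SLURM_ARRAY_JOB_ID, task $SLURM_ARRAY_TASK_ID of $SLURM_ARRAY_TASK_COUNT (min $SLURM_ARRAY_TASK_MIN, max $SLURM_ARRAY_TASK_MAX, step $SLURM_ARRAY_TASK_STEP)"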

Example job array

The simplest example of using a job array is running a loop in parallel.

Serial loop (single job):

#!/bin/bash
#SBATCH -p medium
#SBATCH -t 10:00
#SBATCH -n 1
#SBATCH -c 4

module load python
for i in {1..100}; do
  python myprogram.py $i
done

Job array running the loop in parallel:

#!/bin/bash
#SBATCH -p medium
#SBATCH -t 10:00
#SBATCH -n 1
#SBATCH -c 4
#SBATCH -a 1-100

module load python
python myprogram.py $SLURM_ARRAY_TASK_ID

The loop in the first example runs on a single node and in serial. More efficiently, the job array in the second example unrolls the loop and, if enough resources are available, runs all of the 100 jobs in parallel.

Example job array running over files

This is an example of a job array that creates a job for every file ending in .inp in the current working directory:

#!/bin/bash
#SBATCH -p medium
#SBATCH -t 01:00
#SBATCH -a 0-X
# insert X as the number of .inp files you have -1 (since bash arrays start counting from 0)
# ls *.inp | wc -l
 
#for safety reasons
shopt -s nullglob
#create a bash array with all files
arr=(./*.inp)
 
#put your command here. This just runs the fictional "big_computation" program with one of the files as input
./big_computation ${arr[$SLURM_ARRAY_TASK_ID]}

In this case, you have to get the number of files beforehand (fill in the X). You can also do that automatically by removing the #SBATCH -a line and adding that information on the command line when submitting the job:

sbatch -a 0-$(($(ls ./*.inp | wc -l)-1)) jobarray.sh

The part in parentheses just uses ls to output all .inp files, counts them with wc and subtracts 1, since bash arrays start counting at 0.

Job runtimes and QoS

Very short jobs

In case you have jobs that are shorter than 2 hours, there is the option to set the QoS 2h, which will drastically increase your priority. This option can be set via --qos=2h on the command line or in a batch script via #SBATCH --qos=2h.

Please note that you also need to specify a time limit of at most 2 hours, e.g. --time=2:00:00.
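Put together, the header of such a short job could look like this sketch (partition and resources are illustrative):

#!/bin/bash
#SBATCH -p standard96
#SBATCH --qos=2h
#SBATCH --time=2:00:00
#SBATCH -N 1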

Job Walltime

The maximum runtime is set per partition and can be seen either on the system with sinfo or here. There is no minimum walltime (we cannot stop your jobs from finishing, obviously), but a walltime of at least 1 hour is strongly recommended. Our system is optimized for high performance, not high throughput. A large number of small, short jobs induces a lot of overhead, as each job has a prolog (setting up the environment) and an epilog (cleaning up and bookkeeping), which can put a lot of load on the scheduler. The occasional short job is fine, but if you submit larger amounts of jobs that finish (or crash) quickly, we might have to intervene and temporarily suspend your account. If you have lots of smaller workloads, please consider combining them into a single job that runs for at least 1 hour. A tool often recommended to help with issues like this (among other useful features) is Jobber.

Jobs that need more than 12 hours of runtime

The default runtime limit on most partitions is 12 hours. On systems where your compute time is limited (you can check with the sbalance command), we will only refund jobs that run up to 12 hours; beyond that you are running the application at your own risk. However, you can use the --time option in your sbatch script to request up to 48 hours of runtime.

Jobs that need more than 48 hours of runtime

Most compute partitions have a maximum wall time of 48 hours. Under exceptional circumstances, it is possible to get a time extension for individual jobs (past 48 hours) by writing a ticket during normal business hours. To apply for an extension, please write a support request, containing the Job ID, the username, the project and the reason why the extension is necessary. Alternatively - under even more exceptional circumstances and also via mail request, including username, project ID and reason - permanent access to Slurm Quality-Of-Service (QoS) levels can be granted, which permit a longer runtime for jobs (up to 96 hours) but have additional restrictions regarding job size (e.g. number of nodes).

Please make sure you have checked out all of the following suggestions before you request access to the 96h qos:

  1. Can you increase the parallelization of your application to reduce the runtime?
  2. Can you use checkpointing to periodically write out the intermediate state of your calculation and restart from that in a subsequent job?
  3. Are you running the code on the fastest available nodes?
  4. Is it possible to use GPU acceleration to speed up your calculation?
  5. Are you using the latest software revision (in contrast to e.g. using hlrn-tmod or rev/xx.yy)?
Info

We recommend permanent access to the long-running QoS only as a last resort. We do not guarantee to refund your NPL on the long-running QoS if something fails. Before requesting it, you should exploit all possibilities to parallelize/speed up your code or make it restartable (see below).

Dependent & Restartable Jobs - How to pass the wall time limit

If your simulation is restartable, it might be handy to automatically trigger a follow-up job. Simply provide the ID of the previous job as an additional sbatch argument:

# submit a first job, extract job id
jobid=$(sbatch --parsable job1.sbatch)
 
# submit a second job with dependency: starts only if the previous job terminates successfully
jobid=$(sbatch --parsable --dependency=afterok:$jobid job2.sbatch)
 
# submit a third job with dependency: starts only if the previous job terminates successfully
jobid=$(sbatch --parsable --dependency=afterok:$jobid job3.sbatch)
Note

As soon as a follow-up jobscript (sbatch file) is submitted, you cannot change its content any more. Lines starting with #SBATCH will be evaluated immediately. The remaining content of the jobscript is evaluated as soon as the dependency condition is fulfilled and compute nodes are available. Besides afterok there exist other dependency types (sbatch man page).
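If you need a longer chain of restarts, the same pattern can be wrapped in a loop. This is only a sketch, assuming job.sbatch is your restartable job script and each run continues from the checkpoint written by its predecessor:

# submit an initial job and chain 4 follow-up jobs behind it
jobid=$(sbatch --parsable job.sbatch)
for i in {1..4}; do
    jobid=$(sbatch --parsable --dependency=afterok:$jobid job.sbatch)
done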

Multiple concurrent programs on a single node

Using srun to create multiple jobs steps

You can use srun to start multiple job steps concurrently on a single node, e.g. if your job is not big enough to fill a whole node. There are a few details to follow:

  • By default, the srun command gets exclusive access to all resources of the job allocation and uses all tasks.
    • Therefore, you need to limit srun to only use part of the allocation.
    • This includes implicitly granted resources, i.e. memory and GPUs.
    • The --exact flag is needed.
    • If running non-MPI programs, use the -c option to denote the number of cores each process should have access to.
  • srun waits for the program to finish, so you need to start concurrent processes in the background.
  • Good default memory-per-CPU values (without hyperthreading) usually are:

| Partition | standard96 | large96 | huge96 | medium40 | large40/gpu |
|---|---|---|---|---|---|
| --mem-per-cpu | 3770M | 7781M | 15854M | 4525M | 19075M |

Examples

#!/bin/bash
#SBATCH -p standard96
#SBATCH -t 06:00:00
#SBATCH -N 1
 
srun --exact -n1 -c 10 --mem-per-cpu 3770M  ./program1 &
srun --exact -n1 -c 80 --mem-per-cpu 3770M  ./program2 &
srun --exact -n1 -c 6 --mem-per-cpu 3770M  ./program3 &
wait
#!/bin/bash
#SBATCH -p gpu
#SBATCH -t 12:00:00
#SBATCH -N 1
 
srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
wait

Using the Linux parallel command to run a large number of tasks

If you have to run many nearly identical but small tasks (single-core, little memory) you can try to use the Linux parallel command. To use this approach you first need to write a bash-shell script, e.g. task.sh, which executes a single task. As an example we will use the following script:

#!/bin/bash
 
# parallel task
TASK_ID=$1
PARAMETER=$((10+RANDOM%10))    # determine some parameter unique for this task
                               # often this will depend on the TASK_ID
 
echo -n "Task $TASK_ID: sleeping for $PARAMETER seconds ... "
sleep $PARAMETER
echo "done"

This script simply defines a variable PARAMETER, which is then used as the input for the actual command, which is sleep in this case. The script also takes one input parameter, which can be interpreted as the TASK_ID and could also be used for determining the PARAMETER. If we make the script executable and run it as follows, we get:

$ chmod u+x task.sh
$ ./task.sh 4
Task 4: sleeping for 11 seconds ... done

To now run this task 100 times with different TASK_IDs, we can write the following job script:

#!/bin/bash
 
#SBATCH --partition medium40:test        # adjust partition as needed
#SBATCH --nodes 1                        # more than 1 node can be used
#SBATCH --tasks-per-node 40              # one task per CPU core, adjust for partition
 
# set memory available per core
MEM_PER_CORE=4525    # must be set to value that corresponds with partition
 
# Define srun arguments:
srun="srun -n1 -N1 --exclusive --mem-per-cpu $MEM_PER_CORE"
# --exclusive     ensures srun uses distinct CPUs for each job step
# -N1 -n1         allocates a single core to each task
 
# Define parallel arguments:
parallel="parallel -N 1 --delay .2 -j $SLURM_NTASKS --joblog parallel_job.log"
# -N                number of argument you want to pass to task script
# -j                number of parallel tasks (determined from resources provided by Slurm)
# --delay .2        prevents overloading the controlling node on short jobs
# --resume          add if needed to use joblog to continue an interrupted run 
#                   (job resubmitted)
# --joblog          creates a log-file, required for resuming
 
# Run the tasks in parallel
$parallel "$srun ./task.sh {1}" ::: {1..100}
# task.sh          executable(!) script with the task to complete, may depend on some input 
#                  parameter  
# ::: {a..b}       range of parameters, alternatively $(seq 100) should also work  
# {1}              parameter from range is passed here, multiple parameters can be used 
#                  with additional {i}, e.g. {2} {3} (refer to parallel documentation)  

The script uses parallel to run task.sh 100 times with a parameter taken from the range {1..100}. Because each task is started with srun, a separate job step is created, and the options used with srun ensure that each task uses only a single core. This simple example can be adjusted as needed by modifying the script task.sh and the job script parallel_job.sh. You can adjust the requested resources; for example, you can use more than a single node. Note that depending on the number of tasks, you may have to split your work into several jobs to keep the total time needed short enough. Once the setup is done, you can simply submit the job:

$ sbatch parallel_job.sh

Looping over two arrays

You can use parallel to loop over multiple arrays. The --xapply option controls whether all permutations are used or not:

$ parallel --xapply echo {1} {2} ::: 1 2 3 ::: a b c
1 a
2 b
3 c
$ parallel echo {1} {2} ::: 1 2 3 ::: a b c
1 a
1 b
1 c
2 a
2 b
2 c
3 a
3 b
3 c

Resource monitoring and Reports

When debugging and optimizing your application, it is important to know what is actually happening on the compute node. Most information has to be collected during the runtime of the job, but a few key metrics are still available after a job has finished.

During Runtime

sstat

During a job’s runtime, you can use the sstat command to get live information about its performance, regarding CPU, Tasks, Nodes, Resident Set Size (RSS) and Virtual Memory (VM) for each step.

To use it, call sstat -a <jobid> to list all job steps of a single job. Due to the number of fields displayed, we recommend appending | less -S to the command to enable side-scrolling, or reducing the number of displayed fields with the -o flag. (Important note: if you forget the -a flag, sstat may display an empty dataset when the job did not start explicit job steps.)

You can find all available fields and their meaning, as well as further information about the command in its man page (man sstat) or on its website https://slurm.schedmd.com/sstat.html.
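A hedged example (field names taken from the sstat man page) that shows a compact live view of a running job:

sstat -a -j <jobid> -o JobID,NTasks,AveCPU,MaxRSS,MaxVMSize | less -S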

SSH

While a job is running, you can also use ssh to get onto the node(s) allocated to your job to view its performance directly. Use squeue --me to see which nodes your job is running on. Once logged in to the compute node, you can take a look at the resource usage with standard Linux commands, such as htop, ps or free. Please keep in mind that most commands will show ALL resources of the node, not just those allocated to you.

After the Job finished / Reports

sacct

sacct is a Slurm tool to query the Slurm database for information about jobs. The returned information range from simple status information (current runstate, submission time, nodes, etc.) to gathered performance metrics.

To use it, call sacct -j <jobid>, which will display a small subset of the available job information. (Please keep in mind that database operations are asynchronous, so recently started jobs might not yet be in the database and thus not available in sacct.)

To get more than the basic job information immediately available, you can use the --long flag, which will print many more fields per job. Due to the large number of fields, the output can be hard to read; we recommend appending | less -S to enable side-scrolling. Further fields can be manually selected using the --format flag, where the command sacct --helpformat lists all available fields.

As a form of special field, the flag --batch-script will print the Slurm script of the job if the job was submitted with sbatch, and the flag --env-vars will print the list of environment variables of a job, which is useful for debugging. These two flags cannot be combined with others, except the -j flag.

You can find all available fields and flags, as well as further information about the command in its man page (man sacct) or on its website https://slurm.schedmd.com/sacct.html.
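For example, a compact per-job summary can be requested like this (pick the fields you need from sacct --helpformat):

sacct -j <jobid> --format=JobID,JobName,Partition,State,Elapsed,MaxRSS,TotalCPU | less -S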

reportseff

To get resource efficiency information about your job after it finished, you can use the tool reportseff. This tool queries Slurm to get your allocated resources and compares them to the resources actually used (as reported by Slurm). You can get all the information reportseff uses to create its reports by manually using sacct; it just collates them in a nice fashion.

This can give you a great overview of your usage: Did I make use of all my cores and all my memory? Was my time limit too long?

Usage example:

# Display your recent jobs
gwdu101:121 14:00:00 ~ > module load py-reportseff
gwdu101:121 14:00:05 ~ > reportseff -u $USER
     JobID    State       Elapsed  TimeEff   CPUEff   MemEff 
 
  12671730  COMPLETED    00:00:01   0.0%      ---      0.0%  
  12671731  COMPLETED    00:00:00   0.0%      ---      0.0%  
  12701482  CANCELLED    00:04:20   7.2%     49.6%     0.0%
 
 
# Give specific Job ID:
gwdu102:29 14:07:17 ~ > reportseff 12701482
     JobID    State       Elapsed  TimeEff   CPUEff   MemEff 
  12701482  CANCELLED    00:04:20   7.2%     49.6%     0.0%  

As you can see in the example, the job only took 4:20 minutes out of 1h allocated time, resulting in a TimeEfficiency of 7.2%. Only half the allocated cores (two allocated and only one used) and basically none of the allocated memory were used. For the next similar job, we should reduce the time limit, request one core less and definitely request less memory.

Self-Resubmitting jobs

Do you find yourself in the situation that your jobs need more time than allowed? Do you regularly write tickets to lengthen your job times or wait longer because of using QOS=long? Self-Resubmitting jobs might be a solution for you!

The requirement is that the program you are running produces checkpoints (or can be updated to do so) and is able to restart from any checkpoint after a forced stop. Many programs already have checkpointing options, like Gromacs or OpenFOAM. Turn those on and update your batch script to resubmit itself to continue running.

Note

It is very important in general to have your jobs do checkpointing. No one can promise 100% availability of a node, and every job that stops due to a failed node is wasted energy. We strive to have a very high energy efficiency and urge every user to be able to recover from a failed job without an issue. Even short jobs that fail are a waste of energy if the run needs to be repeated!

Two different types of examples are shown below that run a Python script called big-computation.py. In the first example, the program creates a checkpoint every 10 minutes and can restart from the last checkpoint file. The second example continuously writes to an output file, which can be copied into a checkpoint that can in turn be used to restart the computation.

#!/bin/bash
#SBATCH -p standard96
#SBATCH -t 12:00:00
#SBATCH -N 1
#SBATCH -c 4
#SBATCH -o output_%j.txt
#SBATCH -e error_%j.txt

if [ ! -f "finished" ] ; then
	sbatch --dependency=afterany:$SLURM_JOBID resub_job.sh
else
	exit 0
fi

# big-computation creates automatic checkpoints
# and automatically starts from the most recent 
# checkpoint
srun ./big-computation.py --checkpoint-time=10min --checkpoint-file=checkpoint.dat input.dat

# If big-computation.py is not canceled due to time out
# write the finished file to stop the loop.
touch finished

The second example uses a signal handler to create the checkpoint shortly before the time limit:

#!/bin/bash
#SBATCH -p standard96
#SBATCH -t 12:00:00
#SBATCH -N 1
#SBATCH -c 4
#SBATCH -o output_%j.txt
#SBATCH -e error_%j.txt
#SBATCH --signal=B:10@300
# Send signal 10 (SIGUSR1) 5 minutes before the time limit

trap 'pkill -15 -f big-computation.py ; cp output.dat checkpoint.dat ; exit 1' SIGUSR1

JOBID=$(sbatch --parsable --dependency=afternotok:$SLURM_JOBID resub_job_trap.sh)

if [ -f "checkpoint.dat" ] ; then
	INPUT=checkpoint.dat
else
	INPUT=input.dat
fi

# big-computation creates automatic checkpoints
# and automatically starts from the most recent 
# checkpoint
srun ./big-computation.py $INPUT

scancel $JOBID
exit 0

Both scripts rely on the --dependency flag from the sbatch command. The first example runs until the scheduler kills the process. The next job starts after this one has finished, either by being killed or finishing. Only when the computation did successfully finish, a file called finished will be written, breaking the loop.

The first script will start and wait for one more job once the computation has finished because it always checks for the finished file.

The second script traps a signal to stop the computation. Observe the option #SBATCH --signal=B:10@300, which tells the scheduler to send the signal SIGUSR1 (10) 5 minutes before the job is killed due to the time limit. The trap command captures this signal, stops the program with the SIGTERM (15) signal, copies the output file into a checkpoint from which the computation can resume, and exits the script with code 1 (an error code). The resubmitting sbatch command handles the dependency to start the script only after the last job exited with an error code. Once started, the program either uses the input file or the last checkpoint file as the input, and the script only exits successfully if the computation finishes.

The second script will keep the last submitted job in the queue even though the last job finished successfully. It will be a pending job with reason DependencyNeverSatisfied and this job needs to be canceled manually using the scancel command. Therefore, the job script saves the submitted jobid and cancels it directly once the program exits normally.

Program specific examples

These examples take the ideas explored above and apply it to specific programs.

#!/bin/bash
#SBATCH -p standard96
#SBATCH -t 12:00:00
#SBATCH -N 1
#SBATCH -c 4
#SBATCH -o output_%j.txt
#SBATCH -e error_%j.txt

JOBID=$(sbatch --parsable --dependency=afternotok:$SLURM_JOBID resub_job_gromacs.sh)

module load impi
module load gromacs

mpirun -np 1 gmx_mpi mdrun -s input.tpr -cpi checkpoint.cpt

scancel $JOBID
exit 0

Compute Partitions

There are several different Slurm partitions for different purposes and having different characteristics. The partitions are collected into the following groups based on what they provide:

Which partition to choose?

If you do not request a partition, your job will be placed in the default partition, which is the NHR standard96 partition for all users. This means non-NHR users will have their jobs pending indefinitely with the reason PartitionConfig. Which partitions you have access to is listed on Types of User Accounts. For example, new SCC users should specify --partition scc-cpu instead for their non-GPU batch jobs.

Many partitions have variants with a suffix like :SUFFIX appended to the name (e.g. medium40:shared). The NAME:test partitions are, as the name suggests, intended for shorter and smaller test runs. These have a higher priority and a few dedicated nodes, but are limited in time and number of nodes.

For NHR users there are NAME:shared partitions, where it is possible to use less than a full node, which is suitable for pre- and postprocessing. All non-NHR partitions are usually already shared and do not enforce a reservation of entire nodes. A job running on a shared node is only accounted for the fraction of cores/GPUs/memory it uses (the maximum of these fractions is taken). All non-shared nodes are exclusive to one job, which implies that the full compute cost per node is paid, even if fewer resources were requested.

The available home/local-ssd/work/perm storages are discussed in Storage Systems.

An overview of all partitions and node statuses is provided by running

sinfo -r

To see detailed information about a node’s type, run

scontrol show node <nodename>

Subsections of Compute Partitions

CPU Partitions

Nodes in these partitions provide many CPU cores for parallelizing calculations.

Islands

The islands with a brief overview of their hardware are listed below.

| Island | CPUs | Fabric |
|---|---|---|
| Emmy Phase 1 | Intel Skylake | Omni-Path (100 Gb/s) |
| Emmy Phase 2 | Intel Cascade Lake | Omni-Path (100 Gb/s) |
| Emmy Phase 3 | Intel Sapphire Rapids | Omni-Path (100 Gb/s) |
| SCC Legacy | Intel Cascade Lake | Omni-Path (100 Gb/s) |
| CIDBN | AMD Zen2 | Infiniband (100 Gb/s) |
| FG | AMD Zen3 | RoCE (25 Gb/s) |
| SOE | AMD Zen2 | RoCE (25 Gb/s) |
Info

See Logging In for the best login nodes for each island (other login nodes will often work, but may have access to different storage systems and their hardware will be less of a match).

See Cluster Storage Map for the storage systems accessible from each island and their relative performance characteristics.

See Software Stacks for the available and default software stacks for each island.

Legacy SCC users only have access to the SCC Legacy island unless they are also CIDBN, FG, or SOE users in which case they also have access to those islands.

Partitions

The NHR partitions follow the naming scheme sizeCORES[suffix], where size indicates the amount of RAM (medium, standard, large, or huge), CORES indicates the number of cores, and suffix is only included to differentiate partitions with the same size and CORES. SCC, KISSKI, REACT, etc. partitions do not follow this scheme. The partitions are listed in the table below by which users can use them and by island, without hardware details. See Types of User Accounts to determine which kind of user you are. Note that some users are members of multiple classifications (e.g. all CIDBN/FG/SOE users are also SCC users).

| Users | Island | Partition | OS | Shared | Default/Max. Time Limit | Max. Nodes per Job | Core-hr per Core |
|---|---|---|---|---|---|---|---|
| NHR | Emmy P3 | medium96s | Rocky 8 | | 12/48 hr | 256 | 0.75 |
| | | medium96s:test | Rocky 8 | | 1/1 hr | 64 | 0.75 |
| | | standard96s | Rocky 8 | | 12/48 hr | 256 | 1 |
| | | standard96s:shared | Rocky 8 | yes | 12/48 hr | 1 | 1 |
| | | standard96s:test | Rocky 8 | | 1/1 hr | 64 | 1 |
| | | large96s | Rocky 8 | yes | 12/48 hr | 2 | 2 |
| | | large96s:test | Rocky 8 | | 1/1 hr | 2 | 1.5 |
| | Emmy P2 | standard96 | Rocky 8 | | 12/48 hr | 256 | 1 |
| | | standard96:shared | Rocky 8 | yes | 12/48 hr | 64 | 1 |
| | | standard96:test | Rocky 8 | | 1/1 hr | 64 | 1 |
| | | large96 | Rocky 8 | yes | 12/48 hr | 2 | 2 |
| | | large96:test | Rocky 8 | | 1/1 hr | 2 | 1.5 |
| SCC | Emmy P3 | scc-cpu | Rocky 8 | yes | 12/48 hr | inf | 1 |
| | SCC Legacy | medium | Rocky 8 | yes | 12/48 hr | inf | 1 |
| all | Emmy P2, Emmy P1 | jupyter | Rocky 8 | yes | 12/48 hr | 1 | 1 |
| CIDBN | CIDBN | cidbn | Rocky 8 | yes | 12 hr/14 days | inf | |
| FG | FG | fg | Rocky 8 | yes | 12 hr/30 days | inf | |
| SOEDING | SOE | soeding | Rocky 8 | yes | 12 hr/14 days | inf | |
Info

JupyterHub sessions run on the jupyter partition. This partition is oversubscribed (multiple jobs share resources) and is comprised of both CPU and GPU nodes.

Info

The default time limit for most partitions is 12 hours and failed jobs that are requested to run for longer will only get refunded for the 12 hours. This is detailed on the Slurm page about job runtime.

The hardware for the different nodes in each partition is listed in the table below. Note that some partitions are heterogeneous, having nodes with different hardware. Additionally, many nodes are in more than one partition.

| Partition | Nodes | CPU | RAM per node | Cores | SSD |
|---|---|---|---|---|---|
| medium96s | 380 | 2 × Sapphire Rapids 8468 | 256 000 MB | 96 | yes |
| medium96s:test | 164 | 2 × Sapphire Rapids 8468 | 256 000 MB | 96 | yes |
| standard96 | 853 | 2 × Cascadelake 9242 | 364 000 MB | 96 | |
| | 148 | 2 × Cascadelake 9242 | 364 000 MB | 96 | yes |
| standard96:shared | 853 | 2 × Cascadelake 9242 | 364 000 MB | 96 | |
| | 138 | 2 × Cascadelake 9242 | 364 000 MB | 96 | yes |
| standard96:test | 856 | 2 × Cascadelake 9242 | 364 000 MB | 96 | |
| | 140 | 2 × Cascadelake 9242 | 364 000 MB | 96 | yes |
| standard96s | 220 | 2 × Sapphire Rapids 8468 | 514 000 MB | 96 | yes |
| standard96s:shared | 220 | 2 × Sapphire Rapids 8468 | 514 000 MB | 96 | yes |
| standard96s:test | 224 | 2 × Sapphire Rapids 8468 | 514 000 MB | 96 | yes |
| large96 | 14 | 2 × Cascadelake 9242 | 747 000 MB | 96 | yes |
| | 2 | 2 × Cascadelake 9242 | 1 522 000 MB | 96 | yes |
| large96:test | 4 | 2 × Cascadelake 9242 | 747 000 MB | 96 | yes |
| large96s | 30 | 2 × Sapphire Rapids 8468 | 1 030 000 MB | 96 | yes |
| | 3 | 2 × Sapphire Rapids 8468 | 2 062 000 MB | 96 | yes |
| large96s:test | 4 | 2 × Sapphire Rapids 8468 | 1 030 000 MB | 96 | yes |
| jupyter | 16 | 2 × Skylake 6148 | 763 000 MB | 40 | yes |
| | 8 | 2 × Cascadelake 9242 | 364 000 MB | 96 | |
| medium | 94 | 2 × Cascadelake 9242 | 364 000 MB | 96 | yes |
| scc-cpu | ≤ 49 | 2 × Sapphire Rapids 8468 | 256 000 MB | 96 | yes |
| | ≤ 49 | 2 × Sapphire Rapids 8468 | 514 000 MB | 96 | yes |
| | ≤ 24 | 2 × Sapphire Rapids 8468 | 1 030 000 MB | 96 | yes |
| | ≤ 2 | 2 × Sapphire Rapids 8468 | 2 062 000 MB | 96 | yes |
| cidbn | 30 | 2 × Zen3 EPYC 7763 | 496 000 MB | 128 | yes |
| fg | 8 | 2 × Zen3 EPYC 7413 | 512 000 MB | 48 | yes |
| soeding | 7 | 2 × Zen2 EPYC 7742 | 1 000 000 MB | 128 | yes |

The CPUs

For partitions that have heterogeneous hardware, you can give Slurm options to request the particular hardware you want. For CPUs, you can specify the kind of CPU you want by passing a -C/--constraint option to Slurm. Use -C ssd or --constraint=ssd to request a node with a local SSD. If you need a particularly large amount of memory, please use the --mem option to request an appropriate amount (per node). See Slurm for more information.
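As a minimal sketch of combining these options in a batch script (the partition, node count, memory value, and program name are placeholders; combining constraints with & is standard Slurm syntax):

#!/bin/bash
#SBATCH --partition standard96
#SBATCH --nodes 2
#SBATCH --time 12:00:00
#SBATCH --constraint "cascadelake&ssd"   # Cascade Lake CPUs AND a local SSD
#SBATCH --mem 300G                       # requested per node

srun ./my_mpi_program   # hypothetical executable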

The CPUs, the options to request them, and some of their properties are given in the table below.

| CPU | Cores | -C option | Architecture |
|---|---|---|---|
| AMD Zen3 EPYC 7763 | 64 | milan or zen3 | zen3 |
| AMD Zen3 EPYC 7413 | 24 | milan or zen3 | zen3 |
| AMD Zen2 EPYC 7742 | 64 | rome or zen2 | zen2 |
| Intel Sapphire Rapids Xeon Platinum 8468 | 48 | sapphirerapids | sapphirerapids |
| Intel Cascadelake Xeon Platinum 9242 | 48 | cascadelake | cascadelake |
| Intel Skylake Xeon Gold 6148 | 20 | skylake | skylake_avx512 |
| Intel Skylake Xeon Gold 6130 | 16 | skylake | skylake_avx512 |

Hardware Totals

The total nodes, cores, and RAM for each island are given in the table below.

| Island | Nodes | Cores | RAM (TiB) |
|---|---|---|---|
| Emmy Phase 1 | 16 | 640 | 11.6 |
| Emmy Phase 2 | 1,022 | 98,112 | 362.8 |
| Emmy Phase 3 | 411 | 39,456 | 173.4 |
| SCC Legacy | 94 | 9,024 | 32.6 |
| CIDBN | 30 | 3,840 | 14.1 |
| FG | 8 | 384 | 3.9 |
| SOE | 7 | 896 | 6.6 |
| TOTAL | 1,588 | 152,352 | 605 |

GPU Partitions

Nodes in these partitions provide GPUs for parallelizing calculations. See GPU Usage for more details on how to use GPU partitions, particularly those where GPUs are split into MiG slices.

Islands

The islands with a brief overview of their hardware are listed below.

| Island | GPUs | CPUs | Fabric |
|---|---|---|---|
| Grete Phase 1 | Nvidia V100 | Intel Skylake | Infiniband (100 Gb/s) |
| Grete Phase 2 | Nvidia A100 | AMD Zen 3, AMD Zen 2 | Infiniband (2 × 200 Gb/s) |
| Grete Phase 3 | Nvidia H100 | Intel Sapphire Rapids | Infiniband (2 × 200 Gb/s) |
| SCC Legacy | Nvidia V100, Nvidia Quadro RTX 5000 | Intel Cascade Lake | Omni-Path (2 × 100 Gb/s), Omni-Path (100 Gb/s) |
Info

See Logging In for the best login nodes for each island (other login nodes will often work, but may have access to different storage systems and their hardware will be less of a match).

See Cluster Storage Map for the storage systems accessible from each island and their relative performance characteristics.

See Software Stacks for the available and default software stacks for each island.

Legacy SCC users only have access to the SCC Legacy island unless they are also CIDBN, FG, or SOE users in which case they also have access to those islands.

Partitions

The partitions are listed in the table below, grouped by which users can use them, without hardware details. See Types of User Accounts to determine which kind of user you are. Note that some users are members of multiple classifications (e.g. all CIDBN/FG/SOE users are also SCC users).

| Users | Island | Partition | OS | Shared | Default/Max. Time Limit | Max. Nodes per Job | Core-hours per GPU* |
|---|---|---|---|---|---|---|---|
| NHR | Grete P3 | grete-h100 | Rocky 8 | | 12/48 hr | 16 | 262.5 |
| | | grete-h100:shared | Rocky 8 | yes | 12/48 hr | 16 | 262.5 |
| | Grete P2 | grete | Rocky 8 | | 12/48 hr | 16 | 150 |
| | | grete:shared | Rocky 8 | yes | 12/48 hr | 1 | 150 |
| | | grete:preemptible | Rocky 8 | yes | 12/48 hr | 1 | 47 per slice |
| NHR, KISSKI, REACT | Grete P2 | grete:interactive | Rocky 8 | yes | 12/48 hr | 1 | 47 per slice |
| KISSKI | Grete P3 | kisski-h100 | Rocky 8 | | 12/48 hr | 16 | 262.5 |
| | Grete P2 | kisski | Rocky 8 | | 12/48 hr | 16 | 150 |
| REACT | Grete P2 | react | Rocky 8 | yes | 12/48 hr | 16 | 150 |
| SCC | Grete P2 & P3 | scc-gpu | Rocky 8 | yes | 12/48 hr | 4 | 24 |
| all | Grete P1, SCC Legacy | jupyter | Rocky 8 | yes | 12/48 hr | 1 | 37 + 1 per core |
Info

JupyterHub sessions run on the jupyter partition. This partition is oversubscribed (multiple jobs share resources) and is comprised of both CPU and GPU nodes.

Info

The default time limit for most partitions is 12 hours and failed jobs that are requested to run for longer will only get refunded for the 12 hours. This is detailed on the Slurm page about job runtime.

The hardware for the different nodes in each partition is listed in the table below. Note that some partitions are heterogeneous, having nodes with different hardware. Additionally, many nodes are in more than one partition.

| Partition | Nodes | GPU + slices | VRAM each | CPU | RAM per node | Cores |
|---|---|---|---|---|---|---|
| grete | 35 | 4 × Nvidia A100 | 40 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| | 14 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| | 2 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 1 TiB | 64 |
| grete:shared | 35 | 4 × Nvidia A100 | 40 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| | 18 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| | 2 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 1 TiB | 64 |
| | 2 | 8 × Nvidia A100 | 80 GiB | 2 × Zen2 EPYC 7662 | 1 TiB | 128 |
| grete:interactive | 3 | 4 × Nvidia A100 (1g.10gb, 1g.20gb, 2g.10gb) | 10/20 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| grete:preemptible | 3 | 4 × Nvidia A100 (1g.10gb, 1g.20gb, 2g.10gb) | 10/20 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| grete-h100 | 5 | 4 × Nvidia H100 | 94 GiB | 2 × Xeon Platinum 8468 | 1 TiB | 96 |
| grete-h100:shared | 5 | 4 × Nvidia H100 | 94 GiB | 2 × Xeon Platinum 8468 | 1 TiB | 96 |
| kisski | 34 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| kisski-h100 | 15 | 4 × Nvidia H100 | 94 GiB | 2 × Xeon Platinum 8468 | 1 TiB | 96 |
| react | 22 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| scc-gpu | 1 | 4 × Nvidia H100 | 94 GiB | 2 × Xeon Platinum 8468 | 1 TiB | 96 |
| | 23 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| | 6 | 4 × Nvidia A100 | 80 GiB | 2 × Zen3 EPYC 7513 | 1 TiB | 64 |
| | 2 | 4 × Nvidia A100 | 40 GiB | 2 × Zen3 EPYC 7513 | 512 GiB | 64 |
| jupyter | 2 | 8 × Nvidia V100 | 32 GiB | 2 × Cascadelake 6252 | 384 GiB | 48 |
| | 3 | 4 × Nvidia V100 | 32 GiB | 2 × Skylake Xeon Gold 6148 | 768 GiB | 40 |
| | 5 | 4 × Nvidia RTX5000 | 16 GiB | 2 × Cascadelake 6242 | 192 GiB | 32 |
Info

The memory actually available per node is always less than the amount installed in hardware. Some is reserved by the BIOS, and on top of that, Slurm reserves around 20 GiB for the operating system and background services. To be on the safe side, if you don't reserve a full node, always deduct ~30 GiB from the installed memory, divide by the number of GPUs, and round down to get the amount of memory per GPU that you can safely request with --mem when submitting jobs.
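For example, on a 512 GiB A100 node this works out to roughly (512 - 30) / 4 ≈ 120 GiB per GPU. A single-GPU job could then be submitted like this sketch (the partition, time limit, and program name are placeholders):

srun -p grete:shared -G A100:1 --mem 120G -t 4:00:00 python train.py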

How to pick the right partition for your job

If you have access to multiple partitions, it can be important to choose one that fits your use case. As a rule of thumb: if you need (or can scale your job to run on) multiple GPUs and you are not using a shared partition, always request a multiple of 4 GPUs, as you will be billed for the whole node even if you requested fewer GPUs via -G. For jobs that need fewer than 4 GPUs, use a shared partition and make sure not to request more than your fair share of RAM (see the note above). If you need your job to start quickly, e.g. for testing whether your scripts work or for interactively tweaking hyperparameters, use an interactive partition (most users have access to grete:interactive).
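A quick interactive test session on a MiG slice could look like the following sketch (the slice type and time limit are examples):

srun -p grete:interactive -G 1g.10gb:1 -t 1:00:00 --pty bash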

The CPUs and GPUs

For partitions that have heterogeneous hardware, you can give Slurm options to request the particular hardware you want. For CPUs, you can specify the kind of CPU by passing a -C/--constraint option to Slurm. For GPUs, you can specify the name of the GPU when you pass the -G/--gpus option (or --gpus-per-task) and request larger VRAM or a minimum CUDA Compute Capability using a -C/--constraint option. See Slurm and GPU Usage for more information.

The GPUs, the options to request them, and some of their properties are given in the table below.

| GPU | VRAM | FP32 Cores | Tensor Cores | -G option | -C option (VRAM) | -C option (CUDA Compute Cap.) | Compute Cap. |
|---|---|---|---|---|---|---|---|
| Nvidia H100 | 94 GiB | 8448 | 528 | H100 | 94gb_vram, 80gb_vram, 40gb_vram, 30gb_vram, 20gb_vram, 10gb_vram | 70cuda_arch, 75cuda_arch, 80cuda_arch, 90cuda_arch | 90 |
| Nvidia A100 | 40 GiB | 6912 | 432 | A100 | 80gb_vram, 40gb_vram, 30gb_vram, 20gb_vram, 10gb_vram | 70cuda_arch, 75cuda_arch, 80cuda_arch | 80 |
| | 80 GiB | 6912 | 432 | A100 | 80gb_vram, 40gb_vram, 30gb_vram, 20gb_vram, 10gb_vram | 70cuda_arch, 75cuda_arch, 80cuda_arch | 80 |
| 1g.10gb slice of Nvidia A100 | 10 GiB | 864 | 54 | 1g.10gb | 10gb_vram | 70cuda_arch, 75cuda_arch, 80cuda_arch | 80 |
| 1g.20gb slice of Nvidia A100 | 20 GiB | 864 | 54 | 1g.20gb | 20gb_vram, 10gb_vram | 70cuda_arch, 75cuda_arch, 80cuda_arch | 80 |
| 2g.10gb slice of Nvidia A100 | 10 GiB | 1728 | 108 | 2g.10gb | 10gb_vram | 70cuda_arch, 75cuda_arch, 80cuda_arch | 80 |
| Nvidia V100 | 32 GiB | 5120 | 640 | V100 | 30gb_vram, 20gb_vram, 10gb_vram | 70cuda_arch | 70 |
| Nvidia Quadro RTX 5000 | 16 GiB | 3072 | 384 | RTX5000 | | | 75 |
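As a hedged example of combining these options (the partition, time limit, and script name are placeholders), a job that needs four 80 GiB A100 GPUs on one node could be submitted like this:

sbatch -p scc-gpu -G A100:4 -C 80gb_vram -t 12:00:00 job.sh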

The CPUs, the options to request them, and some of their properties are given in the table below.

| CPU | Cores | -C option | Architecture |
|---|---|---|---|
| AMD Zen3 EPYC 7513 | 32 | zen3 or milan | zen3 |
| AMD Zen2 EPYC 7662 | 64 | zen2 or rome | zen2 |
| Intel Sapphire Rapids Xeon Platinum 8468 | 48 | sapphirerapids | sapphirerapids |
| Intel Cascadelake Xeon Gold 6252 | 24 | cascadelake | cascadelake |
| Intel Cascadelake Xeon Gold 6242 | 16 | cascadelake | cascadelake |
| Intel Skylake Xeon Gold 6148 | 20 | skylake | skylake_avx512 |

Hardware Totals

The total nodes, cores, GPUs, RAM, and VRAM for each island are given in the table below.

| Island | Nodes | GPUs | VRAM (TiB) | Cores | RAM (TiB) |
|---|---|---|---|---|---|
| Grete Phase 1 | 3 | 12 | 0.375 | 120 | 2.1 |
| Grete Phase 2 | 103 | 420 | 27.1 | 6,720 | 47.6 |
| Grete Phase 3 | 21 | 84 | 7.9 | 2,016 | 21 |
| SCC Legacy | 7 | 36 | 0.81 | 176 | 1.7 |
| TOTAL | 134 | 552 | 36.2 | 9,032 | 72.4 |

Spack

Getting started

The official Spack documentation can be found at https://spack.readthedocs.io/ (the documentation matching the installed version can be opened by clicking on the version number in the table below).

In order to use Spack, you need to load the corresponding module first. The module name for each software stack that has a Spack module is listed below:

| Software Stack | Module Name | Version |
|---|---|---|
| GWDG Modules (gwdg-lmod) (default) | spack | 0.23.1 |
| SCC Modules (scc-lmod) | spack-user | 0.21.0 |
| NHR Modules (nhr-lmod) | spack | 0.21.2 |

Then, load the Spack module that matches your software stack:

module load spack        # GWDG Modules (gwdg-lmod) and NHR Modules (nhr-lmod)
module load spack-user   # SCC Modules (scc-lmod)

Please note: Some commands of Spack require extra shell functionality and you need to source the environment script:

source $SPACK_ROOT/share/spack/setup-env.sh

This command will also be shown every time you load the spack module to remind you.

Now Spack is ready and you can use the spack load command to load necessary software or spack install SPEC to install new software.

Loading software

You can find spack packages that are already installed using spack find:

spack find py-numpy

You may need to pick a specific version of those listed to load:

spack load py-numpy%gcc@14.2.0 target=sapphirerapids

Installing software

In most cases, it is enough to know the package name to install the software. For instance, if you want to install zlib, then you can simply run the following command:

spack install zlib

In general, you need to provide a specification and dependencies for the command spack install, where you can select versions, compilers, dependencies, and variants. To learn more about Spack specifications, please visit Spack's documentation.
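For example, a more constrained spec might pin the version, compiler, a variant, and the CPU target. The exact version numbers and variants below are illustrative; check spack info and spack compilers for what is actually available on the system:

spack install hdf5@1.14 +mpi %gcc@14.2.0 target=sapphirerapids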

Hardware Architectures

Since we have multiple CPU architectures, connection fabrics, and GPU architectures on the clusters, it can pay off to optimize your software for the architecture of the nodes it will run on. For example, if you plan on running your software on a Cascadelake node, it can be accelerated by compiling it to use Cascadelake's AVX512 instructions. A package would be compiled by setting the target in the spec to the desired CPU architecture like:

spack install gromacs target=cascadelake

The spack arch command will print the full CPU and OS architecture/target of the node you are on (e.g. linux-rocky8-sapphirerapids), and spack find will show you what you have built for each architecture (architecture is in the headers). The architecture/target is composed of the operating system (linux), the Linux distribution, and the CPU architecture. Note that the architecture/target does not capture other important hardware features like the fabric (mainly MPI libraries and their dependencies) and CUDA architecture. For CUDA, the cuda_arch parameter should be set to the CUDA compute capability and the +cuda variants enabled. For the MPI libraries, you should try to use the ones already installed by our staff that are tested and optimized to work well on the system. We recommend using the default openmpi variant with our default GCC compiler. The default can be found by executing module load gcc openmpi followed by module list, which will give you the recommended version numbers as part of the module names.
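Putting this together, a CUDA-enabled build for the A100 nodes (CUDA compute capability 80, zen2 target) might look like the following sketch; gromacs is just an example package, so check spack info gromacs for its actual variants:

spack install gromacs +cuda cuda_arch=80 target=zen2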

Make sure to install the software separately for every architecture you want it to run on, from a node with that particular architecture. The easiest way to ensure you are on the right architecture is to start an interactive Slurm job on the same partition (and the same kind of node if a partition has mixed architectures) you want to use the software on. To learn more about Spack and how to install software, you can go through its tutorial at https://spack-tutorial.readthedocs.io/en/latest/
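One way to do this in practice is to run the installation from an interactive job on the target partition, for example (partition, resources, and package are illustrative; SCC users would use scc-cpu instead):

srun -p standard96s:test -N 1 -c 16 -t 1:00:00 --pty bash
module load spack
spack arch                                  # e.g. linux-rocky8-sapphirerapids on an Emmy Phase 3 node
spack install gromacs target=sapphirerapids # build for the node you are on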

Tip

Software installed as modules is already built for all targets separately. The correct version is chosen automatically by module load for the node it is running on. This makes sure that the spack or spack-user module has the right compilers and default configuration selected for the node.

Warning

Cross-compiling packages for a different CPU architecture than that of the node Spack is running on is error-prone where it is possible at all (some combinations are impossible) and should be avoided. The one exception is compiling packages for a compatible CPU with fewer features than the CPU on the node Spack is running on (e.g. compiling for skylake_avx512 on a cascadelake node), but even this requires care. Also, cross-Linux-distro builds (e.g. compiling for rocky8 on centos7) are outright impossible with Spack.

The nodes currently supported for Spack and their architectures organized by cluster island are given in the table below.

| Nodes | CPU and OS Architecture/Target | Fabric | cuda_arch |
|---|---|---|---|
| Emmy Phase 3 | linux-rocky8-sapphirerapids | OPA | |
| Emmy Phase 2 | linux-rocky8-cascadelake | OPA | |
| Emmy Phase 1 | linux-rocky8-skylake_avx512 | OPA | |
| Grete Phase 3 | linux-rocky8-sapphirerapids | IB | 90 |
| Grete Phase 2 | linux-rocky8-zen2 | IB | 80 |
| Grete Phase 1 | linux-rocky8-skylake_avx512 | IB | 70 |
| SCC Legacy (CPU) | linux-rocky8-cascadelake (medium) | OPA, none (Ethernet only) | |
| SCC Legacy (GPU) | linux-rocky8-cascadelake | OPA | 70 |
| CIDBN | linux-rocky8-zen3 | IB | |
| FG | linux-rocky8-zen3 | RoCE | |
| SOE | linux-rocky8-zen2 | RoCE | |
Info

Note that to reduce the total number of separate architectures, some are grouped together and rounded down to the lowest common denominator for CPU architectures and the minimum for CUDA architecture. For example, the lowest common denominator of CPU architectures zen2 and zen3 is zen2, and the minimum of CUDA architectures 70 and 75 is 70.

See CPU partitions and GPU partitions for the Slurm partitions in each island.

Subsections of Spack

Spack FAQs

Frequently Asked Questions (FAQs) about Spack

1. What is Spack?

Question: What is Spack?

Answer: Spack is a package manager designed for high-performance computing (HPC) applications. It simplifies the process of building and installing multiple versions of software on various platforms.


2. How do I install Spack?

Question: How do I install Spack?

Answer: To install Spack, you can clone the Spack repository from GitHub and add it to your environment. Use the following commands:

git clone https://github.com/spack/spack.git
. spack/share/spack/setup-env.sh

However, on our systems you don't have to install Spack; it is already provided as a module, which you can load with:

module load spack        # GWDG Modules (gwdg-lmod) and NHR Modules (nhr-lmod)
module load spack-user   # SCC Modules (scc-lmod)

3. How do I search for packages in Spack?

Question: How do I search for packages in Spack?

Answer: You can search for packages in Spack using the spack list command followed by the package name or a part of the package name. For example:

spack list <package-name>

4. How do I install a package using Spack?

Question: How do I install a package using Spack?

Answer: To install a package, use the spack install command followed by the package name. For example, to install HDF5, you would run:

spack install hdf5

5. How do I load a package installed with Spack?

Question: How do I load a package installed with Spack?

Answer: You can load a package into your environment using the spack load command. For example:

spack load hdf5

6. How do I create a custom package in Spack?

Question: How do I create a custom package in Spack?

Answer: To create a custom package, you need to write a package recipe in Python. This involves defining the package’s metadata and build instructions in a file within the var/spack/repos/builtin/packages directory. Detailed instructions can be found in the Spack documentation.


7. How do I update Spack to the latest version?

Question: How do I update Spack to the latest version?

Answer: You can update Spack by pulling the latest changes from the GitHub repository:

cd $SPACK_ROOT
git pull

Then, you may need to run Spack commands to update its database and environment.

Note

If you use spack loaded with the module command, you don’t need to update it. It uses exactly the version and configuration needed.


8. How do I report a bug or request a feature in Spack?

Question: How do I report a bug or request a feature in Spack?

Answer: You can report bugs or request features by opening an issue on the Spack GitHub repository. Be sure to provide detailed information about the bug or feature request.


9. Where can I find more information about Spack?

Question: Where can I find more information about Spack?

Answer: More information about Spack, including comprehensive documentation and tutorials, can be found on the official Spack website and the Spack documentation.

Storage Systems

In contrast to a personal computer, which often presents the user with a unified view of the file system, the different storage systems on an HPC cluster are exposed to the users. This allows users who are familiar with these systems to achieve optimal performance and efficiency. You can check the cluster storage map for a visual overview.

| Data Store | Data Sharing | Typical Limits | Purpose | Notes |
|---|---|---|---|---|
| $HOME | Private | 60 GiB, 7 M files | Config files, helper scripts, small software installations | |
| Job Specific Storage | Private | - | High performance data access for individual job | Deleted after job finishes |
| Workspaces | Private or Project | 10-40 TiB, 2 M files | High performance data access for multiple jobs | Has expiration date, no backups |
| $PROJECT | Project | 3 TiB, 7 M files | Software installations, miscellaneous data | Replaces the purpose of $COLD for non-NHR users |
| $COLD | Project | 12 TiB, 1 M files | Data no longer actively used by compute jobs | NHR only |
| Tape Storage | Project | 8 TiB, 8000 files | Archived data | NHR only, not a 10 year long-term archive |

Specific locations on the different storage systems are reserved for individual users or projects. These are called data stores. To see the available data stores you can use the show-quota command on one of our login nodes. It will also display the quota values that limit the amount of data you can store in particular locations.

By default, users only get one personal data store, their $HOME directory. It is very limited in size and only meant for configuration files, helper scripts and small software installations that do not take up much space.

Job-specific Directories

Each compute job gets a number of job-specific directories that allow for very fast file operations during the job. If possible, you should use these directories. They get cleaned up after the job finishes, so results have to be copied to a different location at the end of your job.

Workspaces

Users can request additional personal data stores using the workspaces system. A workspace allows users for example to run a series of jobs that depend on each other and that need access to a shared filesystem for parallel I/O operations. After a job series has finished, the results should be moved to a more permanent location and the workspace should be released. Workspaces that are not released by the user will be cleaned up automatically when their expiration date has been reached.

Note

The file systems with the highest parallel performance (Lustre, BeeGFS) are typically only mounted on individual cluster islands. If you need speedy access to files from multiple cluster islands simultaneously, you can instead use a Ceph SSD workspace.

Project Specific Data Stores

Most of the more permanent data stores are project-specific. These are used in collaboration with other users of the same project.

You should coordinate with the other members of your project on how you use these storage locations. For example, you could create a sub-directory for every work-package, have directories for individual users, or all work in a single directory.

Ultimately all files in a project belong to the project’s PIs and their delegates, who can also request permission and ownership changes by our staff.

Note

These project-specific data stores are not available for NHR test accounts. Such users can instead request a workspace on the ceph-hdd storage system which has a longer max lifetime compared to the other workspace locations. Please apply for a full NHR project if you need a permanent location for larger amounts of data.

Data Store for Software Installation

The PROJECT data store, which can be located using the $PROJECT variable, can be used for larger software installations, such as big conda environments.

Data Store for Cold Data

Our ceph-hdd storage system is perfectly suited for large amounts of data that are no longer actively used by compute jobs. For SCC projects this is again the $PROJECT data store, for NHR it is $COLD. It is not available for KISSKI.

Data Store for AI/Machine Learning

The VAST filesystem is designed to work well with AI workloads. The $PROJECT data stores for NHR and KISSKI projects are located here.

Other Options

If the above options are not suitable for your project and you need, for example, a permanent storage location on a particular file system (Lustre, BeeGFS, VAST), please contact our support with a detailed explanation of your use case.

Technical Documentation

We provide extensive technical documentation on our storage systems suitable for more experienced HPC users.

Subsections of Storage Systems

Cluster Storage Map

Not every data store is available everywhere on the cluster, nor is the filesystem performance equal for all nodes. The differences depend on which island a node is a member of and whether it is a login or a compute node. General CPU node islands are called “Emmy Phase X” where X indicates the hardware generation (1, 2, 3, …). General GPU node islands are called “Grete Phase X” where X indicates the hardware generation (1, 2, 3, …). Other islands exist for specific institutions/groups or historical reasons (e.g. SCC Legacy). Which storage systems can be accessed from each group/island of nodes (logins and partitions), and the relative performance of the link between them and each data store, are shown in the diagram below.

Diagram of the connections between each HPC node group and the storage systems. The SCC Legacy login nodes (gwdu[101-102]) have a very slow connection to the GWDG ARCHIVE (Stornext AHOME). The MDC login nodes (Emmy Phase 2, glogin[4-8]) and RZG login nodes (Grete and Emmy Phase 3, glogin[9-13]) have a very slow connection to PERM. All nodes have a slow-medium connection to the HOME/Project storage (VAST/CephFS), COLD (CephFS), Workspaces Ceph (CephFS), and Software. The SCC Legacy login nodes (gwdu[101-102]) and SCC Legacy compute nodes have a slow-medium connection to the GWDG Unix HOME (Stornext) and a very fast connection to the SCRATCH SCC (BeeGFS). The MDC login nodes (Emmy Phase 2, glogin[4-8]) and Emmy Phase 2 compute nodes have a very fast connection to Workspaces Lustre MDC (Lustre) and a medium connection to the SCRATCH SCC (BeeGFS). The RZG login nodes (Grete and Emmy Phase 3, glogin[9-13]), Grete compute nodes, and Emmy Phase 3 compute nodes have a very fast connection to the Workspaces + SCRATCH RZG (formerly known as SCRATCH Grete, Lustre) and a medium connection to the SCRATCH SCC (BeeGFS).

HPC Storage Systems

Connections between each HPC node group/island and the different storage systems, with the arrow style indicating the performance (see key at bottom right).

There are a few important points to note:

  • Portal SCC users have access to some CPU and GPU nodes in the Grete Phase 2 & 3 and Emmy Phase 3 islands (the scc-cpu and scc-gpu partitions) in addition to the SCC Legacy island.
  • Legacy SCC users only have access to the SCC Legacy island and have their HOME directories on the GWDG Unix HOME filesystem.
  • Each island (SCC Legacy, Emmy Phase 2, Grete & Emmy Phase 3) has a separate SCRATCH/WORK data store connected via a very fast network, with slower connections to the other ones.
  • ARCHIVE/PERM data stores are only accessible from login nodes.
  • Temporary Storage for Slurm jobs is always allocated in RAM, on fast local SSDs, and the fastest SCRATCH/WORK available for the node.
  • The CIDBN, FG, and SOE islands are not shown in the diagram above but have access to the same storage systems with the same relative performance as the SCC Legacy, though some also have their own dedicated SCRATCH/WORK.
Info

See CPU Partitions and GPU Partitions for the available partitions in each island for each kind of account.

See Logging In for the best login nodes for each island (other login nodes will often work, but may have access to different storage systems and their hardware will be less of a match).

See Software Stacks for the available and default software stacks for each island.

See Types of User Accounts if you are unsure what kind of account you have.

Quota

To see the quotas for any data stores your current user has access to, use the command show-quota (for more details, see the dedicated page below).

Fundamentals

Quotas are used to limit how much space and how many files users and projects can have in each data store. This prevents things like a runaway or buggy job using up the filesystem’s entire capacity rendering it unusable for other users and projects on the cluster, as well as preventing steady accumulation of unused files. Data stores have two different kinds of quotas which may be set:

Block

  • Quota limiting how much space can be used (i.e. how many blocks on the filesystem).
  • Note that because filesystems store files in blocks, the usage rounds up to the nearest block in many cases (e.g. a 1 byte file often takes up 4 KiB of space).

Inode

  • Quota limiting how many files, directories, and links can be used.
  • In most filesystems, every file, directory, and link takes up at least 1 inode.
  • Large directories and large files can sometimes take up more than 1 inode.

For each kind of quota, filesystems have two different kinds of limits – hard and soft limits where the soft limit, if present, is less than or equal to the hard limit.

Hard limit

When one is above the hard limit, the filesystem prevents consuming more blocks/inodes and may restrict what changes can be done to existing blocks/inodes until one drops below the soft limit.

Info

Some data stores only support hard limits. On these filesystems, one must be more careful since there will be no warning message until one hits the hard limit where using more blocks/inodes fails outright.

Soft limit

When one is above the soft limit but below the hard limit, a timer starts giving the user/project a certain grace time to drop below the soft limit again. Before the grace time expires, the user/project is not restricted so long as they stay below the hard limit. If the grace time expires without dropping below the soft limit, the filesystem restricts the user/project the same as exceeding the hard limit until the usage drops below the soft limit. Often a warning is printed to stderr when the soft limit is first exceeded, but no warning email is sent about reaching the soft limit.

Essentially, the soft limit is the value one must stay below but one may briefly exceed without problems so long as one does not exceed the hard limit.

Info

The grace time is currently 2 weeks on all data stores.

Quota Pages

The different quota sub-topics are covered in the pages below

Subsections of Quota

Checking Usage And Quotas

show-quota

The show-quota program is provided for checking your usage and quotas on the cluster. The data it uses is updated periodically (at least once per hour), so it displays the usage and quota when its data was last updated, not at the current instant.

Info

For legacy SCC users, the full show-quota command is not available; instead, only a very short summary of your HOME quota is shown.

By default, show-quota shows the usage and quotas for every data store of the calling user and all projects the user is in. There are several command line options that can be used to control which projects to show (-p), disable showing user or project information (--no-user and --no-project), and change the output format. The best way to see all available options is to use the -h or --help options. Its current help message is:

glogin8:~ $ show-quota --help
usage: show-quota [-h] [-u USER] [-p [PROJECT [PROJECT ...]]] [--no-user]
                  [--no-project] [--ascii] [--no-color] [--json] [--simple]
                  [--short] [--yaml-like] [--warn]

Tool for getting user/project filesystem usage and quotas, which are printed
to stdout. Errors with getting users or projects or reading their information
are printed to stderr prefixed by '! '. The exit code is 0 if there were no
failures, and 1 otherwise (will always have some output on stderr).

optional arguments:
  -h, --help            show this help message and exit
  -u USER, --user USER  The user to get the usage and quotas for. Supports
                        names and UIDs. Default is the calling user.
  -p [PROJECT [PROJECT ...]], --project [PROJECT [PROJECT ...]]
                        The project/s to get the usage and quotas for.
                        Supports names and GIDS. The default is all projects
                        the user (calling user unless --user is given) is in.
  --no-user             Don't print the user usage and quotas.
  --no-project          Don't print the project usage and quotas.
  --ascii               Force use of only ASCII characters (no unicode).
  --no-color            Disable color even if stdout would support it.
  --json                Print in JSON format.
  --simple              Print in the simple format.
  --short               Print in the short format.
  --yaml-like           Print in the YAML-like format (not actually valid
                        YAML).
  --warn                Print in the warning format, which is like --short
                        except that it only prints for data stores where the
                        quota is exceeded or close to it.

The default, --simple, --short, --warn, and --yaml-like output formats are meant to be human readable. The --json format is machine readable and is therefore useful in scripts.
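For scripting, the JSON output can be piped into a JSON-aware tool, for example (assuming the jq utility is available in your environment):

show-quota --json | jq .        # pretty-print the machine-readable output
show-quota --json > quota.json  # or save it for later processing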

Default Format

The default format shows all information in a human readable way along with a usage bar. Its output takes the form of:

User USERNAME
  Data Store DATA_STORE_1
    ...
  ...
  Data Store DATA_STORE_N
    ...
Project PROJECT_1
  Data Store DATA_STORE_1
    ...
  ...
  Data Store DATA_STORE_N
    ...
...
Project PROJECT_N
  Data Store DATA_STORE_1
    ...
  ...
  Data Store DATA_STORE_N
    ...

where the ellipses (...) denote additional input not included here for brevity. The output for each data store prints out the following items:

  1. The data store name.
  2. The data store’s alias if it has one.
  3. A description of the data store.
  4. The filesystem path/s to the directory/ies belonging to the user/project (each will be on its own line).
  5. The underlying filesystem, possibly with additional information.
  6. Block usage and quota, which will consist of lines indicating the usage, limits (if present), grace time if applicable, an indication of being over limits if applicable, the last time the data was updated, and a usage bar if there is at least one limit.
  7. Inode usage and quota in the same format as for block usage and quota.

An example when a user is below both the soft and hard limits for a data store would be

Example usage and quota output for a data store where the usage is below both the soft and hard limit with color and unicode. See the other tab for its output in textual ASCII form.

Usage and quota when below both limits with color and unicode.

  Data Store lustre-grete
    SCRATCH RZG
    Scratch storage in the RZG.
    /mnt/lustre-grete/usr/u11065
    /mnt/lustre-grete/tmp/u11065
    Lustre filesystem with PROJID 2148257104
    Block
      Used 90.00% (9.00 GiB / 10 GiB, soft limit)
        [##################################################################------]
      Hard limit is 6 TiB (0.15% used)
      Last updated 28.4 min ago (2024-05-27T11:21:40.376630+00:00)
    Inodes
      Used 0.00% (3 / 1.05 M, soft limit)
        [------------------------------------------------------------------------]
      Hard limit is 2.10 M (0.00% used)
      Last updated 28.4 min ago (2024-05-27T11:21:40.376630+00:00)

With color enabled, the usage bar (and usage relative to a limit) use green for under 85%, yellow/orange for 85-100%, and bold red for 100% and higher. No bar is shown if there is no soft nor hard limit.

An example when a project is between the soft and hard limits for blocks with an expired grace time and has no limits on inodes would be

Example usage and quota output for a data store where the usage is between limits for blocks and has no limit for inodes with color and unicode. See the other tab for its output in textual ASCII form.

Usage and quota when between limits for blocks and having no limits on inodes with color and unicode.

  Data Store project
    PROJECT
    Project storage
    /home/projects/nhr_internal
    gpfs filesystem
    Block
      Over soft limit (grace time expired)
      Used 125.00% (50 GiB / 40 GiB, soft limit)
      Hard limit is 100 GiB (50.00% used)
        [############################|#######------------------------------------]
      Last updated 30.1 min ago (2024-05-27T11:20:01.498979+00:00)
    Inodes
      Used 2 (no limit)
      Last updated 30.1 min ago (2024-05-27T11:20:01.498979+00:00)

Notice how the output explicitly shows you that the user is over the soft limit and the grace time has expired.

An example when a user has exceeded the hard limit for blocks would be

Example usage and quota output for a data store where the usage is over the hard limit on blocks and has no limit on inodes with color and unicode. See the other tab for its output in textual ASCII form.

Usage and quota when over the hard limit on blocks with color and unicode.

  Data Store HOME
    /home/fnordsi1/u11065
    gpfs filesystem
    Block
      Over hard limit (grace time expired)
      Used 300.00% (120 GiB / 40 GiB, soft limit)
      Hard limit is 100 GiB (120.00% used)
        [############################|###########################################]++++++++++++++
      Last updated 29.2 min ago (2024-05-27T11:20:51.230354+00:00)
    Inodes
      Used 12 (no limit)
      Last updated 29.2 min ago (2024-05-27T11:20:51.230354+00:00)

Notice how any overshoot past the hard limit is shown extending past the end of the bar. Overshoot past the hard limit can happen since it can take a while for parallel distributed filesystems to notice the hard limit has been exceeded. This is also how it would look if there is a soft limit but no hard limit and the soft limit is exceeded.

Simple Format

Use the --simple option to use the simple format. The simple format is identical to the default format except that the usage bars are not shown.

Short Format

Use the --short option to use the short format. The short format is the default format minus the usage bar, data store alias and description, paths, and underlying filesystem information.

An example where a project is between the soft and hard limits on blocks and has no limit on inodes would be

Example usage and quota output in the short format for a data store where the usage is between limits for blocks and has no limit for inodes with color and unicode. See the other tab for its output in textual ASCII form.

Usage and quota in the short format when between limits for blocks and having no limits on inodes with color and unicode.

  Data Store project
    Block
      Over soft limit (grace time expired)
      Used 125.00% (50 GiB / 40 GiB, soft limit)
      Hard limit is 100 GiB (50.00% used)
      Last updated 27.9 min ago (2024-05-27T11:50:02.151219+00:00)
    Inodes
      Used 2 (no limit)
      Last updated 27.9 min ago (2024-05-27T11:50:02.151219+00:00)

Warn Format

Use the --warn option to use the warn format. The warn format is identical to the short format except that only data stores where the usage is 85% or greater than the soft and/or hard limits are shown, any output is indented by two more spaces, and the first line printed to stdout is Over quota/s or close to them: if any data stores are close to or have exceeded a soft and/or hard limit. If the user or a project has no data stores over a limit or close to a limit, nothing is printed for the user or that project.

Tip

If you and your projects are not over or close to the limits on any data store, show-quota --warn will print nothing.
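If you want to be reminded automatically, one option (a suggestion, not an official recommendation) is to run the warning format from your shell startup file on the login nodes, so a message appears only when something is close to or over a limit:

# e.g. append to ~/.bash_profile
show-quota --warn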

Fixing Quota Issues

The most common quota problem is being close to the quota or exceeding it.

The other common problem is that show-quota (see Checking Usage And Quotas) doesn’t show your usage and quotas or has very out of date data. In this case, please contact support so that we can fix it (see Support for the email address to use).

This page, however, will focus on the most common problem – exceeding quota or being close to it.

Determine Where Your Usage Is

The filesystem quotas and show-quota work on whole directories and don't indicate anything about where within a directory the usage is. Often, a lot of the usage is old files, directories, or caches that are no longer needed and can be cleaned up. Use show-quota to get the path/s to your (or your project's) directory/ies in the data store. Then, you can determine where your usage is in those directories: either manually with ls and du, or using ncdu, a Terminal User Interface (TUI) frontend to du.

TUI frontend

On the login nodes, call ncdu with the root directory you want to analyze as the first argument:

ncdu <PATH>

There you can graphically see the space used up by each directory.

A screenshot of the default ncdu view after opening a folder, in this case a public data pool. For each entry, the total GiB of that subtree can be seen, after which a visual percentage representation follows. Afterwards, the name of the entry is shown.

NCDU Screenshot (Default View)

A screenshot of the default ncdu interface

Bindings can be seen by pressing ?

A screenshot of the help view, accessed by pressing the question mark key. It shows the bindings for different operations, such as moving the cursor (supporting both normal and vim-like bindings) as well as different sorting criteria, such as name, size, or number of items.

NCDU Screenshot (Help View)

A screenshot of the help ncdu interface

The most important bindings are

| Key | Action |
|---|---|
| ↑ / k | Move up (previous item) |
| ↓ / j | Move down (next item) |
| → / l / Enter | Open directory / show contents |
| ← / h | Go up one directory |
| d | Delete the selected item |
| n | Sort by name |
| s | Sort by size (default) |
| C | Sort by item count |
| g | Show percentage and/or graph |
| ? | Help (show key bindings) |
| q | Quit ncdu |

To analyze inode usage, use the manual method described below.

Manually debugging

If you are in a directory, you can run

ls -alh

to list all files and directories including hidden ones (the -a option) along with the sizes of all the files in a human readable format (the -l and -h options put together). The -a option is particularly useful in your HOME directory where there are many hidden files and directories. An example of its output would be

glogin9:/scratch-grete/usr/gzadmfnord $ ls -alh
total 79M
drwx------    4 gzadmfnord gzadmfnord 4.0K May 27 15:09 .
drwxr-xr-x 4606 root       root       268K May 27 10:50 ..
lrwxrwxrwx    1 gzadmfnord gzadmfnord    4 May 27 15:09 .a -> .baz
drwxr-x---    2 gzadmfnord gzadmfnord 4.0K May 27 15:08 bar
-rw-r-----    1 gzadmfnord gzadmfnord  78M May 27 15:08 .baz
drwxr-x---    2 gzadmfnord gzadmfnord 4.0K May 27 15:08 foo

In this case, if the -a option had not been used, the 78 MiB file .baz would have been completely overlooked. Notice that ls -alh also shows the total size of all files directly in the directory (not in subdirectories), the permissions, and the targets of symlinks.

One of the limitations of ls is that it does not show the total amount of space and the number of inodes used inside directories. For that, you need to use du. To get the block usage for one or more directories, run

du -sch DIR1 ... DIRN

and for inodes

du --inodes -sch DIR1 ... DIRN

where the -s option generates a summary of the total usage for each directory, the -c option prints the sum-total of all the given directories, and -h converts the usage into a human readable format (e.g. converting 1073741824 bytes into 1G).

A quick check of the size of the current directory using this command would be

du -d 1 -h

where the option -d 1 limits the calculation to this directory.

An example with a user’s two directories in the scratch-grete data store would be

glogin9:~ $ du -sch /scratch-grete/usr/$USER /scratch-grete/tmp/$USER
79M     /scratch-grete/usr/gzadmfnord
60K     /scratch-grete/tmp/gzadmfnord
79M     total

and for inodes, it would be

glogin9:~ $ du --inodes -sch /scratch-grete/usr/$USER /scratch-grete/tmp/$USER
5       /scratch-grete/usr/gzadmfnord
1       /scratch-grete/tmp/gzadmfnord
6       total

Reducing Filesystem Usage

After determining where your high usage is, there are several things that can be done to reduce usage.

First, if the files/directories are no longer needed or are trivially regenerated, they can simply be deleted. In particular, the ~/.cache directory is a common culprit for exceeding the quota of one's HOME directory since many tools cache files there and never clean up (conda is a frequent offender). It is often safe to outright delete the ~/.cache directory or any of its sub-directories. Temporary files also tend to accrue in /scratch-*/tmp/USERNAME and /mnt/lustre-*/tmp/USERNAME directories, so they are good candidates for cleanup.

Second, the files/directories could be moved somewhere else. Are you using your user's data stores for data that should really be in the project's data stores (projects get bigger quotas)? Another example is files/directories residing on a SCRATCH/WORK data store that won't be used again for a long time and could therefore be moved to an ARCHIVE/PERM data store or downloaded to your machine/s to be re-uploaded later (obviously, special considerations must be taken for very large data sets). Yet another is moving files/directories that need less IO performance from an SSD based SCRATCH/WORK data store to a larger HDD based one.

Third, the files/directories could be compressed or bundled together in some way (see the sketch after the tip below). Compression can reduce the block usage depending on the kind of data and the compression algorithm chosen, and can even improve read performance afterwards in some cases. Bundling many files/directories together can greatly reduce inode usage (e.g. packing a directory with a million files into a single TAR or ZIP file). Common methods include:

  • tar, zip, etc. for bundling and compressing files and directories.
  • Changing file formats to different ones that store the same data but smaller. For example, PNG will almost always store an image in less space than BMP will (most BMP images are either uncompressed or use a very weak compression algorithm).
  • Concatenating files and making an index of the byte offsets and sizes of each to reduce the inode usage. A common example is concatenating many JPEG files making a MJPG (Motion JPEG) file which is supported by many tools directly (e.g. ffmpeg can read them just like any other video format).
  • Putting the data from several files and/or directories in an HDF5 or NetCDF4 file with compression on the Datasets/Variables.
  • Converting directories of thousands of images into a multi-page image format (e.g. multi-page TIFF) or a video. Note that videos can use the similarities between successive frames to compress. Be careful to choose an appropriate video codec for your data if you do this. You may want either a lossless codec (even if your original image frames are lossy), a codec put into its lossless mode, or a codec tuned to be low loss in order to avoid compounding losses (lossy JPEG then put through a lossy image codec). Encoding and decoding speed can be important as well.
Tip

A directory with many thousands or millions of small files generally has terrible performance anyways, so bundling them up into a much smaller number of larger files can improve performance by an order of magnitude or more.
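A minimal sketch of bundling and compressing a finished output directory (the directory name is a placeholder), verifying the archive before deleting the original:

tar -czf results_2024.tar.gz results_2024/
tar -tzf results_2024.tar.gz > /dev/null && rm -rf results_2024/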

Increasing Quota

Sometimes even after following the tips and steps in Fixing Quota Issues, one might still need more blocks and/or inodes and thus an increase in quota.

If you need a larger quota, non-NHR users should contact support (see Support for the right email address). NHR users are encouraged to first contact your consultant to discuss your needs and how we can help you, after which you would contact support at nhr-support@gwdg.de.

Workspaces

Workspaces allow users to request temporary storage with an expiration date on the high performance filesystems. This increases the efficiency and overall performance of the HPC system, as files that are no longer required for compute jobs do not accumulate and fill up these filesystems.

The typical workflow is that the workspaces commands are used to request and create a directory on the appropriate filesystem before a new chain of compute jobs is launched. After the job chain has finished, the results should be copied to a more permanent location in a data store shared with other project members.

Each project and project user can have multiple workspaces per data store, each with their own expiration dates. After the expiration date, the expired workspace and the data within it are archived in an inaccessible location and eventually deleted.

The concept of workspaces has become quite popular and is applied in a large number of HPC centers around the world. We use the well known HPC Workspace tooling from Holger Berger to manage workspaces.

Info

All Slurm jobs get their own temporary storage directories on the nodes themselves and the fastest shared filesystems available to the particular nodes, which are cleaned up when the job is finished. If you only need temporary storage for the lifetime of the job, those directories are better suited than workspaces. See Temporary Storage for more information.

Workspaces are meant for active data and are configured for high aggregate bandwidth (across all files and nodes at once) at the expense of robustness. The bandwidth for single files varies based on the underlying filesystem and how they are stored, please check the filesystem pages for more details.

Common workflows are to use workspaces for temporary data, or store data in a compact form in a Project or COLD data store and then copy and/or decompress it to a higher performance workspace to be used for a short period of time. The workspaces have a generous quota for the whole project (the finite lifetimes of workspaces help protect against running out of space as well).
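As a hedged sketch of this staging workflow for an NHR project (the data store, workspace name, paths, and duration are all examples; the individual workspace commands are explained in the sections below):

WS_PATH=$(ws_allocate -F lustre-rzg md-run 14)                 # create a 14-day workspace
tar -xf "$COLD/inputs/system1.tar" -C "$WS_PATH"               # stage compressed input data from COLD
# ... submit and run the job chain that reads and writes inside $WS_PATH ...
tar -czf "$COLD/results/system1.tar.gz" -C "$WS_PATH" output/  # pack results back to COLD
ws_release -F lustre-rzg md-run                                # release the workspace when done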

The data stores for workspaces available to each kind of project are:

| Kind of Project | Name | Filesystem | Purpose |
|---|---|---|---|
| SCC, NHR, REACT | Ceph HDD WS (ceph-hdd) | CephFS | Medium-term storage for NHR Test Accounts and for projects that have a temporary demand for more storage space |
| SCC, NHR, REACT | Ceph SSD WS (ceph-ssd) | CephFS | Significantly faster than Ceph HDD, useful for jobs that need storage available from all Cluster Islands |
| NHR | Lustre RZG WS (lustre-rzg) | Lustre | High performance WORK filesystem for users of Grete and Emmy P3 |
| NHR | Lustre MDC WS (lustre-mdc) | Lustre | High performance WORK filesystem for users of Emmy P2 |

Workspace Basics

The six basic commands to handle workspaces and manage their lifecycles are:

| Command | Description |
|---|---|
| ws_allocate | Create a workspace |
| ws_extend | Extend a workspace's expiration date |
| ws_list | List all workspaces or available data stores for them |
| ws_release | Release a workspace (files will be deleted after a grace period) |
| ws_restore | Restore a previously released workspace (if in grace period) |
| ws_register | Manage symbolic links to workspaces |

All six commands have help messages accessible by COMMAND -h and man pages accessible by man COMMAND.

Note

None of the workspace commands except for ws_list and ws_register work inside user namespaces such as created by running containers of any sort (e.g. Apptainer) or manually with unshare. This includes the JupyterHPC service and HPC desktops.

Workspaces are created with the requested expiration time (each data store has a maximum allowed value). The default expiration time if none is requested is 1 day. Workspaces can have their expiration extended a limited number of times. After a workspace expires, it is released. Released workspaces can be restored for a limited grace period, after which the data is permanently deleted. Note that released workspaces still count against your filesystem quota during the grace period. All workspace tools use the -F <name> option to control which data store to operate on, where the default depends on the kind of project. The various limits, default data store for each kind of project, as well as which cluster islands each data store is meant to be used from and their purpose/specialty are:

| Name | Default | Islands | Purpose/Specialty | Time Limit | Extensions | Grace Period |
|---|---|---|---|---|---|---|
| ceph-ssd | NHR, SCC, REACT | all | all-rounder | 30 days | 2 (90 days max lifetime) | 30 days |
| ceph-hdd | | all | large size | 60 days | 5 (360 days max lifetime) | 30 days |
| lustre-mdc | | Emmy P2, Emmy P1 | High bandwidth for medium and big files | 30 days | 2 (90 days max lifetime) | 30 days |
| lustre-rzg | | Grete, Emmy P3 | High bandwidth for medium and big files | 30 days | 2 (90 days max lifetime) | 30 days |
Note

Only workspaces on data stores mounted on a particular node are visible and can be managed (allocate, release, list, etc.). If the data store that is the default for your kind of project is not available on a particular node, the special DONT_USE data store will be the default that doesn’t support allocation (you must then specify -F <name> in all cases). See Cluster Storage Map for more information on which filesystems are available where.

Managing Workspaces

Allocating

Workspaces are created via

ws_allocate [OPTIONS] WORKSPACE_NAME DURATION

The duration is given in days and workspace names are limited to ASCII letters, numbers, dashes, dots, and underscores. The most important options (run ws_allocate -h to see the full list) are:

  -F [ --filesystem ] arg   filesystem
  -r [ --reminder ] arg     reminder to be sent n days before expiration
  -m [ --mailaddress ] arg  mailaddress to send reminder to
  -g [ --group ]            group workspace
  -G [ --groupname ] arg    groupname
  -c [ --comment ] arg      comment

Use --reminder <days> --mailaddress <email> to be emailed a reminder the specified number of days before the workspace expires. Use --group --groupname <group> to make the workspace readable and writable by the members of the specified group, however this only works for members of the group that are also members of the same project. Members of other projects (than the username you used to create the workspace) cannot access it, even if you have a common POSIX group and use the group option. Thus, usually the only value that makes sense is the group HPC_<project>, which can be conveniently generated via "HPC_${PROJECT_NAME}" in the shell. If you run ws_allocate for a workspace that already exists, it just prints its path to stdout, which can be used if you forgot the path (you can also use ws_list).

Note

Workspace names and their paths are not private. Any user on the cluster can see which workspaces exist and who created them. However, other usernames cannot access workspaces unless the workspace was created with --group --groupname <group> and they are both a member of the same project and of that group.

To create a workspace named MayData on ceph-ssd with a lifetime of 6 days which emails a reminder 2 days before expiration, you could run:

~ $ ws_allocate -F ceph-ssd -r 2 -m myemail@example.com MayData 6
Info: creating workspace.
/mnt/ceph-ssd/workspaces/ws/nhr_internal_ws_test/u17588-MayData
remaining extensions  : 2
remaining time in days: 6

The path to the workspace is printed to stdout while additional information is printed to stderr. This makes it easy to get the path and save it as an environment variable:

~ $ WS_PATH=$(ws_allocate -F ceph-ssd -r 2 -m myemail@example.com MayData 6)
Info: creating workspace.
remaining extensions  : 2
remaining time in days: 6
~ $ echo $WS_PATH
/mnt/ceph-ssd/workspaces/ws/nhr_internal_ws_test/u17588-MayData

You can set defaults for the duration as well as the --reminder, --mailaddress, and --groupname options by creating a YAML config file at $HOME/.ws_user.conf formatted like this:

duration: 15
groupname: HPC_foo
reminder: 3
mail: myemail@example.com

Listing

Use the ws_list command to list your workspaces, the workspaces made available to you by other users in your project, and the available datastores. Use the -l option to see the available data stores for your username on the particular node you are currently using:

~ $ ws_list -l
available filesystems:
ceph-ssd
lustre-rzg
DONT_USE (default)

Note that the special unusable location DONT_USE is always listed as the default, even if the actual default data store for your kind of project is available.

Running ws_list by itself lists your workspaces that can be accessed from the node (not all data stores are available on all nodes). Add the -g option to additionally list the ones made available to you by other users. If you run ws_list -g after creating the workspace in the previous example, you would get:

~ $ ws_list -g
id: MayData
     workspace directory  : /mnt/ceph-ssd/workspaces/ws/nhr_internal_ws_test/u17588-MayData
     remaining time       : 5 days 23 hours
     creation time        : Thu Jun  5 15:01:14 2025
     expiration date      : Wed Jun 25 15:01:14 2025
     filesystem name      : ceph-ssd
     available extensions : 2

Extending

The expiration time of a workspace can be extended with ws_extend up to the maximum time allowed for an allocation on the chosen data store. It is even possible to reduce the expiration time by requesting a value lower than the remaining duration. The number of times a workspace on a particular data store can be extended is also limited (two times for most of our data stores, see the table above). Workspaces are extended by running:

ws_extend -F DATA_STORE WORKSPACE_NAME DURATION

Don’t forget to specify the data store with -F <data-store>. For example, to extend the workspace allocated in the previous example to 20 days, run:

~ $ ws_extend -F ceph-ssd MayData 20
Info: extending workspace.
Info: reused mail address example@example.com
/mnt/ceph-ssd/workspaces/ws/nhr_internal_ws_test/u17588-MayData
remaining extensions  : 1
remaining time in days: 20

Releasing

A workspace can be released before its expiration time by running:

ws_release -F DATA_STORE [OPTIONS] WORKSPACE_NAME

The most important option here is --delete-data which causes the workspace’s data to be deleted immediately (remember, the data stores for workspaces have NEITHER backups NOR snapshots, so the data is lost forever). Otherwise, the workspace will be set aside and remain restorable for the duration of the grace period of the respective data store.

Note

Workspaces released without the --delete-data option still count against your project’s quota until the grace period is over and they are automatically cleaned up.

Restoring

A released or expired workspace can be restored within the grace period using the ws_restore command. Use ws_restore -l to list your restorable workspaces and to get their full IDs. If the previously created example workspace was released, you would get:

~ $ ws_restore -l
ceph-ssd:
u17588-MayData-1749129222
        unavailable since Thu Jun  5 15:13:42 2025
lustre-rzg:
DONT_USE:

Note that the full ID of a restorable workspace includes your username, the workspace name, and the unix timestamp from when it was released. In order to restore a workspace, you must first have another workspace available on the same data store to restore it into. Then, you would call the command like this:

ws_restore -F DATA_STORE WORKSPACE_ID_TO_RESTORE DESTINATION_WORKSPACE

and it will ask you to type back a set of randomly generated characters before restoring (restoration is interactive and is not meant to be scripted). The workspace being restored is placed as a subdirectory in the destination workspace with its ID. Using the previous example, one could create a new workspace MayDataRestored and restore the workspace to it via:

~ $ WS_DIR=$(ws_allocate -F ceph-ssd MayDataRestored 30)
Info: creating workspace.
remaining extensions  : 2
remaining time in days: 30
~ $ ws_restore -F ceph-ssd u17588-MayData-1749129222 MayDataRestored
to verify that you are human, please type 'tafutewisu': tafutewisu
you are human
Info: restore successful, database entry removed.
~ $ ls $WS_DIR
u17588-MayData-1749129222

Keeping track of the paths to each workspace can sometimes be difficult. You can use the ws_register DIR command to set up symbolic links (symlinks) to all of your workspaces in the directory DIR. After doing that, each of your workspaces has a symlink <dir>/<datastore>/<username>-<workspacename>.

~ $ mkdir ws-links
~ $ ws_register ws-links
keeping link  ws-links/ceph-ssd/u17588-MayDataRestored
~ $ ls -lh ws-links/
total 0
drwxr-x--- 2 u17588 GWDG 4.0K Jun  5 15:47 DONT_USE
drwxr-x--- 2 u17588 GWDG 4.0K Jun  5 15:47 ceph-ssd
drwxr-x--- 2 u17588 GWDG 4.0K Jun  5 15:47 lustre-rzg
~ $ ls -lh ws-links/*
ws-links/DONT_USE:
total 0

ws-links/ceph-ssd:
total 0
lrwxrwxrwx 1 u17588 GWDG 71 Jun  5 15:47 u17588-MayDataRestored -> /mnt/ceph-ssd/workspaces/ws/nhr_internal_ws_test/u17588-MayDataRestored

ws-links/lustre-rzg:
total 0

Temporary Storage

Temporary storage can be used to reduce the IO load on slower data stores, which improves job performance and reduces the impact on other users. A common workflow is

  1. Copy files that must be read multiple times, particularly in random order, into the temporary storage.
  2. Do computations, keeping the most intense operations in the temporary storage (e.g. output files that have to be written over many times or in a random order).
  3. Copy the output files that need to be kept from the temporary storage to some data store.

In this workflow, the temporary storage is used as a staging area for the high intensity IO operations while the other storage locations get low intensity ones (e.g. reading a file beginning to end once in large chunks).
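The following job script fragment is a minimal sketch of this workflow (the data store path, file names and program name are placeholders; it assumes a node with a local SSD and uses the $LOCAL_TMPDIR variable described in the next subsection):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=01:00:00
#SBATCH -C ssd                      # request a node with a local SSD

DATA_DIR=/path/to/your/workspace    # placeholder for the workspace or data store holding your files

# 1. Stage the input files into the node-local temporary storage
cp "$DATA_DIR"/input.dat "$LOCAL_TMPDIR"/

# 2. Run the computation with the IO-intensive files kept in the temporary storage
cd "$LOCAL_TMPDIR"
"$HOME"/my_program input.dat output.dat   # placeholder for your application

# 3. Copy only the results that need to be kept back to the data store
cp output.dat "$DATA_DIR"/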

Info

For shared temporary storage that should last longer than a job (i.e. for several jobs or longer), please use workspaces.

Temporary Storage in a Job

Each batch job has several temporary storages available, which are listed in the table below along with their characteristics. A temporary directory is created in each available one, and its path is put into an environmental variable for convenient use. These directories are cleaned up (deleted) when the job ends (no need to manually clean up).

| Kind | Shared | Performance | Capacity | Environmental Variable |
|---|---|---|---|---|
| Shared Memory (RAM) | local | max | tiny | SHM_TMPDIR |
| Local SSD (not all nodes) | local | high | small | LOCAL_TMPDIR |
| SSD SCRATCH/WORK | global | medium-high | medium | SHARED_SSD_TMPDIR |
| HDD SCRATCH/WORK | global | medium | large | SHARED_TMPDIR |
Info

SHARED_SSD_TMPDIR and SHARED_TMPDIR are always created on the fastest SCRATCH/WORK filesystem available to the particular node (see Cluster Storage Map for which one that is), regardless of whether a user or their project would normally have a permanent data store or access to workspaces on it. For example, an SCC user’s job running on an Emmy Phase 3 node would have its temporary directories be created on SCRATCH RZG (“scratch-grete”) even though SCC users and projects do not get a permanent data store on SCRATCH RZG.

Note

The local SSD temporary storage is only available on nodes that have an SSD built in. Most nodes have at least one SSD, but a few don’t. To ensure that only nodes with an internal SSD are allocated to your job, add the constraint -C ssd in your srun command line or sbatch headers. Check CPU Partitions and GPU Partitions to see which partitions have nodes with SSDs and how many.

The environmental variable TMPDIR will be assigned to one of these. You can override that decision by setting it yourself to the value of one of them. For example, to use the local SSD, you would run the following in Bash:

export TMPDIR=$LOCAL_TMPDIR

The local temporary stores have the best performance because all operations stay within the node and don’t have to go over the network. But that inherently means that the other nodes can’t access them. The global temporary stores are accessible by all nodes in the same job, but this comes at the expense of performance. It is possible to use more than one of these in the same job (e.g. use local ones for the highest IO intensity files local to the node and a global one for things that must be shared between nodes). We recommend that you choose the local ones when you can (files that don’t have to be shared between nodes and aren’t too large).

Shared Memory

The fastest temporary storage is local shared memory (stored under /dev/shm), which is a filesystem in RAM. It has the lowest latency, and bandwidth can exceed 10 GiB/s in many cases, but it is also the smallest in capacity.

The size of the files you create in it counts against the memory requested by your job. So make sure to request enough memory in your job for your shared memory utilization (e.g. if you plan on using 10 GiB in it, you need to increase the --mem amount you request from Slurm by 10 GiB).
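For example, if an application needs 30 GiB on its own and will additionally place 10 GiB of files in $SHM_TMPDIR, the job header could contain something like the following (the numbers are purely illustrative):

#SBATCH --mem=40G    # 30G for the application itself + 10G for files in $SHM_TMPDIR (/dev/shm)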

Optimizing IO Performance

It is important to remember that the data stores are shared with other users and bad IO patterns can hurt not only the performance of your jobs but also that of other users. A general recommendation for distributed network filesystems is to keep the number of file metadata operations (opening, closing, stat-ing, truncating, etc.) and checks for file existence or changes as low as possible. These operations often become a bottleneck for the IO of your job and, if bad enough, can reduce the performance for other users. For example, if jobs issue hundreds of thousands of metadata operations like open, close, and stat, this can cause a “slow” filesystem (unresponsiveness) for everyone, even when the metadata is stored on SSDs.

Therefore, we provide here some general advice to avoid making the data stores unresponsive:

  • Use the local temporary storage of the nodes when possible in jobs (see Temporary Storage for more information).
  • Write intermediate results and checkpoints as seldom as possible.
  • Try to write/read larger data volumes (>1 MiB) and reduce the number of files concurrently open.
  • For inter-process communication use proper protocols (e.g. MPI) instead of using files to communicate between processes.
  • If you want to control your jobs externally, consider using POSIX signals instead of files frequently opened/read/closed by your program (see the sketch after this list). You can send signals to jobs with scancel --signal=SIGNALNAME JOBID
  • Use MPI-IO to coordinate your I/O instead of each MPI task doing individual POSIX I/O (HDF5 and netCDF may help you with this).
  • Instead of using recursive chmod/chown/chgrp, please use a combination of find (note that Lustre has its own optimized lfs find) and xargs. For example, lfs find /path/to/folder | xargs chgrp PROJECTGROUP creates less stress than chgrp -R PROJECTGROUP /path/to/folder and is much faster.
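The following is a minimal sketch of the signal-based approach mentioned above (the signal choice, placeholder program name and checkpoint logic are illustrative assumptions, not a prescribed pattern). Note that for a trap in the batch script itself to receive the signal, scancel needs the --batch option:

#!/bin/bash
#SBATCH --time=01:00:00

# Write a marker file when SIGUSR1 arrives instead of polling a control file
trap 'touch checkpoint.requested' USR1

./my_program &       # run the application in the background so the shell can handle the trap
PID=$!
# wait returns early when a trapped signal is caught, so loop until the program has finished
while kill -0 "$PID" 2>/dev/null; do
    wait "$PID"
done

The signal would then be sent from a login node with scancel --batch --signal=USR1 JOBID.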

Analysis of Metadata Operations

An existing application can be investigated with respect to metadata operations. Let us assume an example job script for the parallel application myexample.bin with 16 MPI tasks.

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --time=01:00:00
#SBATCH --partition=standard96

srun ./myexample.bin

The Linux command strace can be used to trace IO operations by prepending it to the call that runs another program. strace then traces that program and creates two files per process (MPI task) with the results. For this example, 32 trace files are created. Large MPI jobs can create a huge number of trace files, e.g. a 128-node job with 128 x 96 MPI tasks created 24576 files. That is why we strongly recommend reducing the number of MPI tasks as far as possible when doing performance analysis.

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --time=01:00:00
#SBATCH --partition=standard96

srun strace -ff -t -o trace -e open,openat ./myexample.bin

Analysing one trace file shows all file open activity of one process (MPI task).

> ls -l trace.*
-rw-r----- 1 bzfbml bzfbml 21741 Mar 10 13:10 trace.445215
...
> wc -l trace.445215
258 trace.445215
> cat trace.445215
13:10:37 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
13:10:37 open("/lib64/libfabric.so.1", O_RDONLY|O_CLOEXEC) = 3
...
13:10:38 open("/scratch/usr/bzfbml/mpiio_zxyblock.dat", O_RDWR) = 8

For the interpretation of the trace file you need to differentiate between the calls originating from your code and the ones that are independent of it (e.g. every shared library your code uses, along with their shared libraries and so on, has to be opened at least once). The example code myexample.bin creates only one file, named mpiio_zxyblock.dat. Of the 258 open statements in the trace file, only one comes from the application itself, which indicates very desirable metadata activity.
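To get a quick overview of the opens in a trace file that do not come from loading shared libraries, a simple filter like the following can be used (the trace file name is taken from the example above):

grep open trace.445215 | grep -v '\.so' | wc -l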

Known Issues

Some codes have well known issues:

  • OpenFOAM: always set runTimeModifiable false and fileHandler collated with a sensible value for purgeWrite and writeInterval (see the OpenFOAM page)
  • NAMD: special considerations during replica-exchange runs (see the NAMD page)

If you have questions or you are unsure regarding your individual scenario, please get in contact with your consultant or start a support request.

Filesystem Specific Tips

Lustre

Some good best practices for using the Lustre SCRATCH/WORK data stores can be found at https://www.nas.nasa.gov/hecc/support/kb/lustre-best-practices_226.html

Technical Description

Each cluster provides several different storage systems that can be placed into the following rough categories:

| Category | Speed | Size | Durability | Size Project¹ | Size User¹ | Purpose |
|---|---|---|---|---|---|---|
| HOME | medium | medium | snapshots, backups | | ~ 40 GiB | User HOMEs |
| Project | medium | medium | snapshots, backups | ~ 3 TiB | | Medium term data |
| Workspaces | medium to fast | big | RAID only | ~ 10 / 40 TiB | | Active data |
| SCRATCH/WORK (²) | fast | big | RAID only | ~ 2 / 12 TiB³ | ~ 1 / 2 TB³ | Active data |
| COLD | medium | big | snapshots | ~ 10 TiB | | Medium term data |
| ARCHIVE/PERM | very slow | medium | backups | ~ 8 TiB | | Archival to tape |

[1]: These numbers are generalized figures and only show the soft limits.
[2]: Being phased out in favor of Workspaces. See SCRATCH/WORK for the phase out schedule.
[3]: These are the numbers for the SSD scratch and the HDD scratch, you can find more information here.

These are shared parallel filesystems accessible from many nodes, rather than node local storage (which the nodes also have). These filesystems also have quotas on the use of storage space (block quota) and inodes (every file and directory has an entry in the directory tree using one or more inodes).

Note

It is important to use the right storage system for the right purpose to get the best performance, have the fewest surprises, and cause the least impact on other users. For example, a job that has to do many IO operations on temporary data could perform terribly if those temporary files (e.g. opening a file, writing a small amount of data, closing the file, deleting the file repeatedly millions of times) were stored on HOME or in a SCRATCH/WORK filesystem, and could easily slow down the filesystem for everyone on the cluster.

See the pages below (also can be found in the nav bar to the left) for information about the different parts of the storage systems and how to use them.

Subsections of Technical Description

Data Stores

General Tips

Use the right data store for the right job. Each one has a different purpose with different engineering tradeoffs (e.g. performance vs. robustness), capacities, etc. Using a workspace, SCRATCH/WORK data store, or temporary storage as a staging area to do a lot of IO (copy data into it, do operations, copy/move results to a different data store) can often greatly improve performance.

It is important to remember that the data stores are shared with other users and bad IO patterns by a single user can hurt the performance for everyone. A general recommendation for distributed network filesystems is to keep the number of file metadata operations (opening, closing, stat-ing, truncating, etc.) and checks for file existence or changes as low as possible. These operations often become a bottleneck for the IO of your job, and if bad enough can reduce the performance for other users. For example, if jobs request hundreds of thousands metadata operations like open, close, and stat per job; this can cause a “slow” filesystem (unresponsiveness) for everyone even when the metadata is stored on SSDs. See the Optimizing Performance page for more information on how to get good performance on the data stores.

Data Lifetime after Project/User Expiration

In general, we store all data for an extra year after the end of a project or user account. If not extended, the standard term of a project is 1 year. Workspaces have their own individual lifetimes set when requesting and/or extending one. The standard term for a user account is the lifetime of the project it is a member in (lifetime of the last project for legacy NHR/HLRN accounts). Note that migrating a legacy NHR/HLRN project removes all legacy NHR/HLRN users from it (see the NHR/HLRN Project Migration for more information).

Project Data Stores

Every project gets one or more data stores to place data depending on the kind of project. In some cases, there will be more than one directory in the data store; but all share the same quota (see the Quota page for more information).

All projects in the HPC Project Portal have a Project Map with convenient symlinks to all the project’s data stores. The project-specific usernames of these projects have a symlink ~/.project to this directory. See the Project Map and Project Management pages for more information.

Projects get the categories of data stores in the table below, which are then described further in the subsections for each category.

| Data Store | SCC | NHR | KISSKI | REACT |
|---|---|---|---|---|
| Project | yes | yes | yes | yes |
| Workspaces | yes | yes | yes | |
| SCRATCH/WORK (phasing out) | yes | yes | | |
| COLD | | yes | | |
| ARCHIVE/PERM | | yes | | |
Warning

Permanent SCRATCH/WORK directories are being phased out in favor of workspaces. See the top of the SCRATCH/WORK page for their phase out schedule.

User Data Stores

Every user gets a HOME directory and potentially additional data stores to place configuration, data, etc. depending on the kind of user and the project they are a member of. Users get the categories of data stores in the table below, which are then described further in the subsections for each category.

| Data Store | SCC | NHR | KISSKI | REACT |
|---|---|---|---|---|
| HOME | yes | yes | yes | yes |
| SCRATCH/WORK (phasing out) | yes | yes | | |
| ARCHIVE/PERM | legacy only | legacy only | | |
Warning

Permanent SCRATCH/WORK directories are being phased out in favor of workspaces. See the top of the SCRATCH/WORK page for their phase out schedule.

Data Store Categories

Each category is discussed in its own page with links listed below.

Subsections of Data Stores

HOME

Every user gets a HOME directory. Its path is always stored in the environmental variable $HOME and most shells will expand ~ into it. The directory will be HOME_BASE/ACADEMIC_CLOUD_USERNAME/USER for members of projects in the HPC Project Portal and HOME_BASE/USER otherwise (e.g. legacy HLRN accounts), where HOME_BASE is one of the base directories for home directories (there is more than one and they may change in the future). The base directory shown by POSIX commands (like getent passwd) or in the environment variables can be a symbolic link (e.g. /user), which points to the actual base directory on the data store (e.g. /mnt/vast-nhr/home), or another link farm (/mnt/vast-orga) which only contains symlinks to the real home directories.

The HOME directory is meant for a user’s:

  • configuration files
  • source code
  • self-built software

The HOME storage systems have the following characteristics:

  • Optimized for a high number of files rather than capacity
  • Optimized for robustness rather than performance
  • Has limited disk space per user
  • Is regularly backed up to tape (most also have snapshots, see below)
  • Has a quota

The HOME filesystems have slow-medium to medium performance. The HOME directories for each kind of user are given in the table below.

| Kind of User | Media | Capacity | Filesystem |
|---|---|---|---|
| NHR | SSD | 1.15 PiB (shared)(comp)(dedup) | VAST exported via NFS |
| SCC (Project Portal users) | SSD | 1.15 PiB (shared)(comp)(dedup) | VAST exported via NFS |
| SCC (legacy) | HDD | 10.5 PiB | Stornext exported directly and via NFS |
| KISSKI | SSD | 1.15 PiB (shared)(comp)(dedup) | VAST exported via NFS |
| REACT | SSD | 1.15 PiB (shared)(comp)(dedup) | VAST exported via NFS |
Info

Legend for the tags in the Capacity column:

(shared): They share capacity with other data stores. For example, NHR HOME and Project data stores are on the same storage system.

(comp): Use live compression to increase effective capacity, though this comes at the expense of some CPU time to compress and decompress.

(dedup): Use deduplication to increase effective capacity.

Warning

Legacy SCC users have their HOME directories on the GWDG Unix HOME filesystem, which is only accessible from SCC Legacy nodes. See Cluster Storage Map for more information.

Snapshots

If you accidentally deleted or overwrote something in your home directory and want to restore an earlier version of the affected files, it is not always necessary to write a ticket. Most of our home filesystems save regular snapshots that are kept for a short period (between a few days and 3-4 weeks). These snapshots can be accessed for any directory by entering the hidden .snapshot directory, e.g. with cd .snapshot.

Info

The .snapshot directory is so hidden that it does not even show up with ls -a, and thus autocomplete on the command line does not work for it.
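For example, to restore a single accidentally deleted file from a snapshot (the snapshot and file names below are purely illustrative; the available snapshot names depend on the filesystem):

cd ~/some/directory
ls .snapshot/                                       # list the available snapshots
cp .snapshot/SNAPSHOT_NAME/important_file.txt .     # copy the old version back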

Project

The Project directories of projects are for

  • configuration files
  • source code
  • self-built software
  • medium term data storage

The Project storage systems have the following characteristics:

  • Optimized for a high number of files rather than capacity
  • Optimized for robustness rather than performance
  • Backed up and/or has snapshots
  • Has a quota

The Project filesystems have slow-medium to medium performance. The directory’s symlink in the Project Map directory for each project has the name dir.project. The Project directories for each kind of project are given in the table below.

| Kind of Project | Path | Media | Capacity | Filesystem |
|---|---|---|---|---|
| SCC | /mnt/ceph-hdd/projects/PROJECT | HDD with metadata on SSD | 21 PiB (shared) | CephFS |
| NHR | /mnt/vast-nhr/projects/PROJECT | SSD | 1.15 PiB (shared)(comp)(dedup) | VAST exported via NFS |
| KISSKI | /mnt/vast-kisski/projects/PROJECT | SSD | 1.15 PiB (shared)(comp)(dedup) | VAST exported via NFS |
| REACT | /mnt/vast-react/projects/PROJECT | SSD | 1.15 PiB (shared)(comp)(dedup) | VAST exported via NFS |
Info

Legend for the tags in the Capacity column:

(shared): They share capacity with other data stores. For example, NHR Project and HOME data stores are on the same storage system.

(comp): Use live compression to increase effective capacity, though this comes at the expense of some CPU time to compress and decompress.

(dedup): Use deduplication to increase effective capacity.

SCRATCH/WORK

Warning

Permanent long-lived SCRATCH/WORK directories are being phased out in favor of dynamically created workspaces with limited lifetimes. Their phase out schedule is:

| Data Store | Permanent directories no longer issued |
|---|---|
| SCRATCH SCC | 1st of October 2025 |
| SCRATCH RZG | 1st of October 2025 |

After this point, on SCRATCH RZG (formerly “SCRATCH Grete”) only workspaces will continue to be available (each workspace directory has a limited lifetime).

The SCRATCH SCC filesystem will be retired completely:

| Data Store | Unmounted from compute nodes | Shut off |
|---|---|---|
| SCRATCH SCC | 1st of January 2026 | 2nd of February 2026 |
Info

The old SCRATCH EMMY filesystems available at /scratch-emmy, /mnt/lustre-emmy-hdd and /mnt/lustre-emmy-ssd (as well as the symlink /scratch on the Emmy P2 nodes) were already retired and should no longer be used in compute jobs.

SCRATCH/WORK data stores are meant for active data and are configured for high performance at the expense of robustness. The characteristics of the SCRATCH/WORK data stores are:

  • Optimized for performance from the sub-clusters located in the same building
  • Optimized for high input/output bandwidth from many nodes and jobs at the same time
  • Optimized for a moderate number of medium to large files
  • Meant for active data (heavily used data with a relatively short lifetime)
  • Has a quota
  • Has NEITHER backups nor snapshots
Warning

The SCRATCH filesystems have NO BACKUPS. Their performance comes at the price of robustness, meaning they are more fragile than other systems. This means there is a non-negligible risk of data on them being completely lost if more than a few components/drives in the underlying storage fail at the same time.

The SCRATCH/WORK data stores for the SCC and NHR are shown in the table below and detailed by Project/User kind in separate subsections.

| Project/User Kind | Name | Media | Capacity | Filesystem |
|---|---|---|---|---|
| SCC | SCRATCH SCC | HDD with metadata on SSD | 2.1 PiB | BeeGFS |
| NHR | SCRATCH RZG (formerly “SCRATCH Grete”) | SSD | 509 TiB | Lustre |

SCC

Projects get a SCRATCH SCC directory at /scratch-scc/projects/PROJECT, which has a Project Map symlink with the name dir.scratch-scc. Users get a SCRATCH SCC directory at /scratch-scc/users/USER.

NHR

Each project gets 0-2 directories in each SCRATCH/WORK data store which are listed in the table below. New projects in the HPC Project Portal get the directories marked “new”. Legacy NHR/HLRN projects started before 2024/Q2 get the directories marked “legacy”. Legacy NHR/HLRN projects that have been migrated to the HPC Project Portal keep the directories marked “legacy” and get the directories marked “new” (they get both). See NHR/HLRN Project Migration for more information on migration.

| Project Data Store | Paths | Project Map symlink |
|---|---|---|
| SCRATCH RZG | /mnt/lustre-grete/projects/PROJECT (new), /scratch-grete/projects/PROJECT (legacy) | dir.lustre-grete (new), dir.scratch-grete (legacy) |

Users get two directories in each SCRATCH/WORK data store, except for legacy NHR/HLRN users which do not get them in the SCRATCH MDC (SSD) data store. They take the form SCRATCH/SUBDIR/USER, with SCRATCH/usr/USER being for the user’s files and SCRATCH/tmp/USER for temporary files (see Temporary Storage for more information). The directories in each data store are listed in the table below. Members of projects in the HPC Project Portal get the directories marked “new”. Legacy NHR/HLRN users get the directories marked “legacy”.

| User Data Store | Path |
|---|---|
| SCRATCH RZG | /mnt/lustre-grete/SUBDIR/USER (new), /scratch-grete/SUBDIR/USER (legacy) |

One of the most important things to keep in mind is that the NHR cluster itself is split between two computing centers. While they are physically close to each other, the inter-center latency is higher (speed of light plus more network hops) and the inter-center bandwidth lower (fewer fibers) than for intra-center connections. Each computing center has its own SCRATCH/WORK data store/s to provide maximum local performance in the computing center. The best performance is obtained by using the SCRATCH/WORK data store/s in the same computing center, particularly for IOPS.

The two centers are the MDC (Modular Data Center) and the RZG (Rechenzentrum Göttingen). The name of the computing center is in the name of the data store (e.g. “SCRATCH RZG” is at RZG). The sites for each sub-cluster are listed in the table below along which data store the symlink /scratch points to.

| Sub-cluster | Site (Computing Center) | Target of /scratch symlink | Symlink Target |
|---|---|---|---|
| Emmy Phase 3 | RZG | SCRATCH RZG | /scratch-grete |
| Grete Phase 1 | RZG | SCRATCH RZG | /scratch-grete |
| Grete Phase 2 | RZG | SCRATCH RZG | /scratch-grete |
| Grete Phase 3 | RZG | SCRATCH RZG | /scratch-grete |

See Cluster Storage Map for more information.

Info

SCRATCH RZG used to be known as “SCRATCH Grete” because it used to be that all of Emmy was in the MDC and all of Grete was in the RZG, which is no longer the case. This historical legacy can still be seen in the names of the mount points. The new Lustre filesystem in the MDC has carried the name lustre-mdc from the start (August 2025).

COLD

COLD data stores are meant for medium term data storage (months to a few years) and are configured for robustness rather than performance. The characteristics of COLD data stores are

  • Optimized for high capacity
  • Optimized for robustness rather than performance
  • Has a quota
  • Has snapshots
  • Has NO backups
Info

Snapshots can be used to restore accidentally deleted or corrupted data for a certain time window. Unlike backups, they do not protect against the loss of the entire filesystem.

The COLD filesystems have medium performance. The COLD data stores for each kind of project are:

| Kind of Project | Name | Media | Capacity | Filesystem |
|---|---|---|---|---|
| NHR | Ceph HDD Cold | HDD with metadata on SSD | 21 PiB (shared) | CephFS |
Info

Legend for the tags in the Capacity column:

(shared): They share capacity with other data stores. For example, NHR COLD and SCC Project data stores are on the same storage system.

The paths for the data store directories and their symlinks in the Project Map for each project are:

| Project Data Store | Path | Project Map symlink |
|---|---|---|
| Ceph HDD Cold | /mnt/ceph-hdd/cold/PROJECT | dir.ceph-hdd |

ARCHIVE/PERM

The magnetic tape archive provides additional storage for inactive data for short-term archival and to free up space in the Project, HOME, and SCRATCH/WORK data stores. It is only accessible from the login nodes. Its capacity grows as more tapes are added. Its characteristics are

  • Secure file system location on magnetic tapes
  • Extremly high latency per IO operation, especially for reading data not in the HDD cache (minutes to open a file)
  • Optimized for a small number of large files
  • Short-term archival
  • Has a quota
Warning

ARCHIVE/PERM are SHORT-TERM archives, NOT long-term archives. Thus,

  • It is not a solution for long-term data archiving.
  • There is no guarantee for 10 years according to rules for good scientific practice.

For reasons of efficiency and performance, small files and/or complex directory structures should not be transferred to the archive directly. Please aggregate your data into compressed tarballs or other archive containers with a maximum size of 5.5 TiB before copying your data to the archive. For large data, a good target size is 1-2 TiB per file because such files will usually not be split across more than one tape.
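For example, a results directory could be bundled into a single compressed tarball before copying it to the project's archive directory (the source path and file names are placeholders; the archive path follows the project path given in the table below):

# bundle the directory "results" into one compressed tarball
tar -czf results_2025.tar.gz -C /path/to/your/datastore results
# copy the tarball to the project archive directory
cp results_2025.tar.gz /perm/projects/PROJECT/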

The project ARCHIVE/PERM data stores for each kind of project are given in the table below. The directory’s symlink in the Project Map directory for each project has the name dir.perm.

| Kind of Project | Path | Media | Capacity | Filesystem |
|---|---|---|---|---|
| NHR | /perm/projects/PROJECT | tape with HDD cache | growable PiBs | Stornext exported via NFS |

The user ARCHIVE/PERM data stores for each kind of user are given in the table below.

| Kind of User | Path | Media | Capacity | Filesystem |
|---|---|---|---|---|
| SCC (legacy only) | /usr/users/a/USER | tape with HDD cache | growable PiBs | Stornext exported via NFS |
| NHR (legacy only) | /perm/USER | tape with HDD cache | growable PiBs | Stornext exported via NFS |
Warning

ARCHIVE/PERM directories are only accessible on the login nodes, but each group of login nodes has access to only one of them. See Cluster Storage Map for more information.

Ceph

The CephFS based systems provide the volume storage for NHR and SCC users. It is mounted at /mnt/ceph-hdd (Ceph HDD) and /mnt/ceph-ssd (Ceph SSD).

Ceph is connected to Emmy P2, Emmy P3, and Grete with 200 Gbit/s of aggregate bandwidth each, so it can be used to transfer larger amounts of data between compute islands. Individual transfers are of course limited by the load caused by other users and by individual network interfaces, and will typically be much slower.

The Ceph HDD system stores data on harddrives and metadata on SSDs and has a total capacity of 21 PiB. Data that is no longer actively used by compute jobs should be stored here. It is available to SCC and NHR projects by default, and NHR test accounts can request storage via the Workspaces system.

The Ceph SSD system uses only SSDs and has a total capacity of 606 TiB. Storage on this system can be requested via the Workspaces system.

Ceph should not be used for heavy parallel I/O from compute jobs; storage systems integrated into the compute islands, like the Lustre systems, are usually more suitable for this purpose. The exception is workloads that need to access storage from multiple compute islands (e.g. job chains that run on both phases of Emmy or on both Emmy and Grete), for which the Ceph SSD system can be used.

Lustre

There are two different Lustre filesystems corresponding to our data center locations to provide maximum local performance.

While they are physically close to each other, connections between the buildings have a higher latency and lower throughput.

The best performance is achieved using the Lustre filesystem in the same building, particularly for IOPS.

The Lustre filesystems are optimized for high input/output bandwidth from many nodes at the same time and a moderate number of large files, i.e. hot data that is actively used by compute jobs.

NHR project members can request Lustre storage via the Workspaces system or, in special cases, as Requestable Storage.

Lustre MDC

Located in the mobile data center and connected to Emmy Phase 2. The total capacity is 1.6 PiB. It is mounted at /mnt/lustre-mdc on these nodes, with a symlink from /scratch.

Lustre RZG

Located in the RZGö and connected to Grete and Emmy Phase 3. The total capacity is 509 TiB. It is mounted at /scratch-grete and /mnt/lustre-grete on these nodes, with a symlink from /scratch to /scratch-grete.

Info

Keep in mind that /scratch points to different filesystems on different phases of Emmy. You can use the Ceph SSD filesystem for files that need to be accessed with good performance from all systems.

Performance

The best performance can be reached with sequential IO of large files that is aligned to the fullstripe size of the underlying RAID6 (1 MiB).

If you are accessing a large file (1+ GiB) from multiple nodes in parallel, please consider setting the striping of the file with the Lustre command lfs setstripe with a sensible stripe-count (recommend up to 32) and a stripe-size which is a multiple of the RAID6 fullstripe size (1 MiB) and matches the IO sizes of your job.

This can be done to a specific file or for a whole directory. But changes apply only for new files, so applying a new striping to an existing file requires a file copy.

An example of setting the stripe size and count is given below (run man lfs-setstripe for more information about the command).

lfs setstripe --stripe-size 1M --stripe-count 16 PATH
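Since a new striping only applies to newly created files, one way to re-stripe an existing large file is to create a new file with the desired layout first and then copy the data into it, roughly as sketched below (file names are placeholders):

lfs setstripe --stripe-size 1M --stripe-count 16 bigfile.restriped   # create an empty file with the new layout
cp bigfile bigfile.restriped                                         # copy the data into the pre-created file
mv bigfile.restriped bigfile                                         # replace the original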

Tape Archive

The tape archive is available for NHR users. It is mounted at /mnt/snfs-hpc-hsm with a symlink from /perm.

The tape archive is using the StorNext Shared Storage File System from Quantum. It has a HDD cache (which is the capacity displayed by df), and the data from the cache is then written to a large library of magnetic tapes that can grow to multiple PiB.

This data store is very reliable and can be used as a short-term archive over the lifetime of your project. For long term archival (10 years) of scientific results (e.g. to follow the rules for good scientific practice) please download the data to your home institution instead. Opening a file can take several minutes because of the very high latency of a tape library.

Usage

For reasons of efficiency and performance, small files and/or complex directory structures should not be transferred to the archive directly. Please aggregate your data into compressed tarballs or other archive containers with a maximum size of 5.5 TiB before copying your data to the archive. For large data, a good target size is 1-2 TiB per file because such files will usually not be split across more than one tape.

Data Stores

Vast

The VAST Data storage system is an NVMe-based all-flash storage system best suited for read-intensive workloads that require consistently low latencies (e.g. AI training). This storage system can be accessed with RDMA and GPU Direct Storage (cuFile API) from the GPU nodes of Grete and is also available on all other compute islands.

The VAST system is backing our central software installation, all home directories and the $PROJECT directories for NHR and KISSKI. It is also used for data pools and our AI services.

The total capacity is 1.9 PiB.

Project Map

All projects managed in the HPC Project Portal have a special directory under /projects containing symlinks to all the project’s data stores and a subdirectory for each sub-project. Specifically, the whole project tree (projects and their sub-projects) is represented as sub-directories under /projects matching the Project Structure.

In the HOME directory of every project-specific username in a project, there is the symlink ~/.project to the map directory for the project. In this directory, the symlinks to the project’s directories in each data store take the form dir.DATASTORENAME while the directories for sub-projects match the HPC Project IDs of the sub-projects. The project map directories are read-only and thus cannot be used to store data. They are provided as a convenience for accessing project data stores quickly with paths like ~/.project/dir.lustre-emmy-hdd.

For example, the project map directory for the intern/gwdg_training project is

[foo@glogin11 ~]$ ls -l /projects/intern/gwdg_training/ \
> | sed -e 's/  */ /g' | cut -d ' ' -f 1,9-
total
drwxr-xr-x 20240423-gpu-programming
drwxr-xr-x 20240522-snakemake-hpc
drwxr-xr-x 20240528-debugging-openfoam
drwxr-xr-x 20240606-data-management-hpc
drwxr-xr-x 20240612-hpda-p2
drwxr-xr-x 20240617-using-the-scc
drwxr-xr-x 20240618-ansys-on-cluster
drwxr-xr-x 20240702_qc
drwxr-xr-x 20240708-getting-linux-bash
drwxr-xr-x academy_containers_20240530
drwxr-xr-x academy_dummy_111111
lrwxrwxrwx dir.lustre-emmy-hdd -> /mnt/lustre-emmy-hdd/projects/gwdg_training
lrwxrwxrwx dir.lustre-emmy-ssd -> /mnt/lustre-emmy-ssd/projects/gwdg_training
lrwxrwxrwx dir.lustre-grete -> /mnt/lustre-grete/projects/gwdg_training
lrwxrwxrwx dir.project -> /home/projects/gwdg_training
drwxr-xr-x perspectives-parallel-io

This shows the directories for the sub-projects for each course and the dir.DATASTORENAME symlinks for each of its data store directories.

Requestable Storage

Storage on various filesystems is available on request if a particular use case does not fit the existing options. For example, some projects require longer term storage on a Lustre filesystem than the Workspaces can provide. Others need more storage on the VAST SSD system for handling large numbers of small files when it is not possible to bundle these into individual files that consume fewer inodes.

In such cases, the requestable data stores can be made available to projects and users via support request, which should include an explanation of the use case and why the existing options are not sufficient.

Data Transfers

We support file transfer tools such as rsync and scp, which use the SSH protocol to establish and encrypt the connection. For this reason, a working SSH connection is a prerequisite for most methods described here. Each of the following sections deals with a specific “direction” for establishing the transfer connection. Independent of this direction, i.e. the machine you start the rsync (or similar) command from, data can always be transferred from or to the target host.

Data Transfers Connecting from the Outside World

It is highly recommended to specify shorthands for target hosts in your ssh config file, as laid out here. Those shorthands can also be used with rsync (recommended) or scp, making their use much easier and more comfortable.

rsync -av /home/john/data_files Emmy-p3:/mnt/lustre-grete/usr/u12345/
Info

Also see the “Tips and Tricks” section below for a quick description of rsync’s command line arguments/flags.

If necessary, the location of the private key file can also be specified explicitly when calling scp or rsync on the user’s local machine.

Using scp, the option -i <path_to_privatekey> can be added:

scp -i <path_to_privatekey> <user>@glogin.hpc.gwdg.de:<remote_source> <local_target>
scp -i <path_to_privatekey> <local_source> <user>@glogin.hpc.gwdg.de:<remote_target>

With rsync, it is a bit more tricky, using a nested ssh command -e 'ssh -i <path_to_privatekey>' like this:

rsync -av -e 'ssh -i <path_to_privatekey>' <user>@glogin.hpc.gwdg.de:<remote_source> <local_target>
rsync -av -e 'ssh -i <path_to_privatekey>' <local_source> <user>@glogin.hpc.gwdg.de:<remote_target>

<local_source> and <remote_source> can be either single files or entire directories.
<local_target> and <remote_target> should be a directory; we recommend always adding a slash / at the end. That avoids accidentally overwriting a file of that name, works if the target is a symlink, and is generally more robust.

For rsync, having a trailing slash or not for the source determines if the directory including its contents or just the contents should be copied.
For scp, if the source is a directory, you have to use the -r switch to recursively copy the directory and its contents.
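To illustrate the trailing-slash behavior of rsync with the shorthand and paths from the example above (purely illustrative):

rsync -av /home/john/data_files  Emmy-p3:/mnt/lustre-grete/usr/u12345/   # creates .../u12345/data_files/ with the contents inside
rsync -av /home/john/data_files/ Emmy-p3:/mnt/lustre-grete/usr/u12345/   # copies only the contents of data_files into .../u12345/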

Data Transfers Connecting to the Outside World

Connections to external machines using the standard port 22 located anywhere in the world can be established interactively from the login nodes. An SSH key pair may or may not be required to connect to external hosts, and additional rules imposed by the external host or institution may apply.

Warning

We do not allow private ssh keys on the cluster! For security reasons, private key files should never leave your local machine!

In order to still be able to use a private key residing on your local machine to establish connections from the cluster to external hosts, you can use an SSH agent. The agent will act as a proxy that forwards requests to access your private key to your local machine and sends back the result.

Here is an example of using an SSH agent:

john@doe-laptop ~ $ eval $(ssh-agent)
Agent pid 345678
john@doe-laptop ~ $ ssh-add ~/.ssh/private_key_for_zib
Identity added: .ssh/private_key_for_zib (john@doe-laptop)
john@doe-laptop ~ $ ssh -A u12345@glogin-p3.hpc.gwdg.de -i ~/.ssh/private_key_for_gwdg
Last login: Thu May  1 11:44:21 2025 from 12.34.56.78
Loading software stack: gwdg-lmod
Found project directory, setting $PROJECT_DIR to '/projects/extern/nhr/nhr_ni/nhr_ni_test/dir.project'
Found scratch directory, setting $WORK to '/mnt/lustre-grete/usr/u12345'
Found scratch directory, setting $TMPDIR to '/mnt/lustre-grete/tmp/u12345'
 __          ________ _      _____ ____  __  __ ______   _______ ____
 \ \        / /  ____| |    / ____/ __ \|  \/  |  ____| |__   __/ __ \
  \ \  /\  / /| |__  | |   | |   | |  | | \  / | |__       | | | |  | |
   \ \/  \/ / |  __| | |   | |   | |  | | |\/| |  __|      | | | |  | |
    \  /\  /  | |____| |___| |___| |__| | |  | | |____     | | | |__| |
  _  \/ _\/  _|______|______\_____\____/|_|  |_|______|____|_|__\____/
 | \ | | |  | |  __ \     ____    / ____\ \        / /  __ \ / ____|
 |  \| | |__| | |__) |   / __ \  | |  __ \ \  /\  / /| |  | | |  __
 | . ` |  __  |  _  /   / / _` | | | |_ | \ \/  \/ / | |  | | | |_ |
 | |\  | |  | | | \ \  | | (_| | | |__| |  \  /\  /  | |__| | |__| |
 |_| \_|_|  |_|_|  \_\  \ \__,_|  \_____|   \/  \/   |_____/ \_____|
                         \____/

 Documentation:  https://docs.hpc.gwdg.de
 Support:        nhr-support@gwdg.de

PARTITION    NODES (BUSY/IDLE)     LOGIN NODES
medium96s          95 /  296     glogin-p3.hpc.gwdg.de
standard96        745 /  245     glogin-p2.hpc.gwdg.de
Your current login node is part of glogin-p3
[nhr_ni_test] u12345@glogin11 ~ $ echo $SSH_AUTH_SOCK
/tmp/ssh-jxWgVZrgW5/agent.2462718
[nhr_ni_test] u12345@glogin11 ~ $ ssh nimjdoe@blogin.nhr.zib.de
Warning: Permanently added 'blogin.nhr.zib.de,130.73.234.2' (ECDSA) to the list of known hosts.

********************************************************************************
*                                                                              *
*               Welcome to NHR@ZIB system "Lise" on node blogin2               *
*               (Rocky Linux 9.5, Environment Modules 5.4.0)                   *
*                                                                              *
*  Manual   ->  https://user.nhr.zib.de                                        *
*  Support  ->  mailto:support@nhr.zib.de                                      *
*                                                                              *
********************************************************************************

Module NHRZIBenv loaded.
Module sw.clx.el9 loaded.
Module slurm (current version 24.11.5) loaded.
blogin2:~ $ exit
[nhr_ni_test] u12345@glogin11 ~ $ rsync -avP data_gwdg nimjdoe@blogin.nhr.zib.de:/scratch/usr/nimjdoe/
[...]
Note

When setting the agent up, it is strongly recommended to add the key with ssh-add -c <path-to-private-key>, if your system supports it. This way, whenever the remote machine needs your key, you will get a confirmation dialog on your local machine asking whether you want to allow it or not.

Without -c, you have no chance of noticing suspicious requests to use your key from the remote machine, although those are highly unlikely.

Some desktop environments are known to have problems (for example some versions of gnome-keyring do not play nice with ssh-agent confirmations), so please try if it works for you and leave out the -c option if it does not. You may have to install additional packages, like ssh-askpass on Ubuntu/Debian-based distributions.

Should you ever get the confirmation dialog at a time you didn’t initiate an SSH connection on the remote machine, someone on the remote machine is trying to use your key. As our admins will never try to steal your key, this probably means the login node or at least your session was compromised by an attacker. Deny the request and please contact our support immediately letting us know about the potentially compromised node (don’t forget to include details like the node name, your username, what exact commands you ran, etc.).

Data transfer in the context of a batch job is restricted due to limited network access of the compute nodes. If possible, avoid connections to the outside world within jobs, otherwise send a message to our support in case you need further help.

Data Transfers within our HPC cluster

You can generally use any login node* to copy or move files between different filesystems, data stores and directories. For larger transfers, it is important to select a login node with a fast connection to both the source and destination data store. Since our nodes are part of different cluster islands, which are located in separate buildings, the connection speed and bandwidth to each data store can differ significantly. Refer to the Cluster Storage Map to get an overview.

We recommend using a terminal multiplexer session to start larger transfers, see the bottom of this page for details.

Note

Please do not start more than two or three larger copy or move operations in parallel, as this can quickly eat up all network bandwidth and make at least the login node you’re using and potentially more nodes using the same network links very slow for all users!

See our Managing Permissions page and Data Migration Guide for details regarding ownership and permissions to access directories across user accounts or from multiple users.

[*]: An exception to this is the legacy SCC home filesystem (Stornext), which is only connected to the legacy SCC login and compute nodes.

Please use those nodes to transfer data between /usr/users/, /home/uniXX/ or /home/mpgXX/ and the other filesystems.

If you have terabytes of data that need to be transferred, please contact us so we can provide a custom solution.

Data Transfer via ownCloud

It is possible to transfer data between GWDG’s ownCloud service and the HPC systems.

Using rclone with WebDAV provides, in general, three different methods to access your data on ownCloud. The first one must not be used: you are not allowed to store the password of your GWDG account on the HPC frontend nodes to access your personal ownCloud space. This is highly unsafe, never do this!

The second option is better, but also not recommended: you can create a special device password in ownCloud which will only work for ownCloud. However, this still gives full access to all your documents in ownCloud. If you want to do it anyway: you can create a dedicated password for ownCloud here. Then encrypt your rclone configuration file by executing rclone config and hitting s to select Set configuration password. Your configuration is now encrypted, and every time you start rclone you will have to supply the password. Keep in mind that this still provides full access to your ownCloud account and all data within.

For security reasons, more fine-grained access can be authorized using the following, recommended method: To share data from ownCloud to HPC, only provide access to specific folders on ownCloud. For this purpose, create a public link to the folder, which should be secured with a password. The steps are as follows:

  1. Create a public link to the folder

    • Navigate ownCloud until the folder is visible in the list
    • Click on ..., then Details, select the Sharing tab, Public Links, then Create Public Link
    • Set link access depending on use case. Options are:
      • Download / View
      • Download / View / Upload
      • Download / View / Upload / Edit
    • Set a password and expiration date (highly recommended). This password will be referred to as <folder_password> in this document
  2. Extract the folder ID from the public link, which is required for the next steps. The last portion of the link is the folder ID, i.e., https://owncloud.gwdg.de/index.php/s/<folder_id>

  3. On the HPC system, load the rclone module: module load rclone

  4. Run these commands on the HPC to download from or upload to the folder in ownCloud:

    Download

    rclone copy --webdav-url=https://owncloud.gwdg.de/public.php/webdav --webdav-user="<folder_id>" --webdav-pass="$(rclone obscure '<folder_password>')" --webdav-vendor=owncloud :webdav: <local_dir>

    Upload

    rclone copy --webdav-url=https://owncloud.gwdg.de/public.php/webdav --webdav-user="<folder_id>" --webdav-pass="$(rclone obscure '<folder_password>')" --webdav-vendor=owncloud <local_dir> :webdav:

    Where <folder_id> is the ID extracted from the public link, <folder_password> is the password that was set when creating the public link, and <local_dir> is the local folder to synchronize with the folder in ownCloud.

  5. When it’s not required anymore, remove the public link from the folder in ownCloud.

Tips and Tricks

Longer Data Transfers

Copying data can take a long time when you move large amounts at once. When using rsync, use the option -P (short for --partial --progress), which allows you to resume interrupted transfers at the point they stopped and shows helpful information like the transfer speed and % complete for each file. When running transfer operations on our login nodes, if you don’t want to or can’t keep the terminal session open for the whole duration, there is the option to use a terminal multiplexer like tmux or screen. If your network connection is interrupted from time to time, it is strongly recommended to always run larger transfers in a tmux or screen session.

A guide on how to use both can be found on the terminal multiplexer page. Make sure you reconnect to the same login node to resume the session later!
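A minimal example of such a session (the session name and paths are arbitrary):

tmux new -s transfer                 # start a new tmux session on the login node
rsync -avP /path/to/source/ /path/to/target/
# detach with Ctrl-b d; later, reconnect to the SAME login node and run:
tmux attach -t transfer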

rsync Command Line Arguments

rsync is a very powerful and versatile tool and accordingly, it has a lot of options and switches to finetune its behavior. The most basic ones that are useful in almost all circumstances are -a or --archive and -v or --verbose. The former is short for -rlptgoD (so you don’t have to remember those), which are various options to recursively copy directories and preserve most metadata (permissions, ownership, timestamps, symlinks and more). Notably, that does not include ACLs, extended attributes and times of last access. --verbose prints, as you might expect, the names or paths of each file and directory as they are copied.

Other often useful options include:
  • -P, as mentioned above, is useful or even critical for transfers of large files over slow or potentially unstable connections, but could hurt/slow down transfers of many small files.
  • -z or --compress compresses data in transfer, which can speed up transmission of certain types of data over slow connections, but can also (sometimes severely) slow down transmission of incompressible data or over fast connections.

For more options and information, please refer to the rsync manpage.

3-way Transfer (Remote to Remote)

While we generally do not recommend this method for many reasons, most of all because it’s inefficient and often very slow, there are cases where it can be a lot easier to transfer small amounts of data between two remote hosts you can both reach from your local machine. Basically, instead of establishing a direct connection between the two remotes, you channel all transferred data through your local machine. With scp, you can achieve this by using the switch -3.
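For example (host names and paths are placeholders):

scp -3 userA@cluster-one.example.org:/path/to/file userB@cluster-two.example.org:/target/dir/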

Running Large Language Models (LLMs) on Grete

This guide walks you through deploying and running Large Language Models (LLMs) on the Grete GPU cluster at GWDG. Whether you’re fine-tuning a transformer or serving a chatbot, this manual ensures you can harness Grete’s full potential.

Prerequisites

Before you begin, ensure the following:

  • You have an active HPC account with access to Grete.
  • You are familiar with SSH, Slurm, and module environments.
  • Your project has sufficient GPU quota and storage.
  • You have a working conda or virtualenv environment (recommended).

💡 For account setup and access, refer to the Getting Started guide.


1. Connect to Grete

ssh <username>@glogin-gpu.hpc.gwdg.de

Once logged in, you can submit jobs to the GPU nodes via Slurm.


2. Load Required Modules

Grete provides pre-installed modules.

module load miniforge3

🧠 Tip: Use module spider to explore available versions.


3. Set Up Your Environment

Create a virtual environment using conda:

cd $PROJECT
conda create --prefix ./llm-env python=3.11
source activate ./llm-env
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers accelerate

4. Prepare Your Script

Here’s a minimal example:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)

# Prepare input
input_text = "Once upon a time"
inputs = tokenizer(input_text, return_tensors="pt").to(device)  # Move inputs to same device

# Generate output
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

⚠️ Make sure your model fits into GPU memory. Use accelerate or bitsandbytes for optimization.


5. Submit a Slurm Job

Create a job script run_llm.sh:

#!/bin/bash
#SBATCH --job-name=llm-run
#SBATCH --partition=kisski
#SBATCH -G A100:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=40G
#SBATCH --time=02:00:00
#SBATCH --output=llm_output.log
#SBATCH -C inet

module load miniforge3

source activate $PROJECT/llm-env
python run_llm.py

Submit with:

sbatch run_llm.sh

6. Tips for Scaling

  • Use DeepSpeed, FSDP, or accelerate for multi-GPU training.
  • For inference, consider model quantization or ONNX export.
  • Monitor GPU usage with nvidia-smi (see the example below).
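One simple way to do this is to log GPU utilization from within the job script itself while the main program runs; a sketch (the interval and log file name are arbitrary):

nvidia-smi --loop=60 > gpu_usage.log 2>&1 &   # record GPU usage every 60 seconds in the background
python run_llm.py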

7. Using GWDG’s LLM Service

If you prefer not to run your own model, GWDG offers a hosted LLM service with a selection of models. See the GWDG LLM Service Overview for details.


Support

For help, contact the HPC support team at:

📧 hpc-support@gwdg.de

Data Sharing

This page describes how you can collaboratively work on data within the HPC system and make it available to other users or to the outside world.

When running the show-quota command, you will see all data stores your current user has access to, divided into “User” and “Project” stores. Every member of your HPC Project Portal project can access the same project stores you see listed there. You can configure access for other project members to files owned by you via POSIX group permissions. If you are not familiar with the POSIX permission model, please take a look at our introductory course to learn the basics.

Please read Managing Permissions for general advice and tips on how to best manage your files.

Methods for sharing data within the cluster

Here is the list of methods you can utilize to make your data available to other users of the GWDG HPC systems. We recommend that you consider each method in the order they are listed here.

Applying for a project

If you are planning to closely collaborate with people from across multiple working groups for an extended period of time or invite external collaborators for a specific research topic (that should not have access to all your working group’s data), consider applying for a dedicated project. An HPC project will include a shared project directory and allows for dynamically inviting users via the HPC Project Portal. In order to apply as an SCC user, see Applying for SCC projects, otherwise contact our support.

Using a common POSIX group

You can make a directory (or individual files) in one of your data stores available to other users by changing its group to one that has you and the other users as members.

chgrp -R <group> <directory>
chmod -R g+rX <directory>

Remember that users of that group also need to be able to enter (not necessarily read) all parent directories as well. See Managing Permissions for more information and details on how to use chgrp.

Info

If you want to share data with your own different usernames, use the group HPC_u_<academicuser> (where academicuser is the username of your AcademicID). Please see the Data Migration Guide for more details.

In order to see what groups your current username and others have available, use id -nG <username>. To see all members of a group, use getent group <groupname>. Please make sure to only share data with groups that do not have many unintended members! Especially, do NOT use a group like HPC_all! Some groups (especially “primary” groups like UMIN or GWDG) don’t show any members, but actually have a very large number of members. Those should almost never be used. If there is no fitting group available, please apply for one (as described in the next paragraph).

Applying for a group

If a dedicated project is not feasible, and there is no good common POSIX group you already have, it is possible to apply for one by contacting our support. Please provide a good reason why you need a POSIX group, a unique group name and a list of usernames you want to be members of the new group.

Using a hidden directory

This is more of a temporary measure, and less secure than other methods, but quick and easy to do. The data will be available for all users that know the path to it. You should send the full path only to people you want to give access.

Warning

This method will leave all files and directories under your chosen parent directory that have non-empty other permissions, especially those with predictable names, open to all users of the HPC cluster!

Make sure you understand the basics of the POSIX permission model before attempting it!

While the top-level home or project directories do not have other permissions set by default, many files and subdirectories below them likely do. Make sure to unset those first, for example:

chmod -R o= /mnt/vast-nhr/home/jdoe/u12345

Depending on the owning group, it may or may not be advisable to include g in the above command. Read this section to the end (especially the last info box below) for more context.

You have been warned!

Remember that in order to access a file or directory, one needs to be able to access all its parent directories as well. For directories, the read permission decides whether you can list its contents, the write permission decides if you can create or delete files in it, and the execute permission decides whether you can cd into it (or access files and subdirectories if their respective permissions allow for it). The trick is to make one of the parent directories executable only, but not readable.

Info

In this example, we will use the home directory, but you can apply the steps similarly to any other data store. Make sure to use the full canonical path (“real path”) to your directory to avoid confusion!

[nhr_ni_test] u12345@glogin4 ~ $ pwd
/user/jdoe/u12345
[nhr_ni_test] u12345@glogin4 ~ $ realpath .
/mnt/vast-nhr/home/jdoe/u12345

As you can see, the real path is different from the apparent path.

First, create a directory with a random name:

SHAREDIR=$(mktemp -p /mnt/vast-nhr/home/jdoe/u12345 -d share.XXXXXXXX)

This will create a directory with a random name in /mnt/vast-nhr/home/jdoe/u12345 and save the path in the variable SHAREDIR. You can now place the files you want to share in that directory. (Use tab-completion to avoid having to remember the random name or do e.g. cp some_file $SHAREDIR/)

Next, you need to set permissions to the directory:

chmod -R go+rX $SHAREDIR

And make sure the parent directory is not readable, only executable:

chmod go=x $SHAREDIR/..

(The above is equivalent to chmod go=x /mnt/vast-nhr/home/jdoe/u12345 in this example, but works regardless of where you chose to create the “hidden” directory.)

Info

We are changing group as well as other permissions here. This assumes your parent directory is owned by your user’s primary group, which is the case for most home directories by default. These primary groups are most likely very large groups (including everyone from your institution), so chances are high that some of the users you would like to share data with are members of the same primary group, but not all of them. Group permissions take precedence over other permissions, so just setting those permissions for others could exclude users that share your primary group.

If you have changed the owning group of your directory to your HPC_u_<academicuser> group or you placed the hidden directory under a project directory, please remove the g from the last two commands. For example: chmod o=x $SHAREDIR/..

In such cases, leave the group permissions untouched or feel free to set them however you prefer.

Your “hidden” directory is now ready; you can send the path to any other users you want to share the data with. To print the path of the shared directory, run:

echo $SHAREDIR

Users who know the path can cd into it and copy the files to their own directories. When the share is no longer needed, don’t forget to unset the execute permission on the parent directory to restrict access to its subdirectories and files again.

chmod go= /mnt/vast-nhr/home/jdoe/u12345

Using ACLs

ACLs (Access Control Lists) are a more advanced, but also more complex method of defining fine-grained permissions in addition to the traditional POSIX permissions. They work on most filesystems, but not all of them, and are not immediately visible and thus easier to forget or make mistakes with. ACLs should be preferred over the “hidden” directory method above when you want to share a directory long-term and with a smaller number of people. We still recommend to only use them when other methods (like a common project or POSIX group) are not available.

How to set and manage ACLs on your directories is described in our Data Migration Guide.
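
As a minimal illustration (the username and path are placeholders, and some filesystems may behave differently, see the guide above), granting one additional user read access to a directory could look like this:

# allow user u67890 to read and traverse the directory
setfacl -m u:u67890:rX /path/to/shared/directory
# inspect the resulting ACL
getfacl /path/to/shared/directory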

Sharing data publicly

Last but not least, here are some options to share data with people that are not users of the GWDG HPC systems.

Using Owncloud

Many AcademicIDs have access to GWDG's instance of Owncloud. It can be used to upload the files you want to share directly from the cluster. See this paragraph for details on how to upload to Owncloud using rclone. After you have done so, you can either create a public link that allows anyone to download a copy, or use the sharing feature of Owncloud to select other AcademicCloud users that will be able to see the files after logging into Owncloud themselves.

Using S3

In order to use S3, you have to apply for an S3-Bucket by writing an email to support@gwdg.de and asking for one that is accessible from the HPC system. You can then share the access key and secret key within your group to give everyone access. In this scenario, access to your data is done via HTTPS, and it is reachable not only from the HPC systems, but also from the Cloud and the Internet (if needed).

You can access your S3-Bucket from a compute node using https://s3.gwdg.de as an endpoint.

In order to work with your S3-Bucket, you could for instance use rclone:

module load rclone
# List content of your Bucket
rclone ls <config-name>:<bucket-name>/<prefix>
# Or sync the content of your $HOME with the bucket
rclone sync -i $HOME/some/folder <config-name>:<bucket-name>/<prefix>
# Or sync the content of your bucket with your $HOME
rclone sync -i <config-name>:<bucket-name>/<prefix> $HOME/some/folder

This requires a config file in ~/.config/rclone/rclone.conf with the following content:

[<config-name>]
type = s3
provider = Ceph
env_auth = false
access_key_id = <AccessKey>
secret_access_key = <SecretKey>
region =
endpoint = https://s3.gwdg.de
location_constraint =
acl =
server_side_encryption =
storage_class =
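
Once the config file is in place, you can for example upload a local directory into the bucket with rclone copy (the names in angle brackets are placeholders, as above):

rclone copy $HOME/results <config-name>:<bucket-name>/results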

Subsections of Data Sharing

Managing permissions

On this page we have curated general advice, tips and tricks, but also important caveats for managing permissions on your files and directories. All of these points are especially relevant when sharing or migrating data between different user accounts. At least a basic understanding of POSIX permissions is a prerequisite for anything discussed here. Please refer to our self-paced tutorial for beginners before reading this, if necessary.

Info
Recap:
  • Only the owner of a file or directory can change its group, permissions or ACLs.
  • If you have multiple usernames, only one of them can own a file/directory, and the operating system does not know your other usernames belong to the same person.
  • In order to access a given file, you need to also be able to access its parent directories, all the way up to /, the root of the filesystem (technically, you need to have execute permission on the directories)
  • Other permissions apply to any user that is not the owner or a member of the owning group
  • User permissions have precedence over group permissions, which in turn have precedence over other permissions. For example, if a file has r for other, but not for the group, members of the group can not read the file, but anyone else can
  • As a regular user, you are not able to change the ownership of existing files or directories. Only our admins can do that in emergencies. But you can make a copy of a file owned by another user, as long as it is readable for you, and the copy will be owned by you.
Warning

Many directories have both a logical path like /user/your_name/u12345 and a real path that points to the actual location on the filesystem.

Please always operate on the real paths, which are directories you can actually modify, unlike the symbolic links below /user or /projects, which cannot be modified by users.

You can find out the real path with the following command:

realpath /path/to/directory

Alternatively, as a quick way to transparently resolve logical paths, you can add a trailing / at the end of the path when using it in a command.

General advice

  • Use the -R option for chmod or chgrp to recursively change the permissions/group for a directory and all its files and subdirectories.

  • Always use a capital X (instead of a lower-case one) when adding permissions via recursive chmod operations! This makes sure you’re only making directories and files executable for the group or others that are already executable by the owner, while a lower-case x would unconditionally make all files executable. It is generally a very bad idea to mark files as executable that are not supposed to be! It would be confusing in the best case and a potential security risk and risk to your data in the worst. To avoid that, use e.g.

chmod -R g+rwX example_dir
  • Set the SGID-bit on group-writable directories. This will cause newly created subdirectories to be owned by the same group as the parent, instead of the primary group of the user that created it (which would almost never be useful).
chmod g+s example_dir
Warning

Do not use chmod -R to recursively set the SGID-bit on a directory! This would also set the SGID-bit on all files in the directory, which is a potential security risk and generally a bad idea.

  • To recursively set the SGID-bit on a directory and all its subdirectories, but not on files, use:
find example_dir -type d -exec chmod g+s {} \;
  • When you recursively change a directory tree with symlinks in it, especially symlinks to another location, use the -h flag to chgrp to change the group of the symlink itself, rather than the destination file/directory. By default, chgrp only works on the destination, not the symlink itself, which is often not what you intended.
chgrp -Rh <group> example_dir
  • Set the sticky bit on a directory if you want others to be able to create new files and subdirectories in it, but forbid deleting, moving or renaming files owned by someone else:
chmod +t example_dir
Note

Users are able to delete or move files if they have write permission on the parent directory, even if they do not have write permission on the file itself. The sticky bit prevents that, for anyone but the owner of the directory.

Advanced commands and tricks

If you have a very large number of files/directories and the commands documented on this page take a long time to complete, here are some tips to speed it up.

  1. Use the correct login node to run your commands. Accessing filesystems from a specific cluster island, while possible from login nodes dedicated to other islands, may be a lot slower than accessing them from the correct login nodes. See the Cluster Storage Map for an overview.

  2. When changing permissions or the owning group for a larger number of files/directories, use a terminal multiplexer like tmux or screen, to allow the process to continue running in the background while you do something else, log off overnight or in case your connection drops out.

  3. For large numbers of directories, set the SGID-bit with this more advanced command:

find <path> -type d \! -perm /g+s -print0 | xargs -0rn 200 chmod g+s
  4. For a large number of files, some of which may already belong to the correct group or have group r/w/x permissions, changing the group and setting permissions can be sped up by running:
find <path> \! -group <group> -print0 | xargs -0rn 200 chgrp <group>
find <path> \! -perm /g+rw -print0 | xargs -0rn 200 chmod g+rwX

Terminal multiplexer

Some commands, such as copying large amounts of data, take a long time to complete. If you do not want to or cannot keep the terminal session open, there is the option to use either tmux or screen. Both are terminal multiplexers with different capabilities.

The basic idea is to log into a server via ssh and start a multiplexer session there. Starting the multiplexer on your local machine and running ssh inside it does not help, because the transfer would still stop if you close the session or lose the connection. Having a multiplexer session on the server allows you to log out and even turn off your computer. The transfer continues because it is running on the server in the background without the need for an open connection.

In order to reconnect, make sure you are on the same server/login node that you started the session on. For more information, see the paragraph Why can’t I resume my session?.

tmux

A new tmux session can be started with the command

tmux new -s my_session

Here the option -s my_session is a name so you can reconnect more easily later.

Info

If you get the message sessions should be nested with care, unset $TMUX to force, you are already in a tmux session.

In this session you can now start commands that run for longer. Leaving this session is called detaching and is done by pressing CTRL + b, releasing, and then pressing d. Now you can close the ssh connection or the terminal and the tmux session will keep running in the background.

Reconnecting is done by this command

tmux attach -t my_session

A list of currently running sessions can be found with the command tmux ls. Ending a session can be done by executing the command exit in a tmux session or by using this command

tmux kill-session -t my_session

from outside of the session.

screen

Screen operates similarly to tmux when it comes to handling sessions, but the commands are a little bit different. To start a screen session named “my_session”, for example, run:

screen -S my_session

Here you can run the commands you want. Detaching from this session is done by pressing CTRL + a, releasing and then pressing d.

A list of running screen sessions can be shown by running screen -ls. Reattaching can be done with the command

screen -r my_session

If there is only one session running, just screen -r also works, otherwise it will also show you the list of sessions.

Similarly, exiting a session can be done by using the exit command in the session or pressing CTRL + a, followed by k.

Why can’t I resume my session?

In order to reconnect to a session started on a specific server, you have to be on that same server/login node, because sessions run locally on each node and, unlike your directories and files, are not shared between nodes. Therefore, you need to remember the server’s name.

If you are on the wrong server, or your session has ended for some reason, you’ll get messages like no sessions, can't find session my_session or There is no screen to be resumed.

As an example, say you connected to a random login node using ssh glogin.hpc.gwdg.de (technically a so-called DNS round-robin), landed on glogin6, started a tmux session and later got disconnected due to a timeout or network interruption. When you connect to the generic DNS name “glogin.hpc.gwdg.de” again, you will likely end up on a different login node than before, e.g. glogin4, and will be unable to resume your session. Refer to the table on the login nodes and examples page for an overview.

In order to avoid this issue, remember the hostname of the server you started the session on. For example, if your command prompt looks like this:

[u11324@glogin8 ~]$

you’re on glogin8. You can then directly ssh to the same node by running

ssh glogin8.hpc.gwdg.de

and should have no problems resuming your session.

If that still does not work, the login node may have been rebooted in the meantime; check your emails for an announcement or type the command uptime. This does not happen often, but sometimes it cannot be avoided.

Software Stacks

The software stacks listed in the table below provide ready-to-use software beyond what is included in the base OS of the login and compute nodes. The default software stack and which stacks will work depend on the cluster island. All listed stacks are module-based systems. See Module Basics for more information on how to use modules.

Software Stack | Short Name | State | Lifespan | Module Software
GWDG Modules | gwdg-lmod | 🟢 active, default | | Lmod
NHR Modules | nhr-lmod | 🟡 available, being phased out | EOL 2025/Q2 | Lmod
SCC Modules | scc-lmod | 🟡 available, being phased out | EOL 2025/Q2 | Lmod
HLRN Modules | hlrn-tmod | 🔴 retired, partially broken | EOL June 2024 | Environment Modules (Tmod)

All our recent software stacks have modules compiled for several different CPU, GPU, and fabric architectures in use in the GWDG clusters and automatically pick a version optimized or at least compatible for the type of node where the module is loaded. The default software stack for each island and the state of each stack on that island are listed in the table below.

Sub-cluster | Default | gwdg-lmod | nhr-lmod | scc-lmod | hlrn-tmod
Emmy Phase 3 | gwdg-lmod | untested, partially broken
Emmy Phase 2 | gwdg-lmod | partially broken
Emmy Phase 1 | gwdg-lmod | partially broken
Grete Phase 3 | gwdg-lmod | untested, partially broken
Grete Phase 2 | gwdg-lmod | partially broken
Grete Phase 1 | gwdg-lmod | partially broken
SCC Legacy (CPU) | gwdg-lmod | untested | untested, partially broken
SCC Legacy (GPU) | gwdg-lmod | untested, partially broken
CIDBN | gwdg-lmod | untested, partially broken
FG | gwdg-lmod | untested, partially broken
SOE | gwdg-lmod | untested, partially broken
Info

See CPU Partitions and GPU Partitions for the available partitions in each island for each kind of account.

See Logging In for the best login nodes for each island (other login nodes will often work, but may have access to different storage systems and their hardware will be less of a match).

See Cluster Storage Map for the storage systems accessible from each island and their relative performance characteristics.

See Types of User Accounts if you are unsure what kind of account you have.

Tip

GWDG Modules is built using the same system that was used to build NHR Modules. Thus, documentation referring to NHR Modules should work with GWDG Modules with little to no modification.

Tip

Some of the currently available software package documentation refers to the HLRN Modules (hlrn-tmod) software stack (see that page for the packages), but most details still apply to their equivalent modules in the other software stacks.


Select a Preferred Stack

Change Default

To set the default, do either of the following before /etc/profile is sourced (or, source it after you set the environmental variable or file):

  1. Set the environmental variable PREFERRED_SOFTWARE_STACK to the short name of the software stack you want
  2. Create the file ~/.unified_hpc_profile in your HOME directory and write the short name of your preferred software stack to the file (e.g. echo gwdg-lmod > ~/.unified_hpc_profile).

If both are set, the value in PREFERRED_SOFTWARE_STACK takes priority.

Note that if the preferred software stack is not available for some reason, /etc/profile will fall back to whatever is available.

Changing Stack Mid-Session

You can also at any point change the software stack you are using in your current shell session by setting the PREFERRED_SOFTWARE_STACK environmental variable to the desired value and then sourcing the unified shell profile script by running the following shell commands (substitute in your desired software stack name):

Load the preferred software stack in bash/zsh:

export PREFERRED_SOFTWARE_STACK=DESIRED_STACK
source /sw/etc/profile/profile.sh

Or in csh/tcsh:

setenv PREFERRED_SOFTWARE_STACK DESIRED_STACK
source /sw/etc/profile/profile.csh

Subsections of Software Stacks

Module Basics

To make the provided software more manageable, it is provided as modules in a module system. In many ways, this works like a package version control system: multiple packages with different versions can co-exist in non-default places, and users and scripts can select the exact ones they want to use at any given time. “Loading a module” can be thought of as synonymous with “making software accessible”.

All our recent software stacks use the Lmod module system, which uses Lua-based module files, which is why it is the only system documented on this page. It has some compatibility support for using modules written for the older Environment Modules (Tmod) without modification, making it easier for users familiar with that system to write their own modules (see Using Your Own Module Files for more information).

Getting Started

The most important commands to know are

module avail
module avail SEARCHTEXT
module help MODULE
module load MODULE/VERSION
module unload MODULE
module list

All modules have a version, specified as /VERSION. If the /VERSION part is omitted, the default version is used. So, one could do

module load gcc

to get the default GCC version, or

module load gcc/9.3.0

to get GCC 9.3.0.

Warning

The default version can be affected by other modules that have already been loaded. The primary example is loading the module for one MPI implementation and/or compiler will make the default version for modules to be selected only among the versions that use that specific MPI implementation and/or compiler.

Loading a module manipulates the shell environment to make the software visible for shell commands. Note, these changes of environmental variables are fully reversible and are undone if a module is unloaded. Loading a module may prepend the path of the directory containing the package’s programs to environmental variable PATH. This makes the package’s executables visible for the shell or script. The modules do similar things for other environmental variables like LIBRARY_PATH, MANPATH, etc. For example, if you want to work with GCC version 9, load the appropriate module and check its version.

module load gcc/9.3.0
gcc --version

The base operating system’s GCC is invisible to your shell as long as the module gcc is loaded, but becomes available again, if the module gcc is unloaded.

A list of all available modules arranged by compiler/MPI/cuda can be printed with the command module avail. A list of loaded modules is shown with the command module list. To learn the available commands and how to use them, you can run either of the following

module help
man module

When you get a user account, your environment is provided by the system profile and should be set up correctly to use modules. If you change your shell or modify your environment in a way that makes modules fail, contact your consultant or start a support ticket for help. Additional documentation can be found in the Lmod documentation.

Module commands

Module commands can be issued

  • On the command line. This modifies the current shell environment, which makes the package’s binaries visible and defines environment variables to support compiling and linking. These modification are reversible and undone when unloading a module.
  • In Slurm batch scripts. This can be used to prepare the environment for jobs.

A complete list of all module commands can be seen by running module help. Below, a few examples are introduced.

Warning

It is also possible to put module commands in your ~/.bashrc (or equivalent for other shells). However, this may result in various conflicts and errors, particularly if a module is available on one node but not another or has to be loaded a different way. So we strongly recommend against doing this.

Available Modules Overview

module avail

This command shows all modules available to be loaded at the present time. Notice the version numbers and that some are marked as default. The modules are arranged by compiler/MPI/cuda version.

Warning

Modules that depend on choosing a specific compiler, MPI implementation, or other feature may not be visible unless the respective dependency modules have been loaded first. For example, modules in the GWDG Modules (gwdg-lmod) and NHR Modules (nhr-lmod) software stack compiled with the Intel OneAPI Compilers are not visible until after the intel-oneapi-compilers module has been loaded first.

You can also search for modules whose names contain STRING with the command:

module avail STRING

You can also search for modules containing STRING in their so-called whatis information by:

module spider STRING

Load and Unload Modules

module load MODULE
module load MODULE/VERSION
module unload MODULE
module unload MODULE/VERSION
module switch MODULE_1/VERSION_1 MODULE_2/VERSION_2
module purge

The load and unload commands load and unload the specified module exactly as expected, while the module switch command switches one module for another (unload the first and load the second). The purge command unloads ALL modules. If you don’t include the /VERSION part, the commands operate on the default version. Loading a module manipulates the shell environment to make software visible for shell commands. Note, these changes of environmental variables are fully reversible and are undone if a module is unloaded. For example, loading a module could prepend the path to the directory containing its executable programs to the environmental variable PATH. To see all currently loaded modules, use

module list
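
For example, to move between two of the provided GCC versions (the version numbers are taken from the current module list; check module avail for what is actually installed):

module load gcc/11.5.0
module switch gcc/11.5.0 gcc/13.2.0
module list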

Getting module information

To investigate and learn more about a particular module, use the commands

module help MODULE
module help MODULE/VERSION
module whatis MODULE
module whatis MODULE/VERSION
module show MODULE
module show MODULE/VERSION

If you don’t include the /VERSION part, the commands operate on the default version. The help command shows the help string for the module. The whatis command shows the whatis information for the module. The show command shows all environment modifications the module would make if loaded.

Compiling and linking installed libraries using modules

Many software packages depend on installed libraries like the MPI implementation, FFTW 3, BLAS and LAPACK, HDF5, NetCDF, etc. On a normal machine, those libraries and headers would reside at well defined places on the filesystem, where they are found by the compiler and linker and also by the loader during runtime. Because the module system can provide several versions of those libraries, they must be installed in non-default places. So the compiler, linker and loader need help to find them, and especially the correct version. Ideally, loading a module would be all that is needed to provide this help. Unfortunately, this is not the case, for several reasons.

Compilers and linkers use some well defined environmental variables to serve or get information on libraries. Typical examples are PATH, LD_RUNPATH, LD_LIBRARY_PATH, LIBRARY_PATH or PKG_CONFIG_PATH. Many modules modify them so that their libraries and programs can be found. To see which environmental variables are set by a module, run

module show MODULE/VERSION

However, for building complex software, build tools like autotools or cmake are used, which come with their own rules for how programs and libraries are found. For example, the netcdf library provides the helper tools nc-config and nf-config to report the paths to the netcdf headers and libraries. Most serious build tools can use such helpers, and the netcdf module just needs to put them on the PATH. cmake is special in that it searches for libraries itself instead of requesting this information in a well defined way, which does not fit well with the philosophy of modules. Hence, there may be cases where you must consult the documentation of the software you are trying to build and/or its build system to get it to use software from the modules.
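
As an illustration, a small NetCDF program can often be compiled by letting nc-config report the required flags. The module names below are assumptions; use module avail or module spider to find the exact names:

module load gcc openmpi netcdf-c
gcc $(nc-config --cflags) -o my_prog my_prog.c $(nc-config --libs)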

Another issue may be that successfully linked libraries are not found when running the executable. One may be tempted to load the related module again, but in most cases this does not help, as many modules do not have any effect at runtime. The solution here is to avoid overlinking and to burn the path to the libraries into the executable (e.g. via an rpath). Read more on linking HDF5 and NetCDF libraries with the help of the related modules.
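
A minimal sketch of burning a library path into an executable via an rpath (the path is a placeholder; the real one can be taken from module show or the corresponding *-config tool):

gcc -o my_prog my_prog.o -L/path/to/netcdf/lib -lnetcdf -Wl,-rpath,/path/to/netcdf/lib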

Using Your Own Module Files

It is possible to write your own module files (say, to include your own software or manage environmental variables in an easily reversed way) and include them in the module tool’s search path so they can be used just like the provided modules. The module command looks through the colon-separated directories in the MODULEPATH environmental variable in order. The order of directories in MODULEPATH determines the order in which they are printed by module avail and can determine which version of a module is the default if nothing else chooses the default (the tie breaker is the first version found), so appending or prepending a directory to MODULEPATH can lead to different results. If you have your modules in directory DIRECTORY, you can prepend it to the search path by

module use DIRECTORY

or append it with the -a option like

module use -a DIRECTORY

You can remove the directory with the unuse command like

module unuse DIRECTORY

For actually making the modules, you must write the modulefiles themselves in one of the languages supported by the module system used by the software stack and use its API. The module systems and their supported languages for each software stack are given in the table below, along with links to the documentation for the module systems. It is generally easier to use existing module files as templates.

Software Stack | Module System | Modules in Lua | Modules in TCL
GWDG Modules (gwdg-lmod) | Lmod | yes | yes (mostly compatible with Environment Modules (Tmod))
NHR Modules (nhr-lmod) | Lmod | yes | yes (mostly compatible with Environment Modules (Tmod))
SCC Modules (scc-lmod) | Lmod | yes | yes (mostly compatible with Environment Modules (Tmod))
HLRN Modules (hlrn-tmod) | Environment Modules (Tmod) | | yes
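
As a minimal sketch (all names and paths are hypothetical), a Lua modulefile for the Lmod-based stacks could be created and made visible like this:

mkdir -p $HOME/my_modules/mytool
cat > $HOME/my_modules/mytool/1.0.lua <<'EOF'
-- minimal Lmod modulefile; adjust the prefix to your installation
local prefix = "/path/to/mytool/1.0"
help([[My locally installed tool, version 1.0]])
whatis("Name: mytool")
prepend_path("PATH", pathJoin(prefix, "bin"))
prepend_path("LD_LIBRARY_PATH", pathJoin(prefix, "lib"))
EOF

module use $HOME/my_modules
module load mytool/1.0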

Software Licensing

Some installed software packages have specific licensing restrictions. Users might have to contact our support to clarify the conditions or request access. Below we have listed the current licensing situation for these software packages.

Many of the software modules for installed software will print out a note that you may need to contact support if you load them. You can ignore these if you already have access!

If the software you need is not yet available on the system, please contact us!

Ansys

Our ansys license is restricted to academic usage only. Please check the Ansys page for details.

Comsol

We install COMSOL and provide the licensing server. You need to purchase your own licenses if you want to use it on the system and contact us.

Gaussian

We have a campus license for Gaussian and it is in theory available to all users. Please send us an email with the 3 licensing statements found on the Gaussian page in order to satisfy the licensing restrictions.

Matlab

SCC users from the Max Planck Society have full access to Matlab and SCC users from the University of Göttingen have 5 concurrent licenses available that they need to share. New SCC users do not need to request access anymore, it is automatic. Anyone else will have to bring their own licenses.

NAMD

Please write us a statement that you agree to the NAMD license and will not use NAMD for commercial purposes and we will grant you access.

VASP

Before you can request access, you need to be added by the holder of your research group’s VASP license to the VASP licensing portal. Please contact us afterward from the same email address that was added to the VASP licensing portal. The email address will be verified by VASP before we can grant you access.

Available modules

On each cluster, one or more stacks of prebuilt software are provided for development, numerics, visualization, and various research disciplines. The software stacks, described in more detail on this page, are bundles of individually loadable modules providing various pre-built software packages. As modules, multiple versions of the same software package can be provided in a way that avoids conflicts, which allows things like:

  • user A could use FFTW 3.3.8 compiled by GCC
  • user B could use FFTW 3.3.9 compiled by GCC in one job and 3.3.8 compiled by GCC in another job
  • user C could load FFTW 3.3.8 compiled by Intel compilers
  • user D could use FFTW 3.3.8 compiled by GCC with OpenMPI
  • user E could use FFTW 3.3.8 compiled by GCC with Intel MPI

Users can load, unload, and search for modules using the module command. See Module Basics for how to use modules.

The default software stack on the entire HPC system are the GWDG Modules.

When Software Isn’t Provided

If a software package is not available in the provided software stack or isn’t built in exactly the way you need (different version, different options/compiler flags, etc), there are two possible paths forward:

  1. Ask if the software package can be added to one of the software stacks. The easiest way to do this is to write a support ticket. There is no guarantee that it will be added, and even if it is added it might not be available until the next module revision which could take months. But, if a software package is needed by many people and isn’t provided, this is how we find out that it should be provided.
  2. Build it yourself on the cluster, possibly using packages in the modules as dependencies, compilers, environment managers, etc.

To help build software yourself, modules are provided for

  1. conda/conda-forge, which has a vast collection of pre-built software with a focus on numeric and scientific applications. Run module avail miniforge3 to see the available versions.
  2. Spack, which can build a decent set of packages from source allowing fine tuned control of compilation features and optimizing for specific hardware architectures. See the Spack page for more information.
Last modified: 2025-06-09 08:24:50

Subsections of Available modules

Apptainer (formerly Singularity)

Description

Apptainer (formerly Singularity) is a free, cross-platform and open-source computer program that performs operating-system-level virtualization also known as containerization. One of the main uses of Apptainer is to bring containers and reproducibility to scientific computing and the high-performance computing (HPC) world.

The need for reproducibility requires the ability to use containers to move applications from system to system.

Using Apptainer containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms.

To learn more about Apptainer itself please consult the Apptainer documentation.

Module

Load the modulefile

$ module load apptainer

This provides access to the apptainer executable, which can be used to download, build and run containers.

Building Apptainer images and running them as containers

On NHR you can build Apptainer images directly on the login nodes.

For SCC you should use a compute node, for example by starting an interactive job:

$ srun --partition int --pty bash
$ module load apptainer

If you have written a container definition foo.def you can create an Apptainer image foo.sif (sif meaning Singularity Image File) in the following way:

$ module load apptainer
$ apptainer build foo.sif foo.def

For writing container definitions see the official documentation

Example Jobscripts

Here are example jobs running a local Apptainer image (.sif):

Running an Apptainer container (CPU job, partition medium40):
#!/bin/bash
#SBATCH -p medium40
#SBATCH -N 1
#SBATCH -t 60:00
 
module load apptainer
 
apptainer run --bind /mnt,/local,/user,/projects,$HOME,$WORK,$TMPDIR,$PROJECT  $HOME/foo.sif

Running an Apptainer container (GPU job, partition grete:shared):
#!/bin/bash
#SBATCH -p grete:shared
#SBATCH -N 1
#SBATCH -G 1
#SBATCH -t 60:00
 
module load cuda
module load apptainer
 
apptainer run --nv --bind /mnt,/local,/user,/projects,$HOME,$WORK,$TMPDIR,$PROJECT  $HOME/foo.sif

Running an Apptainer container (CPU job, partition medium):
#!/bin/bash
#SBATCH -p medium
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 1:00:00
 
module load apptainer
 
apptainer run --bind /mnt,/local,/user,/projects,/home,/scratch,/scratch-scc,$HOME  $HOME/foo.sif
Note

If you have installed or loaded extra software before running the container it can conflict with the containerized software. This can for example happen if $PERL5LIB is set after installing your own Perl modules.

In that case you can try to add the --cleanenv option to the apptainer command line or in extreme cases run the apptainer without bind-mounting your home directory (--no-home) with the current working directory (cd [...]) set to a clean directory.
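
A minimal sketch of these two options, using the foo.sif image from the earlier examples:

# try a clean environment first
apptainer run --cleanenv $HOME/foo.sif
# in extreme cases, additionally skip binding your home directory and start from a clean working directory
cd $(mktemp -d)
apptainer run --cleanenv --no-home $HOME/foo.sif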

An even better permanent solution would be to make sure that your software has to be explicitly loaded, e.g. clean up ~/.bashrc and remove any conda initialization routines, only set $PERL5LIB explicitly via a script you can source, and only install python packages in virtual environments. (This is strongly recommended in general even if you don’t work with apptainer.)


Examples

Several examples of Apptainer use cases will be shown below.

Jupyter with Apptainer

As an advanced example, you can pull and deploy the Apptainer image containing Jupyter.

  1. Create a New Directory
    Create a new folder in your $HOME directory and navigate to this directory.

  2. Pull the Container Image
    Pull a container image using public registries such as DockerHub. Here we will use a public image from quay.io, quay.io/jupyter/minimal-notebook. For a quicker option, consider building the container locally or loading it from DockerHub.

    To pull the image, use the following command:

    apptainer pull jupyter.sif docker://quay.io/jupyter/minimal-notebook

    Don’t forget to run module load apptainer

  3. Submit the Job
    Once the jupyter.sif image is ready, you can submit the corresponding job to interact with the container. Start an interactive job (for more information, see the documentation on Interactive Jobs), start the container, and then a Jupyter instance inside of it. Take note of the node your interactive job has been allocated (or use the command hostname), we will need this in the next step.

    srun -p jupyter --pty -n 1 -c 8 bash
    srun: job 10592890 queued and waiting for resources
    srun: job 10592890 has been allocated resources
    u12345@ggpu02 ~ $ <- here you can see the allocated node
    module load apptainer

    Now, for a Jupyter container (here we are using one of the containers also used for Jupyter-HPC, but you can point to your own container):

    apptainer exec --bind /mnt,/sw,/user,/user_datastore_map,/projects /sw/viz/jupyterhub-nhr/jupyter-containers/jupyter.sif jupyter notebook --no-browser

    For an RStudio container:

    apptainer exec --bind /mnt,/sw,/user,/user_datastore_map,/projects /sw/viz/jupyterhub-nhr/jupyter-containers/rstudio.sif jupyter-server

    (the bind paths might not be necessary or might be different depending on what storage you need access to).

    Both of these will produce a long output, but the relevant part is a line that includes the address of the spawned Jupyter or Rstudio server process. It will look something like the following (we will need it in the next point, so make a note of it!):

    http://localhost:8888/rstudio?token=c85db148d85777f7d0c2e2df876351ff19872c593027b4a2

    The port used will usually be 8888, make a note of it if it’s a different value.

  4. Accessing the Jupyter Notebook In order to access the notebook you need to port-forward the port of the Jupyter server process from the compute node to your local workstation. Open another shell on your local workstation and run the following SSH command:

    ssh -NL 8888:localhost:8888 -o ServerAliveInterval=60 -i YOUR_PRIVATE_KEY -J LOG_IN_NODE -l YOUR_HPC_USER HOSTNAME

    Replace HOSTNAME with the value returned by hostname earlier/the name of the allocated node, YOUR_PRIVATE_KEY with the path to your private ssh key used to access the HPC, YOUR_HPC_USER with your username on the HPC, and the domain name of the log-in node you regularly use in place of LOG_IN_NODE. An example of this might look like:

    ssh -NL 8888:localhost:8888 -o ServerAliveInterval=60 -i ~/.ssh/id-rsa -J glogin-p2.hpc.gwdg.de -l u12345 ggpu02

    While your job is running, you can now access the spawned Jupyter server process via http://localhost:8888/ in a browser on your local computer (or paste the full address we obtained in the previous step).

NVIDIA GPU Access Within the Container

See also GPU Usage

Many popular Python applications using GPUs will just pull in the CUDA/cuDNN runtimes as a dependency; nothing special is required when building them:

Bootstrap: docker
From: condaforge/miniforge3

%environment
        export TINI_SUBREAPER=true

%post
    conda install --quiet --yes notebook jupyterlab jupyterhub ipyparallel
    pip install torch torchvision torchaudio

Save this as pytorch.def then build the image via

module load apptainer
apptainer build pytorch.sif pytorch.def

You can test the image on one of the GPU partitions. The nvidia-smi command should display some information about the allocated GPU or GPU slice. The GPU drivers are bound into the container because of the --nv option.

Running the Apptainer container (partition grete:interactive):
srun --pty -p grete:interactive -G 1g.10gb bash -c \
    "module load apptainer; apptainer shell --nv pytorch.sif"

nvidia-smi

Note: For a real compute job you should use a non-interactive partition and specify what type of GPU you want. The interactive partition is just used here for quick testing.

Running the Apptainer container (partition jupyter):
srun --pty -p jupyter -G 1 bash -c \
    "module load apptainer; apptainer shell --nv pytorch.sif"

nvidia-smi

Note: For a real compute job you should use the scc-gpu partition and specify what type of GPU you want. The interactive jupyter partition is just used here for quick testing.

You can also verify that cuda works via PyTorch by running the following:

python

import torch
torch.cuda.is_available()
torch.cuda.device_count()

This should return True and a number greater than 0.
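
The same check can also be run non-interactively, mirroring the srun pattern above:

srun -p grete:interactive -G 1g.10gb bash -c \
    "module load apptainer; apptainer exec --nv pytorch.sif python -c 'import torch; print(torch.cuda.is_available(), torch.cuda.device_count())'"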

You can also use this container with our JupyterHPC service, since we installed all the necessary additional packages with our recipe.

Hints for Advanced Users

If you want to build your own CUDA software from source, you can for example start with a container that already has the CUDA and cuDNN runtimes installed on the system level:

Bootstrap: docker
From: nvidia/cuda:12.8.0-cudnn-runtime-ubuntu24.04

%post
    [...]

When running the container don’t forget the --nv option.

It might also be possible to re-use the host's CUDA runtime from the module system. The necessary environment manipulations should be similar to the following:

export PATH=$PATH:$CUDA_MODULE_INSTALL_PREFIX/bin
export LD_LIBRARY_PATH=$CUDA_MODULE_INSTALL_PREFIX/lib64:$CUDNN_MODULE_INSTALL_PREFIX/lib:$LD_LIBRARY_PATH
export CUDA_PATH=$CUDA_MODULE_INSTALL_PREFIX
export CUDA_ROOT=$CUDA_MODULE_INSTALL_PREFIX

Since this method requires --bind mounting software from the host system into the container it will stop the container from working on any other systems and is not recommended.


Tools in Containers

Multiple tools, libraries and programs are now provided as apptainer containers in /sw/container under various categories, such as /sw/container/bioinformatics. These containers allow us to provide tools that might otherwise prove difficult to install with our main package handling framework Spack. They also provide good examples for users who might want to set up their own containers.

You can get an overview by executing ls -l /sw/container/bioinformatics.

To use these on the SCC system you can use the following in your job scripts, e.g. to use the latest installed ipyrad version with 16 CPU cores:

module load apptainer

apptainer run --bind /mnt,/local,/user,/projects,/home,/scratch,/scratch-scc,$HOME \
    /sw/container/bioinformatics/ipyrad-latest.sif \
    ipyrad -p params-cool-animal.txt -s 1234 -c 16

You can do this in an interactive session or a Slurm batch script. We do not recommend executing heavy containers on login nodes.

Currently we offer containers for:

  • Bioinformatics: agat, cutadapt, deeptools, homer, ipyrad, macs3, meme, minimap2, multiqc, Trinity, and umi_tools.
  • Quantum Simulators: see the corresponding page
  • JupyterHub containers, including RStudio. These can also be accessed through the JupyterHub service, or run directly like any other Apptainer container as explained in this page.

In the future these containers will also appear as pseudo-modules, for discoverability and visibility purposes.

Tip

The \ character at the end of the line (make sure there are no following spaces or tabs) signifies that the command continues in the next line. This can be used to visually separate the apptainer specific part of the command (apptainer … .sif) from the application specific part (ipyrad … 16) for better readability.


Slurm in Containers

Usually, you would submit a Slurm job and start your Apptainer container within it, as we have seen in the previous examples. Sometimes, however, it's the other way around, and you want to access Slurm (get cluster information or submit jobs) from within the container, the most notable example probably being Jupyter containers which you access interactively. Getting access to Slurm and its commands from within the container is not very difficult and only requires a few steps.

Firstly, your container should be based on Enterprise Linux (e.g. Rocky Linux, AlmaLinux, CentOS, Red Hat, Oracle Linux) in version 8 or 9, as those two are currently the only operating systems for which we provide Slurm. Running Slurm commands in a (for example) Debian-based container following the steps below might work to an extent, but it is quite possible that required library versions differ between the two distributions. Therefore, such setups are not supported.

Next, during the container build phase, you’ll need to add the slurm user and its group, with a UID and GID of 450, e.g.

addgroup --system --gid 450 slurm
adduser --system --uid 450 --gid 450 --home /var/lib/slurm slurm

If you plan to use Job submission commands like srun, salloc or sbatch, you’ll also need to install the Lua library, as those commands contain a lua-based preprocessor, e.g.

dnf install -y lua-libs

Lastly, to run Slurm within the container, you’ll need to add the following bindings, which contain the Slurm binaries and the munge socket for authentication:

--bind /var/run/munge,/run/munge,/usr/lib64/libmunge.so.2,/usr/lib64/libmunge.so.2.0.0,/usr/local/slurm,/opt/slurm

With that, you’ll have access to Slurm from within the container; the binaries are located at /usr/local/slurm/current/install/bin. You can, for example, add this directory to the PATH variable in your container recipe’s %environment section:

%environment
    export PATH="/usr/local/slurm/current/install/bin:${PATH}"
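
Putting the pieces together, a minimal container definition could look like the following sketch. The base image (rockylinux:9) and the final image name are assumptions; on an Enterprise Linux base, groupadd/useradd are the equivalents of the addgroup/adduser example above.

Bootstrap: docker
From: rockylinux:9

%post
    # Slurm user and group with the UID/GID expected by the cluster (see above)
    groupadd --system --gid 450 slurm
    useradd --system --uid 450 --gid 450 --home-dir /var/lib/slurm slurm
    # Lua is needed for the srun/salloc/sbatch preprocessor
    dnf install -y lua-libs

%environment
    export PATH="/usr/local/slurm/current/install/bin:${PATH}"

At runtime, add the Slurm-related bind mounts listed above, for example (my-container.sif is a placeholder name):

apptainer exec --bind /var/run/munge,/run/munge,/usr/lib64/libmunge.so.2,/usr/lib64/libmunge.so.2.0.0,/usr/local/slurm,/opt/slurm my-container.sif sinfo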

GWDG Modules (gwdg-lmod)

The default stack on the whole Unified HPC system in Göttingen. This stack uses Lmod as its module system. For the purposes of setting the desired software stack (see Software Stacks), its short name is gwdg-lmod. You can learn more about how to use the module system at Module Basics.

To see the available software, run

module avail

The modules for this stack are built for several combinations of CPU architecture and connection fabric to support the various kinds of nodes in the cluster. The right module for the node is automatically selected during module load.

Getting Started with gwdg-lmod

This software stack is enabled by default. Just login to glogin-p2.hpc.gwdg.de, glogin-p3.hpc.gwdg.de or glogin-gpu.hpc.gwdg.de and use the module avail, module spider and module load commands.

Below we have provided some example scripts that load the gromacs module and run a simple test case. You can copy the example script and adjust it to the modules you would like to use.

KISSKI and REACT users can take the Grete example and use --partition kisski or --partition react instead.

SCC users should use --partition scc-cpu (CPU only) or --partition scc-gpu (CPU+GPU) instead. If the microarchitecture on scc-cpu is important it should be selected with --constraint cascadelake (Emmy P2) or --constraint sapphirerapids (Emmy P3). The type and number of GPUs on the scc-gpu partition can be selected with the --gpus option instead.

For more information on how to specify the right slurm partition and hardware constraints please check out Slurm and Compute Partitions.
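
As a rough sketch, an SCC user adapting the batch script examples below would mainly swap the partition-related lines; the constraint, GPU type and count here are only placeholders to illustrate the syntax:

# CPU-only job on the SCC
#SBATCH --partition scc-cpu
#SBATCH --constraint sapphirerapids

# GPU job on the SCC
#SBATCH --partition scc-gpu
#SBATCH --gpus A100:1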

Note

The Cascade Lake nodes for SCC are currently still in the medium partition, so please use --partition medium instead of --partition scc-cpu --constraint cascadelake. This will be changed in a future downtime when the medium partition will be removed.

Tutorial: Gromacs with gwdg-lmod

The appropriate login nodes for Emmy Phase 2 are glogin-p2.hpc.gwdg.de.

#!/bin/bash
#SBATCH --job-name="Emmy-P2-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 96
#SBATCH --partition standard96
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gcc/14.2.0
module load openmpi/4.1.7
module load gromacs/2024.3

export OMP_NUM_THREADS=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP.tpr \
	-nsteps 1000 -dlb yes -v 

The appropriate login nodes for Emmy Phase 3 are glogin-p3.hpc.gwdg.de.

#!/bin/bash
#SBATCH --job-name="Emmy-P3-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 96
#SBATCH --partition medium96s
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gcc/14.2.0
module load openmpi/4.1.7
module load gromacs/2024.3

export OMP_NUM_THREADS=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP.tpr \
	-nsteps 1000 -dlb yes -v 

The appropriate login nodes for Grete (A100 nodes) are glogin-gpu.hpc.gwdg.de.

#!/bin/bash
#SBATCH --job-name="Grete-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 8
#SBATCH --gpus A100:4
#SBATCH --partition grete
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gcc/13.2.0
module load openmpi/5.0.7
module load gromacs/2024.3


# OpenMP Threads * MPI Ranks = CPU Cores
export OMP_NUM_THREADS=8
export GMX_ENABLE_DIRECT_GPU_COMM=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP-h.tpr \
	-nsteps 1000 -v -pme gpu -update gpu -bonded gpu -npme 1

The appropriate login nodes for the Grete H100 nodes are glogin-gpu.hpc.gwdg.de.

Note

The microarchitecture on the login node (AMD Rome) does not match the microarchitecture on the compute nodes (Intel Sapphire Rapids). In this case you should not compile your code on the login node, but use an interactive slurm job on the grete-h100 or grete-h100:shared partitions.

#!/bin/bash
#SBATCH --job-name="Grete-H100-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 8
#SBATCH --gpus H100:4
#SBATCH --partition grete-h100
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gcc/13.2.0
module load openmpi/5.0.7
module load gromacs/2024.3

# OpenMP Threads * MPI Ranks = CPU Cores
export OMP_NUM_THREADS=12
export GMX_ENABLE_DIRECT_GPU_COMM=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP-h.tpr \
	-nsteps 1000 -v -pme gpu -update gpu -bonded gpu -npme 1

Hierarchical Module System

The module system has a Core - Compiler - MPI hierarchy. If you want to compile your own software, please load the appropriate compiler first and then the appropriate MPI module. This will make the modules that were compiled using this combination visible: if you run module avail you can see the additional modules at the top above the Core modules.

In previous revisions many more modules were visible in the Core group. To see a similar selection in the current software revision it should be enough to execute module load gcc openmpi first to load the default versions of the GCC and Open MPI modules.

If you want to figure out how to load a particular module that is not currently visible with module avail please use the module spider command.
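
For example, to find out how to load one of the modules used in the examples below:

module spider gromacs/2024.3
# the output lists which modules have to be loaded first, e.g.:
module load gcc/14.2.0 openmpi/4.1.7
module load gromacs/2024.3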

Supported Compiler - MPI Combinations for Release 25.04

Supported Combinations

CUDA 12 is not fully compatible with GCC 14 - this compiler is not available on Grete.

module load gcc/14.2.0
module load openmpi/4.1.7
module avail

Grete uses the older GCC 13 compiler to be compatible with CUDA.

module load gcc/13.2.0
module load openmpi/5.0.7
module avail

Do not use the generic compilers mpicc, mpicxx, mpifc, mpigcc, mpigxx, mpif77, and mpif90! The Intel MPI compilers are mpiicx, mpiicpx, and mpiifx for C, C++, and Fortran respectively. The classic compilers mpiicc, mpiicpc and mpiifort were removed by Intel and are no longer available. It might be useful to set export SLURM_CPU_BIND=none when using Intel MPI.

module load intel-oneapi-compilers/2025.0.0
module load intel-oneapi-mpi/2021.14.0
module avail

OpenMPI will wrap around the modern Intel compilers icx (C), icpx (C++), and ifx (Fortran).

module load intel-oneapi-compilers/2025.0.0
module load openmpi/4.1.7
module avail
If a module is not available for your particular Compiler-MPI combination or you need a different compiler installed please contact HPC support.

Adding Your Own Modules

See Using Your Own Module Files.

Spack

Spack is provided as the spack module to help build your own software.

Migrating from SCC Modules (scc-lmod)

Many modules that were previously available from “Core” now require loading a compiler and MPI module first. Please use module spider to find these.

Many software packages that have a complicated tree of dependencies (including many python packages) have been moved into apptainer containers. Loading the respective module file will print a message that refers to the appropriate documentation page.

Subsections of GWDG Modules (gwdg-lmod)

Software Revision 25.04

To figure out how to best load a module, you can use the module spider command, which will find all occurrences of that module in the hierarchical module system. We recommend that you use the modules compiled with the latest GCC version for maximum performance.

CPU-only systems

The Emmy Core and GCC modules are available on the Skylake, Cascadelake and Sapphirerapids as well as the AMD systems. The Emmy Intel modules are only available on the Cascadelake and Sapphirerapids systems.

GPU systems

The Grete modules are available on all GPU systems.

Subsections of Software Revision 25.04

Emmy Core Modules

These modules are loadable by default. No other modules have to be loaded first.

Tip

The software packages loaded on each phase are optimized for the particular CPU and GPU architecture (machine kind) of that phase (e.g. Intel Cascadelake or Intel Sapphirerapids).

You can print the current machine kind by using the command: /sw/rev_profile/25.04/machine-kind

If you compile your own code (e.g. by using gcc or pip) please take care to compile on the same machine kind that the code will also be executed on.

List of Modules

Module Names | Description | Homepage
amduprof/5.1.701 | AMD uProf (‘MICRO-prof’) is a software profiling analysis tool for x86 applications running on Windows, Linux and FreeBSD operating systems and provides event information unique to the AMD ‘Zen’-based processors and AMD Instinct(tm) MI Series accelerators. AMD uProf enables the developer to better understand the limiters of application performance and evaluate improvements. | https://www.amd.com/en/developer/uprof.html
amira/2022.1 | Amira is a software platform for visualization, processing, and analysis of 3D and 4D data. |
ansys/2023.1, ansys/2023.2, ansys/2024.1, ansys/2024.2 | Ansys offers a comprehensive software suite that spans the entire range of physics, providing access to virtually any field of engineering simulation that a design process requires. | https://www.ansys.com/
ant/1.10.14 | Apache Ant is a Java library and command-line tool whose mission is to drive processes described in build files as targets and extension points dependent upon each other | https://ant.apache.org/
aocc/5.0.0 | The AOCC compiler system is a high performance, production quality code generation tool. The AOCC environment provides various options to developers when building and optimizing C, C++, and Fortran applications targeting 32-bit and 64-bit Linux platforms. The AOCC compiler system offers a high level of advanced optimizations, multi-threading and processor support that includes global optimization, vectorization, inter-procedural analyses, loop transformations, and code generation. AMD also provides highly optimized libraries, which extract the optimal performance from each x86 processor core when utilized. The AOCC Compiler Suite simplifies and accelerates development and tuning for x86 applications. | https://www.amd.com/en/developer/aocc.html
apptainer/1.3.4 | Apptainer is an open source container platform designed to be simple, fast, and secure. Many container platforms are available, but Apptainer is designed for ease-of-use on shared systems and in high performance computing (HPC) environments. | https://apptainer.org
autoconf/2.72 | Autoconf – system configuration part of autotools | https://www.gnu.org/software/autoconf/
autoconf-archive/2023.02.20 | The GNU Autoconf Archive is a collection of more than 500 macros for GNU Autoconf. | https://www.gnu.org/software/autoconf-archive/
automake/1.16.5 | Automake – make file builder part of autotools | https://www.gnu.org/software/automake/
bat/0.24.0 | A cat(1) clone with wings. | https://github.com/sharkdp/bat
bear/3.1.6 | Bear is a tool that generates a compilation database for clang tooling from non-cmake build systems. | https://github.com/rizsotto/Bear
binutils/2.43.1 | GNU binutils, which contain the linker, assembler, objdump and others | https://www.gnu.org/software/binutils/
bison/3.0.5, bison/3.8.2 | Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser employing LALR(1) parser tables. | https://www.gnu.org/software/bison/
cairo/1.16.0 | Cairo is a 2D graphics library with support for multiple output devices. | https://www.cairographics.org/
cmake/3.30.5 | A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software. | https://www.cmake.org
diffutils/3.10 | GNU Diffutils is a package of several programs related to finding differences between files. | https://www.gnu.org/software/diffutils/
elfutils/0.191 | elfutils is a collection of various binary tools such as eu-objdump, eu-readelf, and other utilities that allow you to inspect and manipulate ELF files. Refer to Table 5.Tools Included in elfutils for Red Hat Developer for a complete list of binary tools that are distributed with the Red Hat Developer Toolset version of elfutils. | https://fedorahosted.org/elfutils/
fd/10.2.0 | A simple, fast and user-friendly alternative to ‘find’ | https://github.com/sharkdp/fd
findutils/4.9.0 | The GNU Find Utilities are the basic directory searching utilities of the GNU operating system. | https://www.gnu.org/software/findutils/
fish/3.7.1 | fish is a smart and user-friendly command line shell for OS X, Linux, and the rest of the family. | https://fishshell.com/
flex/2.6.1 | Flex is a tool for generating scanners. | https://github.com/westes/flex
font-util/1.4.1 | X.Org font package creation/installation utilities and fonts. | https://cgit.freedesktop.org/xorg/font/util
fontconfig/2.15.0 | Fontconfig is a library for configuring/customizing font access | https://www.freedesktop.org/wiki/Software/fontconfig/
freesurfer/7.4.1, freesurfer/8.0.0-1 | Freesurfer is an open source software suite for processing and analyzing (human) brain MRI images. |
freetype/2.13.2 | FreeType is a freely available software library to render fonts. It is written in C, designed to be small, efficient, highly customizable, and portable while capable of producing high-quality output (glyph images) of most vector and bitmap font formats. | https://www.freetype.org/index.html
fribidi/1.0.12 | GNU FriBidi: The Free Implementation of the Unicode Bidirectional Algorithm. | https://github.com/fribidi/fribidi
fzf/0.56.2 | fzf is a general-purpose command-line fuzzy finder. | https://github.com/junegunn/fzf
gaussian/16-C.02 | Gaussian is a computer program for computational chemistry | https://gaussian.com/
gawk/5.3.1 | If you are like many computer users, you would frequently like to make changes in various text files wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. To write a program to do this in a language such as C or Pascal is a time-consuming inconvenience that may take many lines of code. The job is easy with awk, especially the GNU implementation: gawk. | https://www.gnu.org/software/gawk/
gcc/11.5.0, gcc/13.2.0, gcc/14.2.0 | The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Ada, and Go, as well as libraries for these languages. | https://gcc.gnu.org
gettext/0.22.5 | GNU internationalization (i18n) and localization (l10n) library. | https://www.gnu.org/software/gettext/
git/2.47.0
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.https://git-scm.com
git-lfs/3.5.1
Git LFS is a system for managing and versioning large files in association with a Git repository. Instead of storing the large files within the Git repository as blobs, Git LFS stores special ‘pointer files’ in the repository, while storing the actual file contents on a Git LFS server.https://git-lfs.github.com
gmake/4.4.1
GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program’s source files.https://www.gnu.org/software/make/
gmp/6.3.0
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.https://gmplib.org
go/1.23.2
The golang compiler and build environmenthttps://go.dev
gobject-introspection/1.78.1
GObject Introspection is used to describe program APIs and collect them in a uniform, machine-readable format.https://wiki.gnome.org/Projects/GObjectIntrospection
harfbuzz/10.0.1
The Harfbuzz package contains an OpenType text shaping engine.https://github.com/harfbuzz/harfbuzz
igv/2.16.2
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.https://software.broadinstitute.org/software/igv/home
imagemagick/7.1.1-39
ImageMagick is a software suite to create, edit, compose, or convert bitmap images.https://www.imagemagick.org
intel-oneapi-compilers/2023.2.4
intel-oneapi-compilers/2025.0.0
Intel oneAPI Compilers. Includes: icx, icpx, ifx, and ifort. Releases before 2024.0 include icc/icpc. LICENSE INFORMATION: By downloading and using this software, you agree to the terms and conditions of the software license agreements at https://intel.ly/393CijO.https://software.intel.com/content/www/us/en/develop/tools/oneapi.html
intel-oneapi-compilers-classic/2021.10.0
Relies on intel-oneapi-compilers to install the compilers, and configures modules for icc/icpc/ifort.https://software.intel.com/content/www/us/en/develop/tools/oneapi.html
jacamar-ci/0.25.0
Jacamar CI is an HPC-focused CI/CD driver for the GitLab custom executor.https://gitlab.com/ecp-ci/jacamar-ci
jq/1.7.1
jq is a lightweight and flexible command-line JSON processor.https://stedolan.github.io/jq/
julia/1.11.1
The Julia Language: A fresh approach to technical computing This package installs the x86_64-linux-gnu version provided by Julia Computinghttps://julialang.org/
lftp/4.9.2
LFTP is a sophisticated file transfer program supporting a number of network protocols (ftp, http, sftp, fish, torrent).https://lftp.yar.ru/
libffi/3.4.6
The libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run time.https://sourceware.org/libffi/
libiconv/1.17
GNU libiconv provides an implementation of the iconv() function and the iconv program for character set conversion.https://www.gnu.org/software/libiconv/
libjpeg/9f
libjpeg is a widely used free library with functions for handling the JPEG image data format. It implements a JPEG codec (encoding and decoding) alongside various utilities for handling JPEG data.http://www.ijg.org
libxml2/2.13.4
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
likwid/5.3.0
Likwid is a simple to install and use toolsuite of command line applications for performance oriented programmers. It works for Intel and AMD processors on the Linux operating system. This version uses the perf_event backend which reduces the feature set but allows user installs. See https://github.com/RRZE-HPC/likwid/wiki/TutorialLikwidPerf#feature-limitations for information.https://hpc.fau.de/research/tools/likwid/
llvm/19.1.3
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Despite its name, LLVM has little to do with traditional virtual machines, though it does provide helpful libraries that can be used to build them. The name ‘LLVM’ itself is not an acronym; it is the full name of the project.https://llvm.org/
lz4/1.10.0
LZ4 is a lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-core CPUs. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.https://lz4.github.io/lz4/
m4/1.4.19
GNU M4 is an implementation of the traditional Unix macro processor.https://www.gnu.org/software/m4/m4.html
mathematica/12.2.0
Mathematica: high-powered computation with thousands of Wolfram Language functions, natural language input, real-world data, mobile support.
matlab/R2021b
matlab/R2022b
matlab/R2023b
matlab/R2024b
MATLAB (MATrix LABoratory) is a multi-paradigm numerical computing environment and fourth-generation programming language. A proprietary programming language developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, C#, Java, Fortran and Python.
matlab-mcr/R2021b_Update_7
matlab-mcr/R2022b_Update_10
matlab-mcr/R2023b_Update_10
matlab-mcr/R2024b_Update_6
MATLAB Runtime runs compiled MATLAB applications or components without installing MATLAB. The MATLAB Runtime is a standalone set of shared libraries that enables the execution of compiled MATLAB applications or components. When used together, MATLAB, MATLAB Compiler, and the MATLAB Runtime enable you to create and distribute numerical applications or software components quickly and securely.
mercurial/6.7.3
Mercurial is a free, distributed source control management tool.https://www.mercurial-scm.org
meson/1.5.1
Meson is a portable open source build system meant to be both extremely fast, and as user friendly as possible.https://mesonbuild.com/
miniforge3/24.3.0-0
Miniforge3 is a minimal installer for conda and mamba specific to conda-forge.https://github.com/conda-forge/miniforge
mkfontdir/1.0.7
mkfontdir creates the fonts.dir files needed by the legacy X server core font system. The current implementation is a simple wrapper script around the mkfontscale program, which must be built and installed first.https://cgit.freedesktop.org/xorg/app/mkfontdir
mkfontscale/1.2.3
mkfontscale creates the fonts.scale and fonts.dir index files used by the legacy X11 font system.https://gitlab.freedesktop.org/xorg/app/mkfontscale
mpfr/4.2.1
The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.https://www.mpfr.org/
ncurses/6.5
The ncurses (new curses) library is a free software emulation of curses in System V Release 4.0, and more. It uses terminfo format, supports pads and color and multiple highlights and forms characters and function-key mapping, and has all the other SYSV-curses enhancements over BSD curses.https://invisible-island.net/ncurses/ncurses.html
ninja/1.12.1
Ninja is a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible.https://ninja-build.org/
openjdk/17.0.11_9
openjdk/17.0.8.1_1
The free and open-source java implementationhttps://openjdk.org/
parallel/20240822
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.https://www.gnu.org/software/parallel/
patchelf/0.17.2
PatchELF is a small utility to modify the dynamic linker and RPATH of ELF executables.https://nixos.org/patchelf.html
perl/5.40.0
Perl 5 is a highly capable, feature-rich programming language with over 27 years of development.https://www.perl.org
perl-list-moreutils/0.430
Provide the stuff missing in List::Utilhttps://metacpan.org/pod/List::MoreUtils
perl-uri/5.08
Uniform Resource Identifiers (absolute and relative)https://metacpan.org/pod/URI
perl-xml-libxml/2.0210
This module is an interface to libxml2, providing XML and HTML parsers with DOM, SAX and XMLReader interfaces, a large subset of DOM Layer 3 interface and a XML::XPath-like interface to XPath API of libxml2. The module is split into several packages which are not described in this section; unless stated otherwise, you only need to use XML::LibXML; in your programs.https://metacpan.org/pod/XML::LibXML
pigz/2.8
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
pkgconf/2.2.0
pkgconf is a program which helps to configure compiler and linker flags for development frameworks. It is similar to pkg-config from freedesktop.org, providing additional functionality while also maintaining compatibility.http://pkgconf.org/
podman/5.5.0
An optionally rootless and daemonless container engine: alias docker=podmanhttps://podman.io
py-pip/23.1.2
The PyPA recommended tool for installing Python packages.https://pip.pypa.io/
py-reportseff/2.7.6
A python script for tabular display of slurm efficiency information.https://github.com/troycomi/reportseff
py-setuptools/69.2.0
A Python utility that aids in the process of downloading, building, upgrading, installing, and uninstalling Python packages.https://github.com/pypa/setuptools
py-wheel/0.41.2
A built-package format for Python.https://github.com/pypa/wheel
python/3.11.9
The Python programming language.https://www.python.org/
python-venv/1.0
A Spack managed Python virtual environmenthttps://docs.python.org/3/library/venv.html
rclone/1.68.1
Rclone is a command line program to sync files and directories to and from various cloud storage providershttps://rclone.org
readline/8.2
The GNU Readline library provides a set of functions for use by applications that allow users to edit command lines as they are typed in. Both Emacs and vi editing modes are available. The Readline library includes additional functions to maintain a list of previously-entered command lines, to recall and perhaps reedit those lines, and perform csh-like history expansion on previous commands.https://tiswww.case.edu/php/chet/readline/rltop.html
ripgrep/14.1.1
ripgrep is a line-oriented search tool that recursively searches your current directory for a regex pattern. ripgrep is similar to other popular search tools like The Silver Searcher, ack and grep.https://github.com/BurntSushi/ripgrep
rust/1.70.0
rust/1.81.0
rust/1.85.0
The Rust programming language toolchain.https://www.rust-lang.org
skopeo/0.1.40
skopeo is a command line utility that performs various operations on container images and image repositories.https://github.com/containers/skopeo
spack/0.23.1
Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, and many supercomputers. Spack is non-destructive: installing a new version of a package does not break existing installations, so many configurations of the same package can coexist.https://spack.io/
spark/3.1.1
spark/3.5.1
Apache Spark is a fast and general engine for large-scale data processing.https://spark.apache.org
sqlite/3.46.0
SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.https://www.sqlite.org
squashfuse/0.5.2
squashfuse - Mount SquashFS archives using Filesystem in USErspace (FUSE)https://github.com/vasi/squashfuse
starccm/18.06.007
starccm/19.04.007
STAR-CCM+: Simcenter STAR-CCM+ is a multiphysics computational fluid dynamics (CFD) simulation software that enables CFD engineers to model the complexity and explore the possibilities of products operating under real-world conditions.https://plm.sw.siemens.com/en-US/simcenter/fluids-thermal-simulation/star-ccm/
subversion/1.14.2
Apache Subversion - an open source version control system.https://subversion.apache.org/
tar/1.34
GNU Tar provides the ability to create tar archives, as well as various other kinds of manipulation.https://www.gnu.org/software/tar/
tcl/8.6.12
Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Open source and business-friendly, Tcl is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible.https://www.tcl.tk/
tcsh/6.24.00
Tcsh is an enhanced but completely compatible version of csh, the C shell. Tcsh is a command language interpreter which can be used both as an interactive login shell and as a shell script command processor. Tcsh includes a command line editor, programmable word completion, spelling correction, a history mechanism, job control and a C language like syntax.https://www.tcsh.org/
texinfo/7.1
Texinfo is the official documentation format of the GNU project.https://www.gnu.org/software/texinfo/
tk/8.6.11
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.https://www.tcl.tk
tkdiff/5.7
TkDiff is a graphical front end to the diff program. It provides a side-by-side view of the differences between two text files, along with several innovative features such as diff bookmarks, a graphical map of differences for quick navigation, and a facility for slicing diff regions to achieve exactly the merge output desired.https://tkdiff.sourceforge.io/
tmolex/2024.1
tmolex/2025
Turbomole package with TmoleX GUI (includes CLI tools).https://www.turbomole.org/
tree/2.1.0
Tree is a recursive directory listing command that produces a depth indented listing of files, which is colorized ala dircolors if the LS_COLORS environment variable is set and output is to tty. Tree has been ported and reported to work under the following operating systems: Linux, FreeBSD, OS X, Solaris, HP/UX, Cygwin, HP Nonstop and OS/2.http://mama.indstate.edu/users/ice/tree/
turbomole/7.8.1
TURBOMOLE: Program Package for ab initio Electronic Structure Calculations.https://www.turbomole.org/
ucsc/2019-12-12
UCSC Genome Browser and Blat application binaries built for standalone command-line use.
uv/0.7.15
An extremely fast Python package and project manager, written in Rust.
valgrind/3.23.0
An instrumentation framework for building dynamic analysis.https://valgrind.org/
vampirserver/10.4.1
Vampir and Score-P provide a performance tool framework with special focus on highly-parallel applications. Performance data is collected from multi-process (MPI, SHMEM), thread-parallel (OpenMP, Pthreads), as well as accelerator-based paradigms (CUDA, HIP, OpenCL, OpenACC).https://vampir.eu/
vim/9.1.0437
Vim is a highly configurable text editor built to enable efficient text editing. It is an improved version of the vi editor distributed with most UNIX systems. Vim is often called a ‘programmer’s editor,’ and so useful for programming that many consider it an entire IDE. It’s not just for programmers, though. Vim is perfect for all kinds of text editing, from composing email to editing configuration files.https://www.vim.org
vmd/1.9.3
VMD provides user-editable materials which can be applied to molecular geometry.https://www.ks.uiuc.edu/Research/vmd/
which/2.21
GNU which - is a utility that is used to find which executable (or alias or shell function) is executed when entered on the shell prompt.https://savannah.gnu.org/projects/which/
xz/5.4.6
XZ Utils is free general-purpose data compression software with high compression ratio. XZ Utils were written for POSIX-like systems, but also work on some not-so-POSIX systems. XZ Utils are the successor to LZMA Utils.https://tukaani.org/xz/
zlib-ng/2.2.1
zlib replacement with optimizations for next generation systems.https://github.com/zlib-ng/zlib-ng
zsh/5.8.1
Zsh is a shell designed for interactive use, although it is also a powerful scripting language. Many of the useful features of bash, ksh, and tcsh were incorporated into zsh; many original features were added.https://www.zsh.org
zstd/1.5.6
Zstandard, or zstd for short, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios.https://facebook.github.io/zstd/

Emmy GCC11 Modules

Because of the hierarchical module system, the appropriate compiler and MPI modules must be loaded before any of the following modules become visible:

module load gcc/11.5.0
module load openmpi/4.1.7
Tip

The software packages loaded on each phase are optimized for the particular CPU and GPU architecture (machine kind) of that phase (e.g. Intel Cascadelake or Intel Sapphirerapids).

You can print the current machine kind by using the command: /sw/rev_profile/25.04/machine-kind

If you compile your own code (e.g. using gcc or pip), please make sure to compile on the same machine kind on which the code will later be executed.
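As a minimal sketch (hello.c stands for your own MPI source file), after loading the two modules above you can build and test an MPI program with the OpenMPI compiler wrapper, e.g. inside an interactive job:

module load gcc/11.5.0
module load openmpi/4.1.7
mpicc -O2 -o hello hello.c    # compile on the target machine kind
mpirun -n 4 ./hello           # quick functional test with 4 ranks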

List of Modules

Module Names | Description | Homepage
abyss/2.3.5
ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size.https://www.bcgsc.ca/platform/bioinfo/software/abyss
aria2/1.36.0
An ultra fast download utilityhttps://aria2.github.io
bcftools/1.16
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.https://samtools.github.io/bcftools/
bedops/2.4.41
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.https://bedops.readthedocs.io
blast-plus/2.14.1
Basic Local Alignment Search Tool.https://blast.ncbi.nlm.nih.gov/
boost/1.83.0
Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.https://www.boost.org
bowtie/1.3.1
Bowtie is an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers.https://sourceforge.net/projects/bowtie-bio/
bwa/0.7.17
Burrows-Wheeler Aligner for pairwise alignment between DNA sequences.https://github.com/lh3/bwa
cdo/2.2.2
CDO is a collection of command line operators to manipulate and analyse Climate and NWP model Data.https://code.mpimet.mpg.de/projects/cdo
charmpp/6.10.2
charmpp/6.10.2-smp
charmpp/7.0.0
charmpp/7.0.0-smp
Charm++ is a parallel programming framework in C++ supported by an adaptive runtime system, which enhances user productivity and allows programs to run portably from small multicore computers (your laptop) to the largest supercomputers.https://charmplusplus.org
cmake/3.30.5
A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.https://www.cmake.org
cp2k/2023.2
cp2k/2024.1
cp2k/2025.1
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systemshttps://www.cp2k.org
cpmd/4.3
The CPMD code is a parallelized plane wave / pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics.https://www.cpmd.org/wordpress/
crest/2.12
Conformer-Rotamer Ensemble Sampling Toolhttps://github.com/crest-lab/crest
diamond/2.1.7
DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.https://ab.inf.uni-tuebingen.de/software/diamond
exciting/oxygen
exciting is a full-potential all-electron density-functional-theory package implementing the families of linearized augmented planewave methods. It can be applied to all kinds of materials, irrespective of the atomic species involved, and also allows for exploring the physics of core electrons. A particular focus is on excited states within many-body perturbation theory.https://exciting-code.org/
ffmpeg/6.0
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.https://ffmpeg.org
fftw/3.3.10
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.https://www.fftw.org
fleur/5.1
fleur/7.2
FLEUR (Full-potential Linearised augmented plane wave in EURope) is a code family for calculating groundstate as well as excited-state properties of solids within the context of density functional theory (DFT).https://www.flapw.de/MaX-5.1
flex/2.6.1
Flex is a tool for generating scanners.https://github.com/westes/flex
foam-extend/4.1
foam-extend/4.1-debug
foam-extend/5.0
The Extend Project is a fork of the OpenFOAM open-source library for Computational Fluid Dynamics (CFD). This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://sourceforge.net/projects/foam-extend/
fribidi/1.0.12
GNU FriBidi: The Free Implementation of the Unicode Bidirectional Algorithm.https://github.com/fribidi/fribidi
gatk/3.8.1
gatk/4.4.0.0
Genome Analysis Toolkit Variant Discovery in High-Throughput Sequencing Datahttps://gatk.broadinstitute.org/hc/en-us
gdal/3.7.3
GDAL: Geospatial Data Abstraction Library.https://www.gdal.org/
gdb/15.2
GDB, the GNU Project debugger, allows you to see what is going on ‘inside’ another program while it executes – or what another program was doing at the moment it crashed.https://www.gnu.org/software/gdb
globalarrays/5.8.2
Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model.https://hpc.pnl.gov/globalarrays/
gmp/6.2.1
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.https://gmplib.org
gnuplot/5.4.3
Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don’t have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986http://www.gnuplot.info
gobject-introspection/1.78.1
GObject Introspection is used to describe program APIs and collect them in a uniform, machine-readable format.https://wiki.gnome.org/Projects/GObjectIntrospection
grads/2.2.3
The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data. GrADS has two data models for handling gridded and station data. GrADS supports many data file formats, including binary (stream or sequential), GRIB (version 1 and 2), NetCDF, HDF (version 4 and 5), and BUFR (for station data).http://cola.gmu.edu/grads/grads.php
gromacs/2019.6
gromacs/2019.6-plumed
gromacs/2022.5-plumed
gromacs/2023-plumed
gromacs/2023.3
GROMACS is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.https://www.gromacs.org
gsl/2.7.1
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.https://www.gnu.org/software/gsl
harfbuzz/10.0.1
The Harfbuzz package contains an OpenType text shaping engine.https://github.com/harfbuzz/harfbuzz
hdf5/1.12.2
hdf5/1.14.5
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high volume and complex data.https://support.hdfgroup.org
hpl/2.3
HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.https://www.netlib.org/benchmark/hpl/
intel-oneapi-advisor/2023.2.0
Intel Advisor is a design and analysis tool for developing performant code. The tool supports C, C++, Fortran, SYCL, OpenMP, OpenCL code, and Python. It helps with the following: Performant CPU Code: Design your application for efficient threading, vectorization, and memory use. Efficient GPU Offload: Identify parts of the code that can be profitably offloaded. Optimize the code for compute and memory.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/advisor.html
intel-oneapi-dal/2023.2.0
Intel oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The library optimizes data ingestion along with algorithmic computation to increase throughput and scalability.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html
intel-oneapi-dnn/2023.2.0
The Intel oneAPI Deep Neural Network Library (oneDNN) helps developers improve productivity and enhance the performance of their deep learning frameworks. It supports key data type formats, including 16 and 32-bit floating point, bfloat16, and 8-bit integers and implements rich operators, including convolution, matrix multiplication, pooling, batch normalization, activation functions, recurrent neural network (RNN) cells, and long short-term memory (LSTM) cells.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onednn.html
intel-oneapi-inspector/2023.2.0
Intel Inspector is a dynamic memory and threading error debugger for C, C++, and Fortran applications that run on Windows and Linux operating systems. Save money: locate the root cause of memory, threading, and persistence errors before you release. Save time: simplify the diagnosis of difficult errors by breaking into the debugger just before the error occurs. Save effort: use your normal debug or production build to catch and debug errors. Check all code, including third-party libraries with unavailable sources.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/inspector.html
intel-oneapi-mkl/2023.2.0
intel-oneapi-mkl/2024.2.2
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL), is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html
intel-oneapi-tbb/2021.10.0
Intel oneAPI Threading Building Blocks (oneTBB) is a flexible performance library that simplifies the work of adding parallelism to complex applications across accelerated architectures, even if you are not a threading expert.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html
intel-oneapi-vtune/2023.2.0
Intel VTune Profiler is a profiler to optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more. CPU, GPU, and FPGA: Tune the entire application’s performance–not just the accelerated portion. Multilingual: Profile SYCL, C, C++, C#, Fortran, OpenCL code, Python, Google Go programming language, Java, .NET, Assembly, or any combination of languages. System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code. Power: Optimize performance while avoiding power and thermal-related throttling.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html
iq-tree/2.2.2.7
IQ-TREE Efficient software for phylogenomic inferencehttp://www.iqtree.org
jags/4.3.0
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGShttps://mcmc-jags.sourceforge.net/
jellyfish/2.3.1
JELLYFISH is a tool for fast, memory-efficient counting of k-mers in DNA.https://www.cbcb.umd.edu/software/jellyfish/
kraken2/2.1.2
Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.https://ccb.jhu.edu/software/kraken2/
lammps/20230802.4
lammps/20230802.4-plumed
LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel Simulator.https://www.lammps.org/
libaec/1.0.6
Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). It implements Golomb-Rice compression method under the BSD license and includes a free drop-in replacement for the SZIP library.https://gitlab.dkrz.de/k202009/libaec
libfabric/2.0.0
The Open Fabrics Interfaces (OFI) is a framework focused on exporting fabric communication services to applications.https://libfabric.org/
libffi/3.4.6
The libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run time.https://sourceware.org/libffi/
libpng/1.6.39
libpng is the official PNG reference library.http://www.libpng.org/pub/png/libpng.html
libxc/6.2.2
Libxc is a library of exchange-correlation functionals for density-functional theory.https://libxc.gitlab.io/
libxml2/2.13.4
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
libxsmm/1.17
Library for specialized dense and sparse matrix operations, and deep learning primitives.https://github.com/hfp/libxsmm
masurca/4.1.0
masurca/4.1.1
MaSuRCA is whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches.https://www.genome.umd.edu/masurca.html
meson/1.5.1
Meson is a portable open source build system meant to be both extremely fast, and as user friendly as possible.https://mesonbuild.com/
metis/5.1.0
METIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices.https://github.com/KarypisLab/METIS
mhm2/2.2.0.0
MetaHipMer (MHM) is a de novo metagenome short-read assembler, which is written in UPC++, CUDA and HIP, and runs efficiently on both single servers and on multinode supercomputers, where it can scale up to coassemble terabase-sized metagenomes.https://bitbucket.org/berkeleylab/mhm2/
molden/6.7
A package for displaying Molecular Density from various Ab Initio packageshttps://www.theochem.ru.nl/molden/
mono/6.12.0.122
Mono is a software platform designed to allow developers to easily create cross platform applications. It is an open source implementation of Microsoft’s .NET Framework based on the ECMA standards for C# and the Common Language Runtime.https://www.mono-project.com/
mpfr/3.1.6
mpfr/4.2.0
The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.https://www.mpfr.org/
mpifileutils/0.11.1
mpiFileUtils is a suite of MPI-based tools to manage large datasets, which may vary from large directory trees to large files. High-performance computing users often generate large datasets with parallel applications that run with many processes (millions in some cases). However those users are then stuck with single-process tools like cp and rm to manage their datasets. This suite provides MPI-based tools to handle typical jobs like copy, remove, and compare for such datasets, providing speedups of up to 20-30x.https://github.com/hpc/mpifileutils
mumps/5.2.0
mumps/5.7.3
MUMPS: a MUltifrontal Massively Parallel sparse direct Solverhttps://mumps-solver.org
muscle5/5.1.0
MUSCLE is widely-used software for making multiple alignments of biological sequences.https://drive5.com/muscle5/
must/1.9.0
MUST detects usage errors of the Message Passing Interface (MPI) and reports them to the user. As MPI calls are complex and usage errors common, this functionality is extremely helpful for application developers that want to develop correct MPI applications. This includes errors that already manifest: segmentation faults or incorrect results as well as many errors that are not visible to the application developer or do not manifest on a certain system or MPI implementation.https://www.i12.rwth-aachen.de/go/id/nrbe
namd/2.14
namd/2.14-smp
namd/3.0.1
namd/3.0.1-smp
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.https://www.ks.uiuc.edu/Research/namd/
nco/5.1.6
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formatshttps://nco.sourceforge.net/
ncview/2.1.9
Simple viewer for NetCDF files.https://cirrus.ucsd.edu/ncview/
netcdf-c/4.9.2
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.https://www.unidata.ucar.edu/software/netcdf
netcdf-fortran/4.6.1
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.https://www.unidata.ucar.edu/software/netcdf
netgen/5.3.1
NETGEN is an automatic 3d tetrahedral mesh generator. It accepts input from constructive solid geometry (CSG) or boundary representation (BRep) from STL file format. The connection to a geometry kernel allows the handling of IGES and STEP files. NETGEN contains modules for mesh optimization and hierarchical mesh refinement.https://ngsolve.org/
netlib-lapack/3.11.0
LAPACK version 3.X is a comprehensive FORTRAN library that does linear algebra operations including matrix inversions, least squared solutions to linear sets of equations, eigenvector analysis, singular value decomposition, etc. It is a very comprehensive and reputable package that has found extensive use in the scientific community.https://www.netlib.org/lapack/
netlib-scalapack/2.2.0
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machineshttps://www.netlib.org/scalapack/
nextflow/23.10.0
Data-driven computational pipelines.https://www.nextflow.io
ninja/1.12.1
Ninja is a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible.https://ninja-build.org/
octave/9.1.0
GNU Octave is a high-level language, primarily intended for numerical computations.https://www.gnu.org/software/octave/
openbabel/3.1.1
Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.https://openbabel.org/docs/index.html
openblas/0.3.24
OpenBLAS: An optimized BLAS libraryhttps://www.openblas.net
opencoarrays/2.10.1
OpenCoarrays is an open-source software project that produces an application binary interface (ABI) supporting coarray Fortran (CAF) compilers, an application programming interface (API) that supports users of non-CAF compilers, and an associated compiler wrapper and program launcher.http://www.opencoarrays.org/
openfoam/2306
openfoam/2312
OpenFOAM is a GPL-open-source C++ CFD-toolbox. This offering is supported by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark. OpenCFD Ltd has been developing and releasing OpenFOAM since its debut in 2004.https://www.openfoam.com/
openfoam-org/10
openfoam-org/6
openfoam-org/7
openfoam-org/8
openfoam-org/9
OpenFOAM is a GPL-open-source C++ CFD-toolbox. The openfoam.org release is managed by the OpenFOAM Foundation Ltd as a licensee of the OPENFOAM trademark. This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://www.openfoam.org/
openmpi/4.1.6
openmpi/4.1.7
An open source Message Passing Interface implementation.https://www.open-mpi.org
osu-micro-benchmarks/7.3
The Ohio MicroBenchmark suite is a collection of independent MPI message passing performance microbenchmarks developed and written at The Ohio State University. It includes traditional benchmarks and performance measures such as latency, bandwidth and host overhead and can be used for both traditional and GPU-enhanced nodes.https://mvapich.cse.ohio-state.edu/benchmarks/
paraview/5.11.2
ParaView is an open-source, multi-platform data analysis and visualization application. This package includes the Catalyst in-situ library for versions 5.7 and greater, otherwise use the catalyst package.https://www.paraview.org
parmetis/4.0.3
ParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices.https://github.com/KarypisLab/ParMETIS
patchelf/0.17.2
PatchELF is a small utility to modify the dynamic linker and RPATH of ELF executables.https://nixos.org/patchelf.html
pbmpi/1.9
A Bayesian software for phylogenetic reconstruction using mixture modelshttps://github.com/bayesiancook/pbmpi
petsc/3.20.1-complex
petsc/3.20.1-real
PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.https://petsc.org
pigz/2.8
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
plink/1.9-beta7.7
PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.https://www.cog-genomics.org/plink/1.9/
precice/3.2.0
preCICE (Precise Code Interaction Coupling Environment) is a coupling library for partitioned multi-physics simulations. Partitioned means that preCICE couples existing programs (solvers) capable of simulating a subpart of the complete physics involved in a simulation.https://precice.org/
proj/9.2.1
PROJ is a generic coordinate transformation software, that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.https://proj.org/
psi4/1.8.2
Psi4 is an open-source suite of ab initio quantum chemistry programs designed for efficient, high-accuracy simulations of a variety of molecular properties.https://www.psicode.org/
python/3.11.9
The Python programming language.https://www.python.org/
quantum-espresso/6.7
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.https://quantum-espresso.org
r/4.4.0
R is ‘GNU S’, a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.https://www.r-project.org
r-codetools/0.2-19
Code analysis tools for R.https://cloud.r-project.org/package=codetools
r-gmp/0.7-1
Multiple Precision Arithmetic.https://cloud.r-project.org/package=gmp
r-raster/3.6-20
Geographic Data Analysis and Modeling.https://cloud.r-project.org/package=raster
r-sf/1.0-12
Simple Features for R.https://cloud.r-project.org/package=sf
r-terra/1.7-78
Spatial Data Analysis.https://cloud.r-project.org/package=terra
relion/3.1.3
relion/4.0.1
relion/5.0.0
RELION (for REgularised LIkelihood OptimisatioN, pronounce rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).https://www2.mrc-lmb.cam.ac.uk/relion
repeatmasker/4.1.5
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.https://www.repeatmasker.org
repeatmodeler/2.0.4
RepeatModeler is a de-novo repeat family identification and modeling package.https://github.com/Dfam-consortium/RepeatModeler
revbayes/1.2.2
Bayesian phylogenetic inference using probabilistic graphical models and an interpreted language.https://revbayes.github.io
rsync/3.4.1
An open source utility that provides fast incremental file transfer.https://rsync.samba.org
salmon/1.10.3
Salmon is a tool for quantifying the expression of transcripts using RNA-seq data.https://combine-lab.github.io/salmon/
samtools/1.17
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position formathttps://www.htslib.org
scala/2.13.1
Scala is a general-purpose programming language providing support for functional programming and a strong static type system. Designed to be concise, many of Scala’s design decisions were designed to build from criticisms of Java.https://www.scala-lang.org/
scalasca/2.6.1
Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks - in particular those concerning communication and synchronization - and offers guidance in exploring their causes.https://www.scalasca.org
scorep/8.3
The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event tracing, and online analysis of HPC applications.https://www.vi-hps.org/projects/score-p
scotch/7.0.4
Scotch is a software package for graph and mesh/hypergraph partitioning, graph clustering, and sparse matrix ordering.https://gitlab.inria.fr/scotch/scotch
siesta/4.0.2
SIESTA performs electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids.https://departments.icmab.es/leem/siesta/
slepc/3.20.1
Scalable Library for Eigenvalue Problem Computations.https://slepc.upv.es
snakemake/7.22.0
Workflow management system to create reproducible and scalable data analyses.https://snakemake.readthedocs.io/en
subread/2.0.6
The Subread software package is a tool kit for processing next-gen sequencing data.https://subread.sourceforge.net/
tk/8.6.11
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.https://www.tcl.tk
transabyss/2.0.1
de novo assembly of RNA-Seq data using ABySS
ucx/1.18.0
a communication library implementing high-performance messaging for MPI/PGAS frameworkshttps://www.openucx.org
udunits/2.2.28
Automated units conversionhttps://www.unidata.ucar.edu/software/udunits
valgrind/3.20.0
An instrumentation framework for building dynamic analysis.https://valgrind.org/
xtb/6.6.0
Semiempirical extended tight binding program packagehttps://xtb-docs.readthedocs.org

Emmy GCC14 Modules

Because of the hierarchical module system, the appropriate compiler and MPI modules must be loaded before any of the following modules become visible:

module load gcc/14.2.0
module load openmpi/4.1.7
Tip

The software packages loaded on each phase are optimized for the particular CPU and GPU architecture (machine kind) of that phase (e.g. Intel Cascadelake or Intel Sapphirerapids).

You can print the current machine kind by using the command: /sw/rev_profile/25.04/machine-kind

If you compile your own code (e.g. using gcc or pip), please make sure to compile on the same machine kind on which the code will later be executed.
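As a short sketch, you can check which packages have become visible in this part of the hierarchy after loading the compiler and MPI modules shown above:

module load gcc/14.2.0
module load openmpi/4.1.7
module avail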

List of Modules

Module Names | Description | Homepage
abyss/2.3.10
ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size.https://www.bcgsc.ca/platform/bioinfo/software/abyss
aria2/1.37.0
An ultra fast download utilityhttps://aria2.github.io
bcftools/1.16
bcftools/1.19
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.https://samtools.github.io/bcftools/
beast2/2.7.4
BEAST is a cross-platform program for Bayesian inference using MCMC of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology.http://beast2.org/
bedops/2.4.41
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.https://bedops.readthedocs.io
bedtools2/2.31.1
Collectively, the bedtools utilities are a swiss-army knife of tools for a wide-range of genomics analysis tasks. The most widely-used tools enable genome arithmetic: that is, set theory on the genome.https://github.com/arq5x/bedtools2
blast-plus/2.16.0
Basic Local Alignment Search Tool.https://blast.ncbi.nlm.nih.gov/
boost/1.86.0
Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.https://www.boost.org
bowtie/1.3.1
Bowtie is an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers.https://sourceforge.net/projects/bowtie-bio/
bowtie2/2.5.4
Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequenceshttps://bowtie-bio.sourceforge.net/bowtie2/index.shtml
bwa/0.7.17
Burrows-Wheeler Aligner for pairwise alignment between DNA sequences.https://github.com/lh3/bwa
cdo/2.4.4
CDO is a collection of command line operators to manipulate and analyse Climate and NWP model Data.https://code.mpimet.mpg.de/projects/cdo
chimera/1.18
UCSF CHIMERA: an Extensible Molecular Modeling Systemhttps://www.cgl.ucsf.edu/chimera/
chimerax/1.9
UCSF ChimeraX (or simply ChimeraX) is the next-generation molecular visualization program from the Resource for Biocomputing, Visualization, and Informatics (RBVI), following UCSF Chimera.https://www.cgl.ucsf.edu/chimerax/
clblast/1.5.2
CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices.https://cnugteren.github.io/clblast/clblast.html
clinfo/3.0.23.01.25
Print all known information about all available OpenCL platforms and devices in the system.https://github.com/Oblomov/clinfo
clpeak/1.1.2
Simple OpenCL performance benchmark tool.https://github.com/krrishnarraj/clpeak
cmake/3.30.5
A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.https://www.cmake.org
cp2k/2023.2
cp2k/2024.1
cp2k/2025.1
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systemshttps://www.cp2k.org
cpmd/4.3
The CPMD code is a parallelized plane wave / pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics.https://www.cpmd.org/wordpress/
crest/2.12
Conformer-Rotamer Ensemble Sampling Toolhttps://github.com/crest-lab/crest
diamond/2.1.10
DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.https://ab.inf.uni-tuebingen.de/software/diamond
dynamo/1.1.552
Dynamo is a software environment for subtomogram averaging of cryo-EM data.https://www.dynamo-em.org/w/index.php?title=Main_Page
eccodes/2.34.0
ecCodes is a package developed by ECMWF for processing meteorological data in GRIB (1/2), BUFR (3/4) and GTS header formats.https://software.ecmwf.int/wiki/display/ECC/ecCodes+Home
exciting/oxygen
exciting is a full-potential all-electron density-functional-theory package implementing the families of linearized augmented planewave methods. It can be applied to all kinds of materials, irrespective of the atomic species involved, and also allows for exploring the physics of core electrons. A particular focus is on excited states within many-body perturbation theory.https://exciting-code.org/
ffmpeg/7.0.2
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.https://ffmpeg.org
fftw/3.3.10
fftw/3.3.10-quad-precision
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.https://www.fftw.org
fiji/2.16.0
Fiji is a ‘batteries-included’ distribution of ImageJ, a popular, free scientific image processing application which includes a lot of plugins organized into a coherent menu structure. Fiji compares to ImageJ as Ubuntu compares to Linux.https://imagej.net/
fleur/5.1
FLEUR (Full-potential Linearised augmented plane wave in EURope) is a code family for calculating groundstate as well as excited-state properties of solids within the context of density functional theory (DFT).https://www.flapw.de/MaX-5.1
flex/2.6.1
Flex is a tool for generating scanners.https://github.com/westes/flex
fribidi/1.0.12
GNU FriBidi: The Free Implementation of the Unicode Bidirectional Algorithm.https://github.com/fribidi/fribidi
gatk/4.5.0.0
Genome Analysis Toolkit Variant Discovery in High-Throughput Sequencing Datahttps://gatk.broadinstitute.org/hc/en-us
gdal/3.10.0
GDAL: Geospatial Data Abstraction Library.https://www.gdal.org/
gdb/15.2
GDB, the GNU Project debugger, allows you to see what is going on ‘inside’ another program while it executes – or what another program was doing at the moment it crashed.https://www.gnu.org/software/gdb
globalarrays/5.8.2
Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model.https://hpc.pnl.gov/globalarrays/
gmp/6.3.0
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.https://gmplib.org
gobject-introspection/1.78.1
GObject Introspection is used to describe program APIs and collect them in a uniform, machine-readable format.https://wiki.gnome.org/Projects/GObjectIntrospection
gromacs/2023-plumed
gromacs/2024.3
GROMACS is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.https://www.gromacs.org
gsl/2.8
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.https://www.gnu.org/software/gsl
harfbuzz/10.0.1
The Harfbuzz package contains an OpenType text shaping engine.https://github.com/harfbuzz/harfbuzz
hdf5/1.12.3
hdf5/1.14.5
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high volume and complex data.https://support.hdfgroup.org
hisat2/2.2.1
HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (whole-genome, transcriptome, and exome sequencing data) against the general human population (as well as against a single reference genome).https://daehwankimlab.github.io/hisat2/
hpl/2.3
HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.https://www.netlib.org/benchmark/hpl/
imagej/1.54p
ImageJ is public domain software for processing and analyzing scientific images. It is written in Java, which allows it to run on many different platforms. For further information, see: The ImageJ website, the primary home of this project. The ImageJ wiki, a community-built knowledge base covering ImageJ and its derivatives and flavors, including ImageJ2, Fiji, and others. The ImageJ mailing list and Image.sc Forum for community support. The Contributing page of the ImageJ wiki for details on how to contribute.https://www.imagej.net
imod/5.0.2
imod/5.1.0
IMOD is a set of image processing, modeling and display programs used for tomographic reconstruction and for 3D reconstruction of EM serial sections and optical sections. The package contains tools for assembling and aligning data within multiple types and sizes of image stacks, viewing 3-D data from any orientation, and modeling and display of the image files. IMOD was developed primarily by David Mastronarde, Rick Gaudette, Sue Held, Jim Kremer, Quanren Xiong, Suraj Khochare, and John Heumann at the University of Colorado.https://bio3d.colorado.edu/imod/
intel-oneapi-advisor/2025.0.0
Intel Advisor is a design and analysis tool for developing performant code. The tool supports C, C++, Fortran, SYCL, OpenMP, OpenCL code, and Python. It helps with the following: Performant CPU Code: Design your application for efficient threading, vectorization, and memory use. Efficient GPU Offload: Identify parts of the code that can be profitably offloaded. Optimize the code for compute and memory.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/advisor.html
intel-oneapi-dal/2025.0.0
Intel oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The library optimizes data ingestion along with algorithmic computation to increase throughput and scalability.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html
intel-oneapi-dnn/2025.0.0
The Intel oneAPI Deep Neural Network Library (oneDNN) helps developers improve productivity and enhance the performance of their deep learning frameworks. It supports key data type formats, including 16 and 32-bit floating point, bfloat16, and 8-bit integers and implements rich operators, including convolution, matrix multiplication, pooling, batch normalization, activation functions, recurrent neural network (RNN) cells, and long short-term memory (LSTM) cells.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onednn.html
intel-oneapi-inspector/2024.1.0
Intel Inspector is a dynamic memory and threading error debugger for C, C++, and Fortran applications that run on Windows and Linux operating systems. Save money: locate the root cause of memory, threading, and persistence errors before you release. Save time: simplify the diagnosis of difficult errors by breaking into the debugger just before the error occurs. Save effort: use your normal debug or production build to catch and debug errors. Check all code, including third-party libraries with unavailable sources.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/inspector.html
intel-oneapi-mkl/2024.2.2
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL), is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html
intel-oneapi-tbb/2022.0.0
Intel oneAPI Threading Building Blocks (oneTBB) is a flexible performance library that simplifies the work of adding parallelism to complex applications across accelerated architectures, even if you are not a threading expert.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html
intel-oneapi-vtune/2025.0.0
Intel VTune Profiler is a profiler to optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more. CPU, GPU, and FPGA: Tune the entire application’s performance–not just the accelerated portion. Multilingual: Profile SYCL, C, C++, C#, Fortran, OpenCL code, Python, Google Go programming language, Java, .NET, Assembly, or any combination of languages. System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code. Power: Optimize performance while avoiding power and thermal-related throttling.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html
iq-tree/2.3.2
IQ-TREE Efficient software for phylogenomic inferencehttp://www.iqtree.org
jags/4.3.2
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGShttps://mcmc-jags.sourceforge.net/
jellyfish/2.3.1
JELLYFISH is a tool for fast, memory-efficient counting of k-mers in DNA.https://www.cbcb.umd.edu/software/jellyfish/
lammps/20230802.4
lammps/20230802.4-plumed
lammps/20240829.1
lammps/20240829.1-plumed
LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel Simulator.https://www.lammps.org/
libaec/1.0.6
Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). It implements Golomb-Rice compression method under the BSD license and includes a free drop-in replacement for the SZIP library.https://gitlab.dkrz.de/k202009/libaec
libfabric/2.0.0
The Open Fabrics Interfaces (OFI) is a framework focused on exporting fabric communication services to applications.https://libfabric.org/
libffi/3.4.6
The libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run time.https://sourceware.org/libffi/
libpng/1.6.39
libpng is the official PNG reference library.http://www.libpng.org/pub/png/libpng.html
libxml2/2.13.4
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
libxsmm/1.17
Library for specialized dense and sparse matrix operations, and deep learning primitives.https://github.com/hfp/libxsmm
meson/1.5.1
Meson is a portable open source build system meant to be both extremely fast, and as user friendly as possible.https://mesonbuild.com/
metis/5.1.0
METIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices.https://github.com/KarypisLab/METIS
mono/6.12.0.122
Mono is a software platform designed to allow developers to easily create cross platform applications. It is an open source implementation of Microsoft’s .NET Framework based on the ECMA standards for C# and the Common Language Runtime.https://www.mono-project.com/
mpfr/4.2.1
The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.https://www.mpfr.org/
mumps/5.7.3
MUMPS: a MUltifrontal Massively Parallel sparse direct Solverhttps://mumps-solver.org
muscle5/5.1.0
MUSCLE is widely-used software for making multiple alignments of biological sequences.https://drive5.com/muscle5/
nco/5.2.4
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formatshttps://nco.sourceforge.net/
ncview/2.1.9
Simple viewer for NetCDF files.https://cirrus.ucsd.edu/ncview/
netcdf-c/4.9.2
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.https://www.unidata.ucar.edu/software/netcdf
netcdf-fortran/4.6.1
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.https://www.unidata.ucar.edu/software/netcdf
netgen/5.3.1
NETGEN is an automatic 3d tetrahedral mesh generator. It accepts input from constructive solid geometry (CSG) or boundary representation (BRep) from STL file format. The connection to a geometry kernel allows the handling of IGES and STEP files. NETGEN contains modules for mesh optimization and hierarchical mesh refinement.https://ngsolve.org/
netlib-lapack/3.11.0
LAPACK version 3.X is a comprehensive FORTRAN library that performs linear algebra operations including matrix inversions, least-squares solutions to linear sets of equations, eigenvector analysis, singular value decomposition, etc. It is a very comprehensive and reputable package that has found extensive use in the scientific community.https://www.netlib.org/lapack/
netlib-scalapack/2.2.0
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machineshttps://www.netlib.org/scalapack/
nextflow/24.10.0
Data-driven computational pipelines.https://www.nextflow.io
ninja/1.12.1
Ninja is a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible.https://ninja-build.org/
ocl-icd/2.3.2
This package aims at creating an Open Source alternative to vendor specific OpenCL ICD loaders.https://github.com/OCL-dev/ocl-icd
octave/9.1.0
GNU Octave is a high-level language, primarily intended for numerical computations.https://www.gnu.org/software/octave/
openbabel/3.1.1
Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.https://openbabel.org/docs/index.html
openblas/0.3.28
OpenBLAS: An optimized BLAS libraryhttps://www.openblas.net
opencl-c-headers/2024.05.08
OpenCL (Open Computing Language) C header fileshttps://www.khronos.org/registry/OpenCL/
opencl-clhpp/2.0.16
C++ headers for OpenCL developmenthttps://www.khronos.org/registry/OpenCL/
opencl-headers/3.0
Bundled OpenCL (Open Computing Language) header fileshttps://www.khronos.org/registry/OpenCL/
opencoarrays/2.10.2
OpenCoarrays is an open-source software project that produces an application binary interface (ABI) supporting coarray Fortran (CAF) compilers, an application programming interface (API) that supports users of non-CAF compilers, and an associated compiler wrapper and program launcher.http://www.opencoarrays.org/
openfast/3.5.5
Wind turbine simulation package from NRELhttps://openfast.readthedocs.io/
openfoam-org/10
openfoam-org/11
OpenFOAM is a GPL-open-source C++ CFD-toolbox. The openfoam.org release is managed by the OpenFOAM Foundation Ltd as a licensee of the OPENFOAM trademark. This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://www.openfoam.org/
openmpi/4.1.7
openmpi/5.0.6
An open source Message Passing Interface implementation.https://www.open-mpi.org
osu-micro-benchmarks/7.5
The Ohio MicroBenchmark suite is a collection of independent MPI message passing performance microbenchmarks developed and written at The Ohio State University. It includes traditional benchmarks and performance measures such as latency, bandwidth and host overhead and can be used for both traditional and GPU-enhanced nodes.https://mvapich.cse.ohio-state.edu/benchmarks/
paraview/5.13.2
paraview/5.13.2-gui
ParaView is an open-source, multi-platform data analysis and visualization application. This package includes the Catalyst in-situ library for versions 5.7 and greater, otherwise use the catalyst package.https://www.paraview.org
parmetis/4.0.3
ParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices.https://github.com/KarypisLab/ParMETIS
patchelf/0.17.2
PatchELF is a small utility to modify the dynamic linker and RPATH of ELF executables.https://nixos.org/patchelf.html
pbmpi/1.9
A Bayesian software for phylogenetic reconstruction using mixture modelshttps://github.com/bayesiancook/pbmpi
picard/3.3.0
Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.https://broadinstitute.github.io/picard/
pigz/2.8
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
plink/1.9-beta7.7
PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.https://www.cog-genomics.org/plink/1.9/
pocl/5.0
Portable Computing Language (pocl) is an open source implementation of the OpenCL standard which can be easily adapted for new targets and devices, both for homogeneous CPU and heterogeneous GPUs/accelerators.https://portablecl.org
proj/9.4.1
PROJ is a generic coordinate transformation software, that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.https://proj.org/
psi4/1.9.1
Psi4 is an open-source suite of ab initio quantum chemistry programs designed for efficient, high-accuracy simulations of a variety of molecular properties.https://www.psicode.org/
py-uv/0.4.27
An extremely fast Python package and project manager, written in Rust.https://github.com/astral-sh/uv
python/3.11.9
The Python programming language.https://www.python.org/
qt/5.15.15
Qt is a comprehensive cross-platform C++ application framework.https://qt.io
quantum-espresso/7.4
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.https://quantum-espresso.org
r/4.4.1
R is ‘GNU S’, a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.https://www.r-project.org
raxml-ng/1.1.0
raxml-ng/1.2.2
RAxML-NG is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion.https://github.com/amkozlov/raxml-ng/wiki
repeatmasker/4.1.5
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.https://www.repeatmasker.org
revbayes/1.2.2
Bayesian phylogenetic inference using probabilistic graphical models and an interpreted language.https://revbayes.github.io
salmon/1.10.3
Salmon is a tool for quantifying the expression of transcripts using RNA-seq data.https://combine-lab.github.io/salmon/
samtools/1.19.2
samtools/1.21
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position formathttps://www.htslib.org
scala/2.13.14
Scala is a general-purpose programming language providing support for functional programming and a strong static type system. Designed to be concise, many of Scala's design decisions were intended to address criticisms of Java.https://www.scala-lang.org/
scalasca/2.6.1
Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks - in particular those concerning communication and synchronization - and offers guidance in exploring their causes.https://www.scalasca.org
scorep/8.4
The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event tracing, and online analysis of HPC applications.https://www.vi-hps.org/projects/score-p
scotch/7.0.4
Scotch is a software package for graph and mesh/hypergraph partitioning, graph clustering, and sparse matrix ordering.https://gitlab.inria.fr/scotch/scotch
star/2.7.11b
STAR is an ultrafast universal RNA-seq aligner.https://github.com/alexdobin/STAR
subread/2.0.6
The Subread software package is a tool kit for processing next-gen sequencing data.https://subread.sourceforge.net/
superlu-dist/9.1.0
A general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines.https://crd-legacy.lbl.gov/~xiaoye/SuperLU/
tk/8.6.11
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.https://www.tcl.tk
transabyss/2.0.1
de novo assembly of RNA-Seq data using ABySS
ucx/1.18.0
a communication library implementing high-performance messaging for MPI/PGAS frameworkshttps://www.openucx.org
udunits/2.2.28
Automated units conversionhttps://www.unidata.ucar.edu/software/udunits
valgrind/3.23.0
An instrumentation framework for building dynamic analysis.https://valgrind.org/
vcftools/0.1.16
VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.https://vcftools.github.io/
xtb/6.6.0
Semiempirical extended tight binding program packagehttps://xtb-docs.readthedocs.org
yambo/5.2.4
yambo/5.2.4-dp
yambo/5.3.0
yambo/5.3.0-dp
Yambo is a FORTRAN/C code for Many-Body calculations in solid state and molecular physics.https://www.yambo-code.org/

Emmy Intel Modules

The appropriate compiler and MPI modules must be loaded before any of the modules below become visible (hierarchical module system):

Intel Compiler + MPI options (choose one combination):

With Intel MPI:
module load intel-oneapi-compilers/2025.0.0
module load intel-oneapi-mpi/2021.14.0

With Open MPI:
module load intel-oneapi-compilers/2025.0.0
module load openmpi/4.1.7
Tip

The software packages loaded on each phase are optimized for the particular CPU and GPU architecture (machine kind) of that phase (e.g. Intel Cascade Lake or Intel Sapphire Rapids).

You can print the current machine kind by running the command: /sw/rev_profile/25.04/machine-kind

If you compile your own code (e.g. with gcc or pip), please take care to compile on the same machine kind that the code will later be executed on.
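As a minimal sketch of this workflow (the source file my_app.c, the build directory layout, and the compiler flags are illustrative assumptions, not prescribed by the system):

# Print the machine kind of the current node and build into a matching directory
KIND=$(/sw/rev_profile/25.04/machine-kind)
module load intel-oneapi-compilers/2025.0.0
module load intel-oneapi-mpi/2021.14.0
mkdir -p "$HOME/build/$KIND" && cd "$HOME/build/$KIND"
# mpiicx is Intel MPI's wrapper around the oneAPI icx compiler
mpiicx -O3 -xHost -o my_app my_app.c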

List of Modules

Module Names | Description | Homepage
cdo/2.2.2
cdo/2.2.2-hdf5-1.10
CDO is a collection of command line operators to manipulate and analyse Climate and NWP model Data.https://code.mpimet.mpg.de/projects/cdo
cmake/3.30.5
A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.https://www.cmake.org
eccodes/2.38.0
eccodes/2.38.0-hdf5-1.10
ecCodes is a package developed by ECMWF for processing meteorological data in GRIB (1/2), BUFR (3/4) and GTS header formats.https://software.ecmwf.int/wiki/display/ECC/ecCodes+Home
fftw/3.3.10
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.https://www.fftw.org
flex/2.6.1
Flex is a tool for generating scanners.https://github.com/westes/flex
gdb/15.2
GDB, the GNU Project debugger, allows you to see what is going on ‘inside’ another program while it executes – or what another program was doing at the moment it crashed.https://www.gnu.org/software/gdb
globalarrays/5.8.2
Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model.https://hpc.pnl.gov/globalarrays/
hdf5/1.10.7
hdf5/1.10.7-precise-fp
hdf5/1.12.3
hdf5/1.14.5
hdf5/1.14.5-precise-fp
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high volume and complex data.https://support.hdfgroup.org
hpl/2.3
HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.https://www.netlib.org/benchmark/hpl/
intel-oneapi-advisor/2025.0.0
Intel Advisor is a design and analysis tool for developing performant code. The tool supports C, C++, Fortran, SYCL, OpenMP, OpenCL code, and Python. It helps with the following: Performant CPU Code: Design your application for efficient threading, vectorization, and memory use. Efficient GPU Offload: Identify parts of the code that can be profitably offloaded. Optimize the code for compute and memory.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/advisor.html
intel-oneapi-compilers-llvm/2025.0.0
The internal LLVM components of the Intel oneAPI Compilers. Includes: clang, clang++, llvm-ar, llvm-profgen, …https://software.intel.com/content/www/us/en/develop/tools/oneapi.html
intel-oneapi-dal/2025.0.0
Intel oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The library optimizes data ingestion along with algorithmic computation to increase throughput and scalability.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html
intel-oneapi-debugger/2025.0.0
Intel® oneAPI Application Debugger (gdb-oneapi)https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-gdb.html
intel-oneapi-dnn/2025.0.0
The Intel oneAPI Deep Neural Network Library (oneDNN) helps developers improve productivity and enhance the performance of their deep learning frameworks. It supports key data type formats, including 16 and 32-bit floating point, bfloat16, and 8-bit integers and implements rich operators, including convolution, matrix multiplication, pooling, batch normalization, activation functions, recurrent neural network (RNN) cells, and long short-term memory (LSTM) cells.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onednn.html
intel-oneapi-mkl/2024.2.2
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL), is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html
intel-oneapi-mpi/2021.14.0
Intel MPI Library is a multifabric message-passing library that implements the open-source MPICH specification. Use the library to create, maintain, and test advanced, complex applications that perform better on high-performance computing (HPC) clusters based on Intel processors.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/mpi-library.html
intel-oneapi-tbb/2022.0.0
Intel oneAPI Threading Building Blocks (oneTBB) is a flexible performance library that simplifies the work of adding parallelism to complex applications across accelerated architectures, even if you are not a threading expert.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html
intel-oneapi-vtune/2025.0.0
Intel VTune Profiler is a profiler to optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more. CPU, GPU, and FPGA: Tune the entire application’s performance–not just the accelerated portion. Multilingual: Profile SYCL, C, C++, C#, Fortran, OpenCL code, Python, Google Go programming language, Java, .NET, Assembly, or any combination of languages. System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code. Power: Optimize performance while avoiding power and thermal-related throttling.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html
libaec/1.0.6
Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). It implements Golomb-Rice compression method under the BSD license and includes a free drop-in replacement for the SZIP library.https://gitlab.dkrz.de/k202009/libaec
libfabric/2.0.0
The Open Fabrics Interfaces (OFI) is a framework focused on exporting fabric communication services to applications.https://libfabric.org/
libffi/3.4.6
The libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run time.https://sourceware.org/libffi/
libxml2/2.13.4
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
meson/1.5.1
Meson is a portable open source build system meant to be both extremely fast, and as user friendly as possible.https://mesonbuild.com/
metis/5.1.0
METIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices.https://github.com/KarypisLab/METIS
mumps/5.7.3
MUMPS: a MUltifrontal Massively Parallel sparse direct Solverhttps://mumps-solver.org
nco/5.1.6
nco/5.1.6-hdf5-1.10
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formatshttps://nco.sourceforge.net/
ncview/2.1.9
Simple viewer for NetCDF files.https://cirrus.ucsd.edu/ncview/
netcdf-c/4.9.2
netcdf-c/4.9.2-precise-fp
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.https://www.unidata.ucar.edu/software/netcdf
netcdf-fortran/4.5.3-hdf5-1.10
netcdf-fortran/4.5.3-hdf5-1.10-precise-fp
netcdf-fortran/4.6.1
netcdf-fortran/4.6.1-precise-fp
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.https://www.unidata.ucar.edu/software/netcdf
ninja/1.12.1
Ninja is a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible.https://ninja-build.org/
openmpi/4.1.7
An open source Message Passing Interface implementation.https://www.open-mpi.org
osu-micro-benchmarks/7.3
The Ohio MicroBenchmark suite is a collection of independent MPI message passing performance microbenchmarks developed and written at The Ohio State University. It includes traditional benchmarks and performance measures such as latency, bandwidth and host overhead and can be used for both traditional and GPU-enhanced nodes.https://mvapich.cse.ohio-state.edu/benchmarks/
parmetis/4.0.3
ParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices.https://github.com/KarypisLab/ParMETIS
petsc/3.20.1-complex
petsc/3.20.1-real
PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.https://petsc.org
pigz/2.8
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
quantum-espresso/7.4
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.https://quantum-espresso.org
rsync/3.4.1
An open source utility that provides fast incremental file transfer.https://rsync.samba.org
scotch/7.0.4
Scotch is a software package for graph and mesh/hypergraph partitioning, graph clustering, and sparse matrix ordering.https://gitlab.inria.fr/scotch/scotch
ucx/1.18.0
a communication library implementing high-performance messaging for MPI/PGAS frameworkshttps://www.openucx.org
udunits/2.2.28
Automated units conversionhttps://www.unidata.ucar.edu/software/udunits
valgrind/3.23.0
An instrumentation framework for building dynamic analysis.https://valgrind.org/
vasp/6.4.3
vasp/6.5.0
vasp/6.5.1
The Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.https://vasp.at
wannier90/3.1.0
Wannier90 calculates maximally-localised Wannier functions (MLWFs).https://wannier.org

Grete Core Modules

These modules are loadable by default. No other modules have to be loaded first.
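For example, the core modules can be listed and loaded directly, without loading a compiler or MPI module first (the chosen module is only an illustration):

module avail
module load cmake/3.30.5
cmake --version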

Tip

The software packages loaded on each phase are optimized for the particular CPU and GPU architecture (machine kind) of that phase (e.g. AMD Rome + A100 or Intel Sapphire Rapids + H100).

You can print the current machine kind by running the command: /sw/rev_profile/25.04/machine-kind

If you compile your own code (e.g. with gcc or pip), please take care to compile on the same machine kind that the code will later be executed on.
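One way to honor this in job scripts is to keep a separate build per machine kind and pick the matching binary at run time; a minimal sketch, assuming a per-machine-kind build layout and a hypothetical binary name my_app:

KIND=$(/sw/rev_profile/25.04/machine-kind)
BIN="$HOME/build/$KIND/my_app"
if [ -x "$BIN" ]; then
    "$BIN"
else
    echo "No build found for machine kind '$KIND'; compile on a matching node first." >&2
    exit 1
fi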

List of Modules

Module Names | Description | Homepage
amduprof/5.1.701
AMD uProf (‘MICRO-prof’) is a software profiling analysis tool for x86 applications running on Windows, Linux and FreeBSD operating systems and provides event information unique to the AMD ‘Zen’-based processors and AMD Instinct(tm) MI Series accelerators. AMD uProf enables the developer to better understand the limiters of application performance and evaluate improvements.https://www.amd.com/en/developer/uprof.html
amira/2022.1
Amira is a software platform for visualization, processing, and analysis of 3D and 4D data.
ansys/2023.1
ansys/2023.2
ansys/2024.1
ansys/2024.2
Ansys offers a comprehensive software suite that spans the entire range of physics, providing access to virtually any field of engineering simulation that a design process requires.https://www.ansys.com/
ant/1.10.14
Apache Ant is a Java library and command-line tool whose mission is to drive processes described in build files as targets and extension points dependent upon each otherhttps://ant.apache.org/
aocc/5.0.0
The AOCC compiler system is a high performance, production quality code generation tool. The AOCC environment provides various options to developers when building and optimizing C, C++, and Fortran applications targeting 32-bit and 64-bit Linux platforms. The AOCC compiler system offers a high level of advanced optimizations, multi-threading and processor support that includes global optimization, vectorization, inter-procedural analyses, loop transformations, and code generation. AMD also provides highly optimized libraries, which extract the optimal performance from each x86 processor core when utilized. The AOCC Compiler Suite simplifies and accelerates development and tuning for x86 applications.https://www.amd.com/en/developer/aocc.html
apptainer/1.3.4
Apptainer is an open source container platform designed to be simple, fast, and secure. Many container platforms are available, but Apptainer is designed for ease-of-use on shared systems and in high performance computing (HPC) environments.https://apptainer.org
autoconf/2.72
Autoconf – system configuration part of autotoolshttps://www.gnu.org/software/autoconf/
autoconf-archive/2023.02.20
The GNU Autoconf Archive is a collection of more than 500 macros for GNU Autoconf.https://www.gnu.org/software/autoconf-archive/
automake/1.16.5
Automake – make file builder part of autotoolshttps://www.gnu.org/software/automake/
bat/0.24.0
A cat(1) clone with wings.https://github.com/sharkdp/bat
bear/3.1.6
Bear is a tool that generates a compilation database for clang tooling from non-cmake build systems.https://github.com/rizsotto/Bear
binutils/2.43.1
GNU binutils, which contain the linker, assembler, objdump and othershttps://www.gnu.org/software/binutils/
bison/3.0.5
bison/3.8.2
Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser employing LALR(1) parser tables.https://www.gnu.org/software/bison/
cairo/1.16.0
Cairo is a 2D graphics library with support for multiple output devices.https://www.cairographics.org/
cmake/3.30.5
A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.https://www.cmake.org
diffutils/3.10
GNU Diffutils is a package of several programs related to finding differences between files.https://www.gnu.org/software/diffutils/
elfutils/0.191
elfutils is a collection of various binary tools such as eu-objdump, eu-readelf, and other utilities that allow you to inspect and manipulate ELF files. Refer to Table 5.Tools Included in elfutils for Red Hat Developer for a complete list of binary tools that are distributed with the Red Hat Developer Toolset version of elfutils.https://fedorahosted.org/elfutils/
fd/10.2.0
A simple, fast and user-friendly alternative to ‘find’https://github.com/sharkdp/fd
findutils/4.9.0
The GNU Find Utilities are the basic directory searching utilities of the GNU operating system.https://www.gnu.org/software/findutils/
fish/3.7.1
fish is a smart and user-friendly command line shell for OS X, Linux, and the rest of the family.https://fishshell.com/
flex/2.6.1
Flex is a tool for generating scanners.https://github.com/westes/flex
font-util/1.4.1
X.Org font package creation/installation utilities and fonts.https://cgit.freedesktop.org/xorg/font/util
fontconfig/2.15.0
Fontconfig is a library for configuring/customizing font accesshttps://www.freedesktop.org/wiki/Software/fontconfig/
freesurfer/7.4.1
freesurfer/8.0.0-1
Freesurfer is an open source software suite for processing and analyzing (human) brain MRI images.
freetype/2.13.2
FreeType is a freely available software library to render fonts. It is written in C, designed to be small, efficient, highly customizable, and portable while capable of producing high-quality output (glyph images) of most vector and bitmap font formats.https://www.freetype.org/index.html
fribidi/1.0.12
GNU FriBidi: The Free Implementation of the Unicode Bidirectional Algorithm.https://github.com/fribidi/fribidi
fzf/0.56.2
fzf is a general-purpose command-line fuzzy finder.https://github.com/junegunn/fzf
gawk/5.3.1
If you are like many computer users, you would frequently like to make changes in various text files wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. To write a program to do this in a language such as C or Pascal is a time-consuming inconvenience that may take many lines of code. The job is easy with awk, especially the GNU implementation: gawk.https://www.gnu.org/software/gawk/
gcc/11.5.0
gcc/13.2.0
gcc/13.2.0-nvptx
The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Ada, and Go, as well as libraries for these languages.https://gcc.gnu.org
gettext/0.22.5
GNU internationalization (i18n) and localization (l10n) library.https://www.gnu.org/software/gettext/
git/2.47.0
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.https://git-scm.com
git-lfs/3.5.1
Git LFS is a system for managing and versioning large files in association with a Git repository. Instead of storing the large files within the Git repository as blobs, Git LFS stores special ‘pointer files’ in the repository, while storing the actual file contents on a Git LFS server.https://git-lfs.github.com
gmake/4.4.1
GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program’s source files.https://www.gnu.org/software/make/
go/1.23.2
The golang compiler and build environmenthttps://go.dev
gobject-introspection/1.78.1
The GObject Introspection is used to describe the program APIs and collect them in a uniform, machine readable format.https://wiki.gnome.org/Projects/GObjectIntrospection
harfbuzz/10.0.1
The Harfbuzz package contains an OpenType text shaping engine.https://github.com/harfbuzz/harfbuzz
igv/2.16.2
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.https://software.broadinstitute.org/software/igv/home
imagemagick/7.1.1-39
ImageMagick is a software suite to create, edit, compose, or convert bitmap images.https://www.imagemagick.org
intel-oneapi-compilers/2023.2.4
intel-oneapi-compilers/2025.0.0
Intel oneAPI Compilers. Includes: icx, icpx, ifx, and ifort. Releases before 2024.0 include icc/icpc LICENSE INFORMATION: By downloading and using this software, you agree to the terms and conditions of the software license agreements at https://intel.ly/393CijO.https://software.intel.com/content/www/us/en/develop/tools/oneapi.html
intel-oneapi-compilers-classic/2021.10.0
Relies on intel-oneapi-compilers to install the compilers, and configures modules for icc/icpc/ifort.https://software.intel.com/content/www/us/en/develop/tools/oneapi.html
jacamar-ci/0.25.0
Jacamar CI is a HPC focused CI/CD driver for the GitLab custom executor.https://gitlab.com/ecp-ci/jacamar-ci
jq/1.7.1
jq is a lightweight and flexible command-line JSON processor.https://stedolan.github.io/jq/
julia/1.11.1
The Julia Language: A fresh approach to technical computing This package installs the x86_64-linux-gnu version provided by Julia Computinghttps://julialang.org/
lftp/4.9.2
LFTP is a sophisticated file transfer program supporting a number of network protocols (ftp, http, sftp, fish, torrent).https://lftp.yar.ru/
libffi/3.4.6
The libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run time.https://sourceware.org/libffi/
libiconv/1.17
GNU libiconv provides an implementation of the iconv() function and the iconv program for character set conversion.https://www.gnu.org/software/libiconv/
libjpeg/9f
libjpeg is a widely used free library with functions for handling the JPEG image data format. It implements a JPEG codec (encoding and decoding) alongside various utilities for handling JPEG data.http://www.ijg.org
libxml2/2.13.4
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
likwid/5.3.0
Likwid is a simple to install and use toolsuite of command line applications for performance oriented programmers. It works for Intel and AMD processors on the Linux operating system. This version uses the perf_event backend which reduces the feature set but allows user installs. See https://github.com/RRZE-HPC/likwid/wiki/TutorialLikwidPerf#feature-limitations for information.https://hpc.fau.de/research/tools/likwid/
llvm/19.1.3
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Despite its name, LLVM has little to do with traditional virtual machines, though it does provide helpful libraries that can be used to build them. The name ‘LLVM’ itself is not an acronym; it is the full name of the project.https://llvm.org/
lz4/1.10.0
LZ4 is lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-cores CPU. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.https://lz4.github.io/lz4/
m4/1.4.19
GNU M4 is an implementation of the traditional Unix macro processor.https://www.gnu.org/software/m4/m4.html
mathematica/12.2.0
Mathematica: high-powered computation with thousands of Wolfram Language functions, natural language input, real-world data, mobile support.
matlab/R2021b
matlab/R2022b
matlab/R2023b
matlab/R2024b
MATLAB (MATrix LABoratory) is a multi-paradigm numerical computing environment and fourth-generation programming language. A proprietary programming language developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, C#, Java, Fortran and Python.
matlab-mcr/R2021b_Update_7
matlab-mcr/R2022b_Update_10
matlab-mcr/R2023b_Update_10
matlab-mcr/R2024b_Update_6
MATLAB Runtime runs compiled MATLAB applications or components without installing MATLAB. The MATLAB Runtime is a standalone set of shared libraries that enables the execution of compiled MATLAB applications or components. When used together, MATLAB, MATLAB Compiler, and the MATLAB Runtime enable you to create and distribute numerical applications or software components quickly and securely.
mercurial/6.7.3
Mercurial is a free, distributed source control management tool.https://www.mercurial-scm.org
meson/1.5.1
Meson is a portable open source build system meant to be both extremely fast, and as user friendly as possible.https://mesonbuild.com/
miniforge3/24.3.0-0
Miniforge3 is a minimal installer for conda and mamba specific to conda-forge.https://github.com/conda-forge/miniforge
mkfontdir/1.0.7
mkfontdir creates the fonts.dir files needed by the legacy X server core font system. The current implementation is a simple wrapper script around the mkfontscale program, which must be built and installed first.https://cgit.freedesktop.org/xorg/app/mkfontdir
mkfontscale/1.2.3
mkfontscale creates the fonts.scale and fonts.dir index files used by the legacy X11 font system.https://gitlab.freedesktop.org/xorg/app/mkfontscale
mpfr/4.2.1
The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.https://www.mpfr.org/
ncurses/6.5
The ncurses (new curses) library is a free software emulation of curses in System V Release 4.0, and more. It uses terminfo format, supports pads and color and multiple highlights and forms characters and function-key mapping, and has all the other SYSV-curses enhancements over BSD curses.https://invisible-island.net/ncurses/ncurses.html
ninja/1.12.1
Ninja is a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible.https://ninja-build.org/
nvhpc/24.9
The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries and tools essential to maximizing developer productivity and the performance and portability of HPC applications. The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC directives, and CUDA. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming. Performance profiling and debugging tools simplify porting and optimization of HPC applications.https://developer.nvidia.com/hpc-sdk
openjdk/17.0.11_9
openjdk/17.0.8.1_1
The free and open-source java implementationhttps://openjdk.org/
parallel/20240822
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.https://www.gnu.org/software/parallel/
patchelf/0.17.2
PatchELF is a small utility to modify the dynamic linker and RPATH of ELF executables.https://nixos.org/patchelf.html
perl/5.40.0
Perl 5 is a highly capable, feature-rich programming language with over 27 years of development.https://www.perl.org
perl-list-moreutils/0.430
Provide the stuff missing in List::Utilhttps://metacpan.org/pod/List::MoreUtils
perl-uri/5.08
Uniform Resource Identifiers (absolute and relative)https://metacpan.org/pod/URI
perl-xml-libxml/2.0210
This module is an interface to libxml2, providing XML and HTML parsers with DOM, SAX and XMLReader interfaces, a large subset of DOM Layer 3 interface and a XML::XPath-like interface to XPath API of libxml2. The module is split into several packages which are not described in this section; unless stated otherwise, you only need to use XML::LibXML; in your programs.https://metacpan.org/pod/XML::LibXML
pigz/2.8
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
pkgconf/2.2.0
pkgconf is a program which helps to configure compiler and linker flags for development frameworks. It is similar to pkg-config from freedesktop.org, providing additional functionality while also maintaining compatibility.http://pkgconf.org/
podman/5.5.0
An optionally rootless and daemonless container engine: alias docker=podmanhttps://podman.io
py-reportseff/2.7.6
A python script for tabular display of slurm efficiency information.https://github.com/troycomi/reportseff
rclone/1.68.1
Rclone is a command line program to sync files and directories to and from various cloud storage providershttps://rclone.org
readline/8.2
The GNU Readline library provides a set of functions for use by applications that allow users to edit command lines as they are typed in. Both Emacs and vi editing modes are available. The Readline library includes additional functions to maintain a list of previously-entered command lines, to recall and perhaps reedit those lines, and perform csh-like history expansion on previous commands.https://tiswww.case.edu/php/chet/readline/rltop.html
ripgrep/14.1.1
ripgrep is a line-oriented search tool that recursively searches your current directory for a regex pattern. ripgrep is similar to other popular search tools like The Silver Searcher, ack and grep.https://github.com/BurntSushi/ripgrep
rust/1.70.0
rust/1.81.0
rust/1.85.0
The Rust programming language toolchain.https://www.rust-lang.org
skopeo/0.1.40
skopeo is a command line utility that performs various operations on container images and image repositories.https://github.com/containers/skopeo
spack/0.23.1
Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, and many supercomputers. Spack is non-destructive: installing a new version of a package does not break existing installations, so many configurations of the same package can coexist.https://spack.io/
spark/3.1.1
spark/3.5.1
Apache Spark is a fast and general engine for large-scale data processing.https://spark.apache.org
sqlite/3.46.0
SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.https://www.sqlite.org
squashfuse/0.5.2
squashfuse - Mount SquashFS archives using Filesystem in USErspace (FUSE)https://github.com/vasi/squashfuse
starccm/18.06.007
starccm/19.04.007
STAR-CCM+: Simcenter STAR-CCM+ is a multiphysics computational fluid dynamics (CFD) simulation software that enables CFD engineers to model the complexity and explore the possibilities of products operating under real-world conditions.https://plm.sw.siemens.com/en-US/simcenter/fluids-thermal-simulation/star-ccm/
subversion/1.14.2
Apache Subversion - an open source version control system.https://subversion.apache.org/
tar/1.34
GNU Tar provides the ability to create tar archives, as well as various other kinds of manipulation.https://www.gnu.org/software/tar/
tcl/8.6.12
Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Open source and business-friendly, Tcl is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible.https://www.tcl.tk/
tcsh/6.24.00
Tcsh is an enhanced but completely compatible version of csh, the C shell. Tcsh is a command language interpreter which can be used both as an interactive login shell and as a shell script command processor. Tcsh includes a command line editor, programmable word completion, spelling correction, a history mechanism, job control and a C language like syntax.https://www.tcsh.org/
texinfo/7.1
Texinfo is the official documentation format of the GNU project.https://www.gnu.org/software/texinfo/
tk/8.6.11
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.https://www.tcl.tk
tkdiff/5.7
TkDiff is a graphical front end to the diff program. It provides a side-by-side view of the differences between two text files, along with several innovative features such as diff bookmarks, a graphical map of differences for quick navigation, and a facility for slicing diff regions to achieve exactly the merge output desired.https://tkdiff.sourceforge.io/
tmolex/2024.1
tmolex/2025
Turbomole package with TmoleX GUI (includes CLI tools).https://www.turbomole.org/
tree/2.1.0
Tree is a recursive directory listing command that produces a depth indented listing of files, which is colorized ala dircolors if the LS_COLORS environment variable is set and output is to tty. Tree has been ported and reported to work under the following operating systems: Linux, FreeBSD, OS X, Solaris, HP/UX, Cygwin, HP Nonstop and OS/2.http://mama.indstate.edu/users/ice/tree/
turbomole/7.8.1
TURBOMOLE: Program Package for ab initio Electronic Structure Calculations.https://www.turbomole.org/
ucsc/2019-12-12
UCSC Genome Browser and Blat application binaries built for standalone command-line use.
uv/0.7.15
An extremely fast Python package and project manager, written in Rust.
valgrind/3.23.0
An instrumentation framework for building dynamic analysis.https://valgrind.org/
vampirserver/10.4.1
Vampir and Score-P provide a performance tool framework with special focus on highly-parallel applications. Performance data is collected from multi-process (MPI, SHMEM), thread-parallel (OpenMP, Pthreads), as well as accelerator-based paradigms (CUDA, HIP, OpenCL, OpenACC).https://vampir.eu/
vim/9.1.0437
Vim is a highly configurable text editor built to enable efficient text editing. It is an improved version of the vi editor distributed with most UNIX systems. Vim is often called a ‘programmer’s editor,’ and so useful for programming that many consider it an entire IDE. It’s not just for programmers, though. Vim is perfect for all kinds of text editing, from composing email to editing configuration files.https://www.vim.org
vmd/1.9.3
VMD provides user-editable materials which can be applied to molecular geometry.https://www.ks.uiuc.edu/Research/vmd/
which/2.21
GNU which - is a utility that is used to find which executable (or alias or shell function) is executed when entered on the shell prompt.https://savannah.gnu.org/projects/which/
xz/5.4.6
XZ Utils is free general-purpose data compression software with high compression ratio. XZ Utils were written for POSIX-like systems, but also work on some not-so-POSIX systems. XZ Utils are the successor to LZMA Utils.https://tukaani.org/xz/
zlib-ng/2.2.1
zlib replacement with optimizations for next generation systems.https://github.com/zlib-ng/zlib-ng
zsh/5.8.1
Zsh is a shell designed for interactive use, although it is also a powerful scripting language. Many of the useful features of bash, ksh, and tcsh were incorporated into zsh; many original features were added.https://www.zsh.org
zstd/1.5.6
Zstandard, or zstd as short version, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios.https://facebook.github.io/zstd/

Grete GCC13 Modules

The appropriate compiler and MPI modules must be loaded before any of the modules below become visible (hierarchical module system):

module load gcc/13.2.0
module load openmpi/5.0.7
Tip

The software packages loaded on each phase are optimized for the particular CPU and GPU architecture (machine kind) of that phase (e.g. AMD Rome + A100 or Intel Sapphire Rapids + H100).

You can print the current machine kind by running the command: /sw/rev_profile/25.04/machine-kind

If you compile your own code (e.g. with gcc or pip), please take care to compile on the same machine kind that the code will later be executed on.
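A minimal sketch of compiling and running an MPI program with this toolchain (the source file hello_mpi.c and the process count are illustrative; on the cluster the program would normally be launched from within a batch job):

module load gcc/13.2.0
module load openmpi/5.0.7
# mpicc is the Open MPI C compiler wrapper; -march=native ties the binary to this machine kind
mpicc -O2 -march=native -o hello_mpi hello_mpi.c
mpirun -np 4 ./hello_mpi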

List of Modules

Module Names | Description | Homepage
abyss/2.3.10
abyss/2.3.5
ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size.https://www.bcgsc.ca/platform/bioinfo/software/abyss
amdblis/4.1
AMD Optimized BLIS.https://www.amd.com/en/developer/aocl/blis.html
amdfftw/4.1
FFTW (AMD Optimized version) is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases thereof.https://www.amd.com/en/developer/aocl/fftw.html
amdlibflame/4.1
libFLAME (AMD Optimized version) is a portable library for dense matrix computations, providing much of the functionality present in Linear Algebra Package (LAPACK). It includes a compatibility layer, FLAPACK, which includes complete LAPACK implementation.https://www.amd.com/en/developer/aocl/blis.html#libflame
amdscalapack/4.1
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It depends on external libraries including BLAS and LAPACK for Linear Algebra computations.https://www.amd.com/en/developer/aocl/scalapack.html
aria2/1.36.0
aria2/1.37.0
An ultra fast download utilityhttps://aria2.github.io
bcftools/1.16
bcftools/1.19
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.https://samtools.github.io/bcftools/
beast2/2.7.4
BEAST is a cross-platform program for Bayesian inference using MCMC of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology.http://beast2.org/
bedops/2.4.41
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.https://bedops.readthedocs.io
blast-plus/2.14.1
blast-plus/2.16.0
Basic Local Alignment Search Tool.https://blast.ncbi.nlm.nih.gov/
boost/1.83.0
boost/1.86.0
Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.https://www.boost.org
bowtie/1.3.1
Bowtie is an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers.https://sourceforge.net/projects/bowtie-bio/
bwa/0.7.17
Burrow-Wheeler Aligner for pairwise alignment between DNA sequences.https://github.com/lh3/bwa
cdo/2.2.2
cdo/2.4.4
CDO is a collection of command line operators to manipulate and analyse Climate and NWP model Data.https://code.mpimet.mpg.de/projects/cdo
charmpp/8.0.0-smp
Charm++ is a parallel programming framework in C++ supported by an adaptive runtime system, which enhances user productivity and allows programs to run portably from small multicore computers (your laptop) to the largest supercomputers.https://charmplusplus.org
chimera/1.18
UCSF CHIMERA: an Extensible Molecular Modeling Systemhttps://www.cgl.ucsf.edu/chimera/
chimerax/1.9
UCSF ChimeraX (or simply ChimeraX) is the next-generation molecular visualization program from the Resource for Biocomputing, Visualization, and Informatics (RBVI), following UCSF Chimera.https://www.cgl.ucsf.edu/chimerax/
clblast/1.5.2
CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices.https://cnugteren.github.io/clblast/clblast.html
clinfo/3.0.23.01.25
Print all known information about all available OpenCL platforms and devices in the system.https://github.com/Oblomov/clinfo
clpeak/1.1.2
Simple OpenCL performance benchmark tool.https://github.com/krrishnarraj/clpeak
cp2k/2024.1
cp2k/2025.1
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systemshttps://www.cp2k.org
cpmd/4.3
The CPMD code is a parallelized plane wave / pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics.https://www.cpmd.org/wordpress/
crest/2.12
Conformer-Rotamer Ensemble Sampling Toolhttps://github.com/crest-lab/crest
cuda/11.8.0
cuda/12.6.2
CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).https://developer.nvidia.com/cuda-zone
cudnn/8.9.7.29-11
cudnn/8.9.7.29-12
NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networkshttps://developer.nvidia.com/cudnn
diamond/2.1.10
diamond/2.1.7
DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.https://ab.inf.uni-tuebingen.de/software/diamond
dynamo/1.1.552
Dynamo is a software environment for subtomogram averaging of cryo-EM data.https://www.dynamo-em.org/w/index.php?title=Main_Page
eccodes/2.34.0
ecCodes is a package developed by ECMWF for processing meteorological data in GRIB (1/2), BUFR (3/4) and GTS header formats.https://software.ecmwf.int/wiki/display/ECC/ecCodes+Home
exciting/oxygen
exciting is a full-potential all-electron density-functional-theory package implementing the families of linearized augmented planewave methods. It can be applied to all kinds of materials, irrespective of the atomic species involved, and also allows for exploring the physics of core electrons. A particular focus are excited states within many-body perturbation theory.https://exciting-code.org/
ffmpeg/7.0.2
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.https://ffmpeg.org
fftw/3.3.10
fftw/3.3.10-quad-precision
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.https://www.fftw.org
fiji/2.16.0
Fiji is a ‘batteries-included’ distribution of ImageJ, a popular, free scientific image processing application which includes a lot of plugins organized into a coherent menu structure. Fiji compares to ImageJ as Ubuntu compares to Linux.https://imagej.net/
fleur/5.1
fleur/7.2
FLEUR (Full-potential Linearised augmented plane wave in EURope) is a code family for calculating groundstate as well as excited-state properties of solids within the context of density functional theory (DFT).https://www.flapw.de/MaX-5.1
flex/2.6.1
Flex is a tool for generating scanners.https://github.com/westes/flex
foam-extend/5.0
The Extend Project is a fork of the OpenFOAM open-source library for Computational Fluid Dynamics (CFD). This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://sourceforge.net/projects/foam-extend/
fribidi/1.0.12
GNU FriBidi: The Free Implementation of the Unicode Bidirectional Algorithm.https://github.com/fribidi/fribidi
gatk/3.8.1
gatk/4.4.0.0
gatk/4.5.0.0
Genome Analysis Toolkit Variant Discovery in High-Throughput Sequencing Datahttps://gatk.broadinstitute.org/hc/en-us
gdal/3.10.0
gdal/3.7.3
GDAL: Geospatial Data Abstraction Library.https://www.gdal.org/
gdb/15.2
GDB, the GNU Project debugger, allows you to see what is going on ‘inside’ another program while it executes – or what another program was doing at the moment it crashed.https://www.gnu.org/software/gdb
globalarrays/5.8.2
Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model.https://hpc.pnl.gov/globalarrays/
gmp/6.2.1
gmp/6.3.0
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.https://gmplib.org
gobject-introspection/1.78.1
The GObject Introspection is used to describe the program APIs and collect them in a uniform, machine readable format.https://wiki.gnome.org/Projects/GObjectIntrospection
grads/2.2.3
The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data. GrADS has two data models for handling gridded and station data. GrADS supports many data file formats, including binary (stream or sequential), GRIB (version 1 and 2), NetCDF, HDF (version 4 and 5), and BUFR (for station data).http://cola.gmu.edu/grads/grads.php
gromacs/2022.5-plumed
gromacs/2023-plumed
gromacs/2023.3
gromacs/2024.3
GROMACS is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.https://www.gromacs.org
gsl/2.7.1
gsl/2.8
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.https://www.gnu.org/software/gsl
harfbuzz/10.0.1
The Harfbuzz package contains an OpenType text shaping engine.https://github.com/harfbuzz/harfbuzz
hdf5/1.14.5
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high volume and complex data.https://support.hdfgroup.org
hisat2/2.2.1
HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (whole-genome, transcriptome, and exome sequencing data) against the general human population (as well as against a single reference genome).https://daehwankimlab.github.io/hisat2/
imagej/1.54p
ImageJ is public domain software for processing and analyzing scientific images. It is written in Java, which allows it to run on many different platforms. For further information, see: The ImageJ website, the primary home of this project. The ImageJ wiki, a community-built knowledge base covering ImageJ and its derivatives and flavors, including ImageJ2, Fiji, and others. The ImageJ mailing list and Image.sc Forum for community support. The Contributing page of the ImageJ wiki for details on how to contribute.https://www.imagej.net
imod/5.0.2
imod/5.1.0
IMOD is a set of image processing, modeling and display programs used for tomographic reconstruction and for 3D reconstruction of EM serial sections and optical sections. The package contains tools for assembling and aligning data within multiple types and sizes of image stacks, viewing 3-D data from any orientation, and modeling and display of the image files. IMOD was developed primarily by David Mastronarde, Rick Gaudette, Sue Held, Jim Kremer, Quanren Xiong, Suraj Khochare, and John Heumann at the University of Colorado.https://bio3d.colorado.edu/imod/
intel-oneapi-advisor/2025.0.0
Intel Advisor is a design and analysis tool for developing performant code. The tool supports C, C++, Fortran, SYCL, OpenMP, OpenCL code, and Python. It helps with the following: Performant CPU Code: Design your application for efficient threading, vectorization, and memory use. Efficient GPU Offload: Identify parts of the code that can be profitably offloaded. Optimize the code for compute and memory.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/advisor.html
intel-oneapi-dal/2025.0.0
Intel oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The library optimizes data ingestion along with algorithmic computation to increase throughput and scalability.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html
intel-oneapi-dnn/2025.0.0
The Intel oneAPI Deep Neural Network Library (oneDNN) helps developers improve productivity and enhance the performance of their deep learning frameworks. It supports key data type formats, including 16 and 32-bit floating point, bfloat16, and 8-bit integers and implements rich operators, including convolution, matrix multiplication, pooling, batch normalization, activation functions, recurrent neural network (RNN) cells, and long short-term memory (LSTM) cells.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onednn.html
intel-oneapi-inspector/2024.1.0
Intel Inspector is a dynamic memory and threading error debugger for C, C++, and Fortran applications that run on Windows and Linux operating systems. Save money: locate the root cause of memory, threading, and persistence errors before you release. Save time: simplify the diagnosis of difficult errors by breaking into the debugger just before the error occurs. Save effort: use your normal debug or production build to catch and debug errors. Check all code, including third-party libraries with unavailable sources.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/inspector.html
intel-oneapi-mkl/2023.2.0
intel-oneapi-mkl/2024.2.2
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL), is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html
intel-oneapi-tbb/2022.0.0
Intel oneAPI Threading Building Blocks (oneTBB) is a flexible performance library that simplifies the work of adding parallelism to complex applications across accelerated architectures, even if you are not a threading expert.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html
intel-oneapi-vtune/2025.0.0
Intel VTune Profiler is a profiler to optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more. CPU, GPU, and FPGA: Tune the entire application’s performance–not just the accelerated portion. Multilingual: Profile SYCL, C, C++, C#, Fortran, OpenCL code, Python, Google Go programming language, Java, .NET, Assembly, or any combination of languages. System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code. Power: Optimize performance while avoiding power and thermal-related throttling.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html
iq-tree/2.2.2.7
iq-tree/2.3.2
IQ-TREE Efficient software for phylogenomic inferencehttp://www.iqtree.org
jags/4.3.0
jags/4.3.2
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGShttps://mcmc-jags.sourceforge.net/
kraken2/2.1.2
Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.https://ccb.jhu.edu/software/kraken2/
libaec/1.0.6
Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). It implements Golomb-Rice compression method under the BSD license and includes a free drop-in replacement for the SZIP library.https://gitlab.dkrz.de/k202009/libaec
libfabric/2.0.0
The Open Fabrics Interfaces (OFI) is a framework focused on exporting fabric communication services to applications.https://libfabric.org/
libffi/3.4.6
The libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run time.https://sourceware.org/libffi/
libpng/1.6.39
libpng is the official PNG reference library.http://www.libpng.org/pub/png/libpng.html
libxml2/2.13.4
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
libxsmm/1.17
Library for specialized dense and sparse matrix operations, and deep learning primitives.https://github.com/hfp/libxsmm
metis/5.1.0
METIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices.https://github.com/KarypisLab/METIS
molden/6.7
molden/7.3
A package for displaying Molecular Density from various Ab Initio packageshttps://www.theochem.ru.nl/molden/
mono/6.12.0.122
Mono is a software platform designed to allow developers to easily create cross platform applications. It is an open source implementation of Microsoft’s .NET Framework based on the ECMA standards for C# and the Common Language Runtime.https://www.mono-project.com/
mpc/1.3.1
Gnu Mpc is a C library for the arithmetic of complex numbers with arbitrarily high precision and correct rounding of the result.https://www.multiprecision.org
mpifileutils/0.11.1
mpiFileUtils is a suite of MPI-based tools to manage large datasets, which may vary from large directory trees to large files. High-performance computing users often generate large datasets with parallel applications that run with many processes (millions in some cases). However those users are then stuck with single-process tools like cp and rm to manage their datasets. This suite provides MPI-based tools to handle typical jobs like copy, remove, and compare for such datasets, providing speedups of up to 20-30x.https://github.com/hpc/mpifileutils
mumps/5.7.3
MUMPS: a MUltifrontal Massively Parallel sparse direct Solverhttps://mumps-solver.org
muscle5/5.1.0
MUSCLE is widely-used software for making multiple alignments of biological sequences.https://drive5.com/muscle5/
must/1.9.0
MUST detects usage errors of the Message Passing Interface (MPI) and reports them to the user. As MPI calls are complex and usage errors common, this functionality is extremely helpful for application developers that want to develop correct MPI applications. This includes errors that already manifest: segmentation faults or incorrect results as well as many errors that are not visible to the application developer or do not manifest on a certain system or MPI implementation.https://www.i12.rwth-aachen.de/go/id/nrbe
namd/3.0.1-smp
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.https://www.ks.uiuc.edu/Research/namd/
nco/5.1.6
nco/5.2.4
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formatshttps://nco.sourceforge.net/
ncview/2.1.9
Simple viewer for NetCDF files.https://cirrus.ucsd.edu/ncview/
netcdf-c/4.9.2
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.https://www.unidata.ucar.edu/software/netcdf
netcdf-fortran/4.6.1
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.https://www.unidata.ucar.edu/software/netcdf
netgen/5.3.1
NETGEN is an automatic 3d tetrahedral mesh generator. It accepts input from constructive solid geometry (CSG) or boundary representation (BRep) from STL file format. The connection to a geometry kernel allows the handling of IGES and STEP files. NETGEN contains modules for mesh optimization and hierarchical mesh refinement.https://ngsolve.org/
netlib-lapack/3.11.0
LAPACK version 3.X is a comprehensive FORTRAN library that does linear algebra operations including matrix inversions, least squared solutions to linear sets of equations, eigenvector analysis, singular value decomposition, etc. It is a very comprehensive and reputable package that has found extensive use in the scientific community.https://www.netlib.org/lapack/
netlib-scalapack/2.2.0
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machineshttps://www.netlib.org/scalapack/
nextflow/23.10.0
nextflow/24.10.0
Data-driven computational pipelines.https://www.nextflow.io
nvtop/3.0.1
Nvtop stands for Neat Videocard TOP, a (h)top like task monitor for AMD and NVIDIA GPUS. It can handle multiple GPUs and print information about them in a htop familiar wayhttps://github.com/Syllo/nvtop
ocl-icd/2.3.2
This package aims at creating an Open Source alternative to vendor specific OpenCL ICD loaders.https://github.com/OCL-dev/ocl-icd
octave/9.1.0
GNU Octave is a high-level language, primarily intended for numerical computations.https://www.gnu.org/software/octave/
openbabel/3.1.1
Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.https://openbabel.org/docs/index.html
openblas/0.3.24
openblas/0.3.28
OpenBLAS: An optimized BLAS libraryhttps://www.openblas.net
opencl-c-headers/2024.05.08
OpenCL (Open Computing Language) C header fileshttps://www.khronos.org/registry/OpenCL/
opencl-clhpp/2.0.16
C++ headers for OpenCL developmenthttps://www.khronos.org/registry/OpenCL/
opencl-headers/3.0
Bundled OpenCL (Open Computing Language) header fileshttps://www.khronos.org/registry/OpenCL/
opencoarrays/2.10.1
opencoarrays/2.10.2
OpenCoarrays is an open-source software project that produces an application binary interface (ABI) supporting coarray Fortran (CAF) compilers, an application programming interface (API) that supports users of non-CAF compilers, and an associated compiler wrapper and program launcher.http://www.opencoarrays.org/
openfast/3.5.5
Wind turbine simulation package from NRELhttps://openfast.readthedocs.io/
openfoam/2306
openfoam/2312
OpenFOAM is a GPL-open-source C++ CFD-toolbox. This offering is supported by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark. OpenCFD Ltd has been developing and releasing OpenFOAM since its debut in 2004.https://www.openfoam.com/
openfoam-org/10
openfoam-org/11
openfoam-org/6
openfoam-org/7
openfoam-org/8
openfoam-org/9
OpenFOAM is a GPL-open-source C++ CFD-toolbox. The openfoam.org release is managed by the OpenFOAM Foundation Ltd as a licensee of the OPENFOAM trademark. This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://www.openfoam.org/
openmpi/5.0.7
An open source Message Passing Interface implementation.https://www.open-mpi.org
osu-micro-benchmarks/7.3
osu-micro-benchmarks/7.5
The Ohio MicroBenchmark suite is a collection of independent MPI message passing performance microbenchmarks developed and written at The Ohio State University. It includes traditional benchmarks and performance measures such as latency, bandwidth and host overhead and can be used for both traditional and GPU-enhanced nodes.https://mvapich.cse.ohio-state.edu/benchmarks/
paraview/5.13.2
paraview/5.13.2-gui
ParaView is an open-source, multi-platform data analysis and visualization application. This package includes the Catalyst in-situ library for versions 5.7 and greater, otherwise use the catalyst package.https://www.paraview.org
parmetis/4.0.3
ParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices.https://github.com/KarypisLab/ParMETIS
pbmpi/1.9
A Bayesian software for phylogenetic reconstruction using mixture modelshttps://github.com/bayesiancook/pbmpi
petsc/3.22.1-complex
petsc/3.22.1-real
PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.https://petsc.org
pigz/2.8
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
plink/1.9-beta7.7
PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.https://www.cog-genomics.org/plink/1.9/
pocl/5.0
Portable Computing Language (pocl) is an open source implementation of the OpenCL standard which can be easily adapted for new targets and devices, both for homogeneous CPU and heterogeneous GPUs/accelerators.https://portablecl.org
proj/9.2.1
proj/9.4.1
PROJ is a generic coordinate transformation software, that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.https://proj.org/
psi4/1.8.2
psi4/1.9.1
Psi4 is an open-source suite of ab initio quantum chemistry programs designed for efficient, high-accuracy simulations of a variety of molecular properties.https://www.psicode.org/
py-nvitop/1.4.0
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.https://nvitop.readthedocs.io/
py-uv/0.4.27
An extremely fast Python package and project manager, written in Rust.https://github.com/astral-sh/uv
python/3.11.9
The Python programming language.https://www.python.org/
qt/5.15.15
Qt is a comprehensive cross-platform C++ application framework.https://qt.io
quantum-espresso/6.7
quantum-espresso/7.2
quantum-espresso/7.3.1
quantum-espresso/7.4
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.https://quantum-espresso.org
r/4.4.0
r/4.4.1
R is ‘GNU S’, a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.https://www.r-project.org
raxml-ng/1.1.0
raxml-ng/1.2.2
RAxML-NG is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion.https://github.com/amkozlov/raxml-ng/wiki
relion/4.0.1
relion/5.0.0
RELION (for REgularised LIkelihood OptimisatioN, pronounce rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).https://www2.mrc-lmb.cam.ac.uk/relion
repeatmasker/4.1.5
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.https://www.repeatmasker.org
repeatmodeler/2.0.4
RepeatModeler is a de-novo repeat family identification and modeling package.https://github.com/Dfam-consortium/RepeatModeler
revbayes/1.2.2
Bayesian phylogenetic inference using probabilistic graphical models and an interpreted language.https://revbayes.github.io
rsync/3.4.1
An open source utility that provides fast incremental file transfer.https://rsync.samba.org
samtools/1.17
samtools/1.19.2
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position formathttps://www.htslib.org
scala/2.13.1
scala/2.13.14
Scala is a general-purpose programming language providing support for functional programming and a strong static type system. Designed to be concise, many of Scala’s design decisions were designed to build from criticisms of Java.https://www.scala-lang.org/
scalasca/2.6.1
Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks - in particular those concerning communication and synchronization - and offers guidance in exploring their causes.https://www.scalasca.org
scorep/8.3
scorep/8.4
The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event tracing, and online analysis of HPC applications.https://www.vi-hps.org/projects/score-p
scotch/7.0.4
Scotch is a software package for graph and mesh/hypergraph partitioning, graph clustering, and sparse matrix ordering.https://gitlab.inria.fr/scotch/scotch
siesta/4.0.2
siesta/5.0.1
SIESTA performs electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids.https://departments.icmab.es/leem/siesta/
slepc/3.22.1
Scalable Library for Eigenvalue Problem Computations.https://slepc.upv.es
snakemake/7.22.0
snakemake/8.18.2
Workflow management system to create reproducible and scalable data analyses.https://snakemake.readthedocs.io/en
subread/2.0.6
The Subread software package is a tool kit for processing next-gen sequencing data.https://subread.sourceforge.net/
tk/8.6.11
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.https://www.tcl.tk
transabyss/2.0.1
de novo assembly of RNA-Seq data using ABySS
ucx/1.18.0
a communication library implementing high-performance messaging for MPI/PGAS frameworkshttps://www.openucx.org
udunits/2.2.28
Automated units conversionhttps://www.unidata.ucar.edu/software/udunits
valgrind/3.20.0
valgrind/3.23.0
An instrumentation framework for building dynamic analysis.https://valgrind.org/
vcftools/0.1.16
VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.https://vcftools.github.io/
xtb/6.6.0
Semiempirical extended tight binding program packagehttps://xtb-docs.readthedocs.org
yambo/5.3.0
yambo/5.3.0-dp
Yambo is a FORTRAN/C code for Many-Body calculations in solid state and molecular physics.https://www.yambo-code.org/

NHR Modules (nhr-lmod)

Warning

This software revision is no longer updated but can still be loaded and used by running the following commands before any module commands:

export PREFERRED_SOFTWARE_STACK=nhr-lmod
source /etc/profile

This used to be the default software stack on the NHR part of the HPC system until May 2025. This stack uses Lmod as its module system. For the purposes of setting the desired software stack (see Software Stacks), its short name is nhr-lmod. You can learn more about how to use the module system at Module Basics.

To see the available software, run

module avail

The modules for this stack are built for several combinations of CPU architecture and connection fabric to support the various kinds of nodes in the cluster. The right module for the node is automatically selected during module load.

Getting Started with nhr-lmod

On NHR this software stack is enabled by default. Just log in to glogin-p2.hpc.gwdg.de, glogin-p3.hpc.gwdg.de, or glogin-gpu.hpc.gwdg.de and use the module avail, module spider, and module load commands.
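
For example, to find out which GROMACS builds the stack provides and how to load one of them, a minimal sketch (using the version from the batch scripts below) would be:

module avail gromacs          # gromacs modules visible with the currently loaded compiler/MPI
module spider gromacs         # all gromacs versions in the stack
module spider gromacs/2023.3  # shows which modules must be loaded first for this version
module load gromacs/2023.3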

Below we have provided some example scripts that load the gromacs module and run a simple test case. You can copy the example script and adjust it to the modules you would like to use.

KISSKI and REACT users can take the Grete example and use --partition kisski or --partition react instead.
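
The partition can be changed either in the #SBATCH --partition line of the script or at submission time, since sbatch command-line options override the directives inside the script. A short sketch (the script name is only a placeholder):

sbatch --partition kisski my_grete_gromacs_job.sh
sbatch --partition react my_grete_gromacs_job.sh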

Tutorial: Gromacs with nhr-lmod

The appropriate login nodes for this phase can be reached via glogin-p2.hpc.gwdg.de.

#!/bin/bash
#SBATCH --job-name="NHR-Emmy-P2-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 96
#SBATCH --partition standard96
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gromacs/2023.3

export OMP_NUM_THREADS=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP.tpr \
	-nsteps 1000 -dlb yes -v 
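
A short sketch of submitting and monitoring this job (the script name is a placeholder; the output file name follows the --output pattern above, with the job name and job ID filled in):

sbatch emmy_p2_gromacs.sh                        # submit the job, prints the job ID
squeue -u $USER                                  # check the state of your jobs
tail -f slurm-NHR-Emmy-P2-gromacs-<jobid>.out    # follow the output once the job is running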

The appropriate login nodes for this phase can be reached via glogin-p3.hpc.gwdg.de.

#!/bin/bash
#SBATCH --job-name="NHR-Emmy-P3-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 96
#SBATCH --partition medium96s
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gromacs/2023.3

export OMP_NUM_THREADS=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP.tpr \
	-nsteps 1000 -dlb yes -v 

The appropriate login nodes for this phase can be reached via glogin-gpu.hpc.gwdg.de.

#!/bin/bash
#SBATCH --job-name="NHR-Grete-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 8
#SBATCH --gpus A100:4
#SBATCH --partition grete
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gromacs/2023.3-cuda

# OpenMP Threads * MPI Ranks = CPU Cores
export OMP_NUM_THREADS=8
export GMX_ENABLE_DIRECT_GPU_COMM=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP-h.tpr \
	-nsteps 1000 -v -pme gpu -update gpu -bonded gpu -npme 1

The appropriate login nodes for this phase can be reached via glogin-gpu.hpc.gwdg.de.

Note

The microarchitecture of the login node (AMD Rome) does not match the microarchitecture of the compute nodes (Intel Sapphire Rapids). In this case you should not compile your code on the login node, but instead use an interactive Slurm job on the grete-h100 or grete-h100:shared partitions.
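
A possible way to get such an interactive session for compiling, as a sketch only (adjust GPUs, CPUs, and time limit to your needs):

# Interactive shell on a Sapphire Rapids + H100 compute node
srun --partition grete-h100:shared --gpus H100:1 --cpus-per-task 8 --time 1:00:00 --pty bash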

#!/bin/bash
#SBATCH --job-name="NHR-Grete-H100-gromacs"
#SBATCH --output "slurm-%x-%j.out"
#SBATCH --error "slurm-%x-%j.err"
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 8
#SBATCH --gpus H100:4
#SBATCH --partition grete-h100
#SBATCH --time 60:00

echo "================================ BATCH SCRIPT ================================" >&2
cat ${BASH_SOURCE[0]} >&2
echo "==============================================================================" >&2

module load gromacs/2023.3-cuda

# OpenMP Threads * MPI Ranks = CPU Cores
export OMP_NUM_THREADS=12
export GMX_ENABLE_DIRECT_GPU_COMM=1

source $(which GMXRC)
mpirun gmx_mpi mdrun -s /sw/chem/gromacs/mpinat-benchmarks/benchPEP-h.tpr \
	-nsteps 1000 -v -pme gpu -update gpu -bonded gpu -npme 1

Hierarchical Module System

The module system has a Core - Compiler - MPI hierarchy. If you want to compile your own software, please load the appropriate compiler first and then the appropriate MPI module. This will make the modules that were compiled using this combination visible: if you run module avail you can see the additional modules at the top above the Core modules.

The modules built with GCC 11.4.0 and Open MPI are visible by default in the current software revision. We will revisit this decision in 2025 and will probably reduce the number of Core modules significantly and require a module load openmpi before any modules using Open MPI can be loaded.

Supported Compiler - MPI Combinations for Release 24.05

Supported Combinations

This is the “Core” configuration and has the largest selection of software modules available.

module load gcc/11.4.0
module load openmpi/4.1.6
module avail

CUDA 11 is not fully compatible with GCC 13, so this compiler is not available on Grete.

module load gcc/13.2.0
module load openmpi/4.1.6
module avail

Do not use the generic compiler wrappers mpicc, mpicxx, mpifc, mpigcc, mpigxx, mpif77, and mpif90! The Intel MPI compiler wrappers are mpiicx, mpiicpx, and mpiifx for C, C++, and Fortran, respectively. The classic wrappers mpiicc, mpiicpc, and mpiifort are deprecated and will be removed in the 2024/2025 versions. It might be useful to set export SLURM_CPU_BIND=none when using Intel MPI.

module load intel-oneapi-compilers/2023.2.1
module load intel-oneapi-mpi/2021.10.0
module avail
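
As a hedged illustration of the wrappers mentioned above (hello.c stands in for your own MPI source file; the resulting binary must be launched inside a Slurm allocation):

module load intel-oneapi-compilers/2023.2.1
module load intel-oneapi-mpi/2021.10.0
mpiicx -O2 -o hello hello.c        # C; use mpiicpx for C++ or mpiifx for Fortran
export SLURM_CPU_BIND=none         # may be needed when launching with Intel MPI
srun ./hello                       # run inside a batch job or interactive allocation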

Open MPI will wrap around the modern Intel compilers icx (C), icpx (C++), and ifx (Fortran).

module load intel-oneapi-compilers/2023.2.1
module load openmpi/4.1.6
module avail

If a module is not available for your particular compiler and MPI combination, or you need a different compiler installed, please contact HPC support.

Adding Your Own Modules

See Using Your Own Module Files.

Spack

Spack is provided as the spack module to help build your own software.
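
A minimal sketch, assuming you only want to inspect how Spack would build a package before installing anything yourself (the package name is just an example taken from the module list above):

module load spack
spack info zlib-ng      # show available versions and build variants
spack spec zlib-ng      # show the concretized build specification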

Migrating from HLRN Modules (hlrn-tmod)

For those migrating from HLRN Modules (hlrn-tmod), a few important modules were replaced with slightly different ones (in addition to version changes), and several others were renamed, as listed in the table below:

| HLRN Modules Name | NHR Modules Name | Description |
|---|---|---|
| intel | intel-oneapi-compilers | Intel C, C++, and Fortran compilers (classic to OneAPI) |
| impi | intel-oneapi-mpi | Intel MPI (classic to OneAPI) |
| anaconda3 | miniconda3 | Conda |
| singularity | apptainer | Portable, reproducible Containers |
| blas/lapack | netlib-lapack, intel-oneapi-mkl, amdblis, amdlibflame, openblas | Linear Algebra |
| scalapack | netlib-scalapack, intel-oneapi-mkl, amdscalapack | |
| nvhpc-hpcx | nvhpc | Nvidia HPC SDK |
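
As an illustration, the module lines of an old hlrn-tmod job script would translate roughly as follows (a sketch based on the table above; pick the linear-algebra replacement that matches your build):

# hlrn-tmod: module load intel impi
module load intel-oneapi-compilers intel-oneapi-mpi
# hlrn-tmod: module load anaconda3
module load miniconda3
# hlrn-tmod: module load singularity
module load apptainer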

Note that some packages are compiled both with and without CUDA, and some CUDA-enabled packages are built against multiple CUDA versions. In this case, the module name and version have the form MODULE/VERSION_cuda or MODULE/VERSION_cuda-CUDAMAJORVERSION, where CUDAMAJORVERSION is the CUDA major version (e.g. 11 or 12).

Subsections of NHR Modules (nhr-lmod)

Subsections of Software Revision 24.05

Emmy Core Modules (24.05)

Warning

This software revision is no longer updated but can still be loaded and used by running the following commands before any module commands:

export PREFERRED_SOFTWARE_STACK=nhr-lmod
source /etc/profile

We recommend loading the appropriate compiler and MPI modules first:

module load gcc/11.4.0
module load openmpi/4.1.6

The software packages loaded on each phase are optimized for the particular CPU architecture of that phase (e.g. Intel Cascade Lake or Intel Sapphire Rapids).

List of Modules

Module Names | Description | Homepage
abyss/2.3.5
ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size.https://www.bcgsc.ca/platform/bioinfo/software/abyss
ansys/2023.1
ansys/2023.2
ansys/2024.1
Ansys offers a comprehensive software suite that spans the entire range of physics, providing access to virtually any field of engineering simulation that a design process requires.https://www.ansys.com/
aocc/4.1.0
The AOCC compiler system is a high performance, production quality code generation tool. The AOCC environment provides various options to developers when building and optimizing C, C++, and Fortran applications targeting 32-bit and 64-bit Linux platforms. The AOCC compiler system offers a high level of advanced optimizations, multi-threading and processor support that includes global optimization, vectorization, inter-procedural analyses, loop transformations, and code generation. AMD also provides highly optimized libraries, which extract the optimal performance from each x86 processor core when utilized. The AOCC Compiler Suite simplifies and accelerates development and tuning for x86 applications.https://www.amd.com/en/developer/aocc.html
apptainer/1.1.9
apptainer/1.2.5
Apptainer is an open source container platform designed to be simple, fast, and secure. Many container platforms are available, but Apptainer is designed for ease-of-use on shared systems and in high performance computing (HPC) environments.https://apptainer.org
aria2/1.36.0
An ultra fast download utilityhttps://aria2.github.io
autoconf/2.69
Autoconf – system configuration part of autotoolshttps://www.gnu.org/software/autoconf/
autoconf-archive/2023.02.20
The GNU Autoconf Archive is a collection of more than 500 macros for GNU Autoconf.https://www.gnu.org/software/autoconf-archive/
bedops/2.4.41
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.https://bedops.readthedocs.io
binutils/2.41-gas
GNU binutils, which contain the linker, assembler, objdump and othershttps://www.gnu.org/software/binutils/
bison/3.8.2
Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser employing LALR(1) parser tables.https://www.gnu.org/software/bison/
blast-plus/2.14.1
Basic Local Alignment Search Tool.https://blast.ncbi.nlm.nih.gov/
boost/1.83.0
Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.https://www.boost.org
bowtie/1.3.1
Bowtie is an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers.https://sourceforge.net/projects/bowtie-bio/
cdo/2.2.2
CDO is a collection of command line operators to manipulate and analyse Climate and NWP model Data.https://code.mpimet.mpg.de/projects/cdo
charmpp/6.10.2
charmpp/6.10.2-smp
charmpp/7.0.0
charmpp/7.0.0-smp
Charm++ is a parallel programming framework in C++ supported by an adaptive runtime system, which enhances user productivity and allows programs to run portably from small multicore computers (your laptop) to the largest supercomputers.https://charmplusplus.org
clblast/1.5.2
CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices.https://cnugteren.github.io/clblast/clblast.html
clinfo/3.0.21.02.21
Print all known information about all available OpenCL platforms and devices in the system.https://github.com/Oblomov/clinfo
clpeak/1.1.2
Simple OpenCL performance benchmark tool.https://github.com/krrishnarraj/clpeak
cmake/3.27.7
A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.https://www.cmake.org
comsol/6.1
comsol/6.2
COMSOL Multiphysics is a finite element analyzer, solver, and simulation software package for various physics and engineering applications, especially coupled phenomena and multiphysics.https://www.comsol.com/
cp2k/2023.2
cp2k/2024.1
cp2k/2025.1
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systemshttps://www.cp2k.org
cpmd/4.3
The CPMD code is a parallelized plane wave / pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics.https://www.cpmd.org/wordpress/
crest/2.12
Conformer-Rotamer Ensemble Sampling Toolhttps://github.com/crest-lab/crest
ddd/3.3.12
A graphical front-end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, the bash debugger bashdb, the GNU Make debugger remake, or the Python debugger pydb.https://www.gnu.org/software/ddd
diamond/2.1.7
DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.https://ab.inf.uni-tuebingen.de/software/diamond
eccodes/2.34.0
ecCodes is a package developed by ECMWF for processing meteorological data in GRIB (1/2), BUFR (3/4) and GTS header formats.https://software.ecmwf.int/wiki/display/ECC/ecCodes+Home
exciting/oxygen
exciting is a full-potential all-electron density-functional-theory package implementing the families of linearized augmented planewave methods. It can be applied to all kinds of materials, irrespective of the atomic species involved, and also allows for exploring the physics of core electrons. A particular focus are excited states within many-body perturbation theory.https://exciting-code.org/
ffmpeg/6.0
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.https://ffmpeg.org
fftw/3.3.10
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.https://www.fftw.org
fish/3.6.1
fish is a smart and user-friendly command line shell for OS X, Linux, and the rest of the family.https://fishshell.com/
fleur/5.1
FLEUR (Full-potential Linearised augmented plane wave in EURope) is a code family for calculating groundstate as well as excited-state properties of solids within the context of density functional theory (DFT).https://www.flapw.de/MaX-5.1
foam-extend/5.0-source
The Extend Project is a fork of the OpenFOAM open-source library for Computational Fluid Dynamics (CFD). This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://sourceforge.net/projects/foam-extend/
gatk/3.8.1
gatk/4.4.0.0
Genome Analysis Toolkit Variant Discovery in High-Throughput Sequencing Datahttps://gatk.broadinstitute.org/hc/en-us
gaussian/16-C.02
Gaussian is a computer program for computational chemistryhttps://gaussian.com/
gcc/11.4.0
gcc/12.3.0
gcc/13.2.0
gcc/14.2.0
gcc/9.5.0
The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Ada, and Go, as well as libraries for these languages.https://gcc.gnu.org
gdal/3.7.3
GDAL: Geospatial Data Abstraction Library.https://www.gdal.org/
gdb/8.1
GDB, the GNU Project debugger, allows you to see what is going on ‘inside’ another program while it executes – or what another program was doing at the moment it crashed.https://www.gnu.org/software/gdb
getmutils/1.0
Utilities collection for GETM (https://getm.eu/)
git/2.42.0
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.https://git-scm.com
git-lfs/3.3.0
Git LFS is a system for managing and versioning large files in association with a Git repository. Instead of storing the large files within the Git repository as blobs, Git LFS stores special ‘pointer files’ in the repository, while storing the actual file contents on a Git LFS server.https://git-lfs.github.com
globalarrays/5.8.2
Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model.https://hpc.pnl.gov/globalarrays/
gmake/4.4.1
GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program’s source files.https://www.gnu.org/software/make/
gmp/6.2.1
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.https://gmplib.org
gnuplot/5.4.3
Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don’t have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986http://www.gnuplot.info
go/1.21.3
The golang compiler and build environmenthttps://go.dev
grads/2.2.3
The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data. GrADS has two data models for handling gridded and station data. GrADS supports many data file formats, including binary (stream or sequential), GRIB (version 1 and 2), NetCDF, HDF (version 4 and 5), and BUFR (for station data).http://cola.gmu.edu/grads/grads.php
gromacs/2019.6
gromacs/2019.6-plumed
gromacs/2022.5-plumed
gromacs/2023-plumed
gromacs/2023.3
GROMACS is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.https://www.gromacs.org
gsl/2.7.1
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.https://www.gnu.org/software/gsl
hdf5/1.12.2
hdf5/1.14.3
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high volume and complex data.https://portal.hdfgroup.org
hpl/2.3
HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.https://www.netlib.org/benchmark/hpl/
igv/2.12.3
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.https://software.broadinstitute.org/software/igv/home
imagemagick/7.1.1-11
ImageMagick is a software suite to create, edit, compose, or convert bitmap images.https://www.imagemagick.org
intel-oneapi-advisor/2023.2.0
Intel Advisor is a design and analysis tool for developing performant code. The tool supports C, C++, Fortran, SYCL, OpenMP, OpenCL code, and Python. It helps with the following: Performant CPU Code: Design your application for efficient threading, vectorization, and memory use. Efficient GPU Offload: Identify parts of the code that can be profitably offloaded. Optimize the code for compute and memory.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/advisor.html
intel-oneapi-compilers/2023.2.1
Intel oneAPI Compilers. Includes: icx, icpx, ifx, and ifort. Releases before 2024.0 include icc/icpchttps://software.intel.com/content/www/us/en/develop/tools/oneapi.html
intel-oneapi-dal/2023.2.0
Intel oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The library optimizes data ingestion along with algorithmic computation to increase throughput and scalability.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html
intel-oneapi-dnn/2023.2.0
The Intel oneAPI Deep Neural Network Library (oneDNN) helps developers improve productivity and enhance the performance of their deep learning frameworks. It supports key data type formats, including 16 and 32-bit floating point, bfloat16, and 8-bit integers and implements rich operators, including convolution, matrix multiplication, pooling, batch normalization, activation functions, recurrent neural network (RNN) cells, and long short-term memory (LSTM) cells.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onednn.html
intel-oneapi-inspector/2023.2.0
Intel Inspector is a dynamic memory and threading error debugger for C, C++, and Fortran applications that run on Windows and Linux operating systems. Save money: locate the root cause of memory, threading, and persistence errors before you release. Save time: simplify the diagnosis of difficult errors by breaking into the debugger just before the error occurs. Save effort: use your normal debug or production build to catch and debug errors. Check all code, including third-party libraries with unavailable sources.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/inspector.html
intel-oneapi-mkl/2023.2.0
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL), is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html
intel-oneapi-tbb/2021.10.0
Intel oneAPI Threading Building Blocks (oneTBB) is a flexible performance library that simplifies the work of adding parallelism to complex applications across accelerated architectures, even if you are not a threading expert.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html
intel-oneapi-vtune/2023.2.0
Intel VTune Profiler is a profiler to optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more. CPU, GPU, and FPGA: Tune the entire application’s performance–not just the accelerated portion. Multilingual: Profile SYCL, C, C++, C#, Fortran, OpenCL code, Python, Google Go programming language, Java, .NET, Assembly, or any combination of languages. System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code. Power: Optimize performance while avoiding power and thermal-related throttling.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html
iq-tree/2.2.2.7
IQ-TREE Efficient software for phylogenomic inferencehttp://www.iqtree.org
jags/4.3.0
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGShttps://mcmc-jags.sourceforge.net/
jq/1.6
jq is a lightweight and flexible command-line JSON processor.https://stedolan.github.io/jq/
julia/1.10.0
julia/1.9.4
The Julia Language: A fresh approach to technical computing This package installs the x86_64-linux-gnu version provided by Julia Computinghttps://julialang.org/
kraken2/2.1.2
Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.https://ccb.jhu.edu/software/kraken2/
lammps/20230802
lammps/20230802-fftw
LAMMPS stands for Large-scale Atomic/Molecular Massively Parallel Simulator.https://www.lammps.org/
lftp/4.9.2
LFTP is a sophisticated file transfer program supporting a number of network protocols (ftp, http, sftp, fish, torrent).https://lftp.yar.ru/
libaec/1.0.6
Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). It implements Golomb-Rice compression method under the BSD license and includes a free drop-in replacement for the SZIP library.https://gitlab.dkrz.de/k202009/libaec
libpng/1.6.39
libpng is the official PNG reference library.http://www.libpng.org/pub/png/libpng.html
libtiff/4.5.1
LibTIFF - Tag Image File Format (TIFF) Library and Utilities.http://www.simplesystems.org/libtiff/
libxc/6.2.2
Libxc is a library of exchange-correlation functionals for density-functional theory.https://libxc.gitlab.io/
libxml2/2.10.3
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
likwid/5.2.2
Likwid is a simple to install and use toolsuite of command line applications for performance oriented programmers. It works for Intel and AMD processors on the Linux operating system. This version uses the perf_event backend which reduces the feature set but allows user installs. See https://github.com/RRZE-HPC/likwid/wiki/TutorialLikwidPerf#feature-limitations for information.https://hpc.fau.de/research/tools/likwid/
llvm/17.0.4
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Despite its name, LLVM has little to do with traditional virtual machines, though it does provide helpful libraries that can be used to build them. The name ‘LLVM’ itself is not an acronym; it is the full name of the project.https://llvm.org/
masurca/4.1.0
masurca/4.1.1
MaSuRCA is whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches.https://www.genome.umd.edu/masurca.html
mercurial/6.4.5
Mercurial is a free, distributed source control management tool.https://www.mercurial-scm.org
meson/1.2.2
Meson is a portable open source build system meant to be both extremely fast, and as user friendly as possible.https://mesonbuild.com/
mhm2/2.2.0.0
MetaHipMer (MHM) is a de novo metagenome short-read assembler, which is written in UPC++, CUDA and HIP, and runs efficiently on both single servers and on multinode supercomputers, where it can scale up to coassemble terabase-sized metagenomes.https://bitbucket.org/berkeleylab/mhm2/
micromamba/1.4.2
Mamba is a fast, robust, and cross-platform package manager (Miniconda alternative).https://mamba.readthedocs.io/
miniconda3/22.11.1
The minimalist bootstrap toolset for conda and Python3.https://docs.anaconda.com/miniconda/
miniforge3/4.8.3-4-Linux-x86_64
Miniforge3 is a minimal installer for conda specific to conda-forge.https://github.com/conda-forge/miniforge
molden/6.7
A package for displaying Molecular Density from various Ab Initio packageshttps://www.theochem.ru.nl/molden/
mono/6.12.0.122
Mono is a software platform designed to allow developers to easily create cross platform applications. It is an open source implementation of Microsoft’s .NET Framework based on the ECMA standards for C# and the Common Language Runtime.https://www.mono-project.com/
mpc/1.3.1
Gnu Mpc is a C library for the arithmetic of complex numbers with arbitrarily high precision and correct rounding of the result.https://www.multiprecision.org
mpfr/3.1.6
mpfr/4.2.0
The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.https://www.mpfr.org/
mpifileutils/0.11.1
mpiFileUtils is a suite of MPI-based tools to manage large datasets, which may vary from large directory trees to large files. High-performance computing users often generate large datasets with parallel applications that run with many processes (millions in some cases). However those users are then stuck with single-process tools like cp and rm to manage their datasets. This suite provides MPI-based tools to handle typical jobs like copy, remove, and compare for such datasets, providing speedups of up to 20-30x.https://github.com/hpc/mpifileutils
mumps/5.2.0
mumps/5.5.1
MUMPS: a MUltifrontal Massively Parallel sparse direct Solverhttps://mumps-solver.org
muscle5/5.1.0
MUSCLE is widely-used software for making multiple alignments of biological sequences.https://drive5.com/muscle5/
must/1.9.0
MUST detects usage errors of the Message Passing Interface (MPI) and reports them to the user. As MPI calls are complex and usage errors common, this functionality is extremely helpful for application developers that want to develop correct MPI applications. This includes errors that already manifest: segmentation faults or incorrect results as well as many errors that are not visible to the application developer or do not manifest on a certain system or MPI implementation.https://www.i12.rwth-aachen.de/go/id/nrbe
namd/2.14
namd/2.14-smp
namd/3.0
namd/3.0-smp
namd/3.0.1
namd/3.0.1-smp
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.https://www.ks.uiuc.edu/Research/namd/
nco/5.1.6
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formatshttps://nco.sourceforge.net/
ncview/2.1.9
Simple viewer for NetCDF files.https://cirrus.ucsd.edu/ncview/
netcdf-c/4.9.2
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.https://www.unidata.ucar.edu/software/netcdf
netcdf-fortran/4.6.1-mpi
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.https://www.unidata.ucar.edu/software/netcdf
netgen/5.3.1
NETGEN is an automatic 3d tetrahedral mesh generator. It accepts input from constructive solid geometry (CSG) or boundary representation (BRep) from STL file format. The connection to a geometry kernel allows the handling of IGES and STEP files. NETGEN contains modules for mesh optimization and hierarchical mesh refinement.https://ngsolve.org/
netlib-lapack/3.11.0
LAPACK version 3.X is a comprehensive FORTRAN library that does linear algebra operations including matrix inversions, least-squares solutions to linear sets of equations, eigenvector analysis, singular value decomposition, etc. It is a very comprehensive and reputable package that has found extensive use in the scientific community.https://www.netlib.org/lapack/
netlib-scalapack/2.2.0
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machineshttps://www.netlib.org/scalapack/
nextflow/23.10.0
Data-driven computational pipelines.https://www.nextflow.io
ninja/1.11.1
Ninja is a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible.https://ninja-build.org/
nvhpc/23.9
The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries and tools essential to maximizing developer productivity and the performance and portability of HPC applications. The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC directives, and CUDA. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming. Performance profiling and debugging tools simplify porting and optimization of HPC applications.https://developer.nvidia.com/hpc-sdk
ocl-icd/2.3.1
This package aims at creating an Open Source alternative to vendor specific OpenCL ICD loaders.https://github.com/OCL-dev/ocl-icd
octave/8.2.0
GNU Octave is a high-level language, primarily intended for numerical computations.https://www.gnu.org/software/octave/
opa-psm2/11.2.230
Omni-Path Performance Scaled Messaging 2 (PSM2) libraryhttps://github.com/cornelisnetworks/opa-psm2
openbabel/3.1.1
Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.https://openbabel.org/docs/index.html
openblas/0.3.24
OpenBLAS: An optimized BLAS libraryhttps://www.openblas.net
opencl-c-headers/2022.01.04
OpenCL (Open Computing Language) C header fileshttps://www.khronos.org/registry/OpenCL/
opencl-clhpp/2.0.16
C++ headers for OpenCL developmenthttps://www.khronos.org/registry/OpenCL/
opencl-headers/3.0
Bundled OpenCL (Open Computing Language) header fileshttps://www.khronos.org/registry/OpenCL/
opencoarrays/2.10.1
OpenCoarrays is an open-source software project that produces an application binary interface (ABI) supporting coarray Fortran (CAF) compilers, an application programming interface (API) that supports users of non-CAF compilers, and an associated compiler wrapper and program launcher.http://www.opencoarrays.org/
openfoam/2306
openfoam/2306-source
OpenFOAM is a GPL-open-source C++ CFD-toolbox. This offering is supported by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark. OpenCFD Ltd has been developing and releasing OpenFOAM since its debut in 2004.https://www.openfoam.com/
openfoam-org/10
openfoam-org/10-source
openfoam-org/6
openfoam-org/6-source
openfoam-org/7
openfoam-org/7-source
openfoam-org/8
openfoam-org/8-source
OpenFOAM is a GPL-open-source C++ CFD-toolbox. The openfoam.org release is managed by the OpenFOAM Foundation Ltd as a licensee of the OPENFOAM trademark. This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://www.openfoam.org/
openjdk/11.0.20.1_1
openjdk/17.0.8.1_1
The free and open-source java implementationhttps://jdk.java.net
openmpi/4.1.6
An open source Message Passing Interface implementation.https://www.open-mpi.org
osu-micro-benchmarks/7.3
The Ohio MicroBenchmark suite is a collection of independent MPI message passing performance microbenchmarks developed and written at The Ohio State University. It includes traditional benchmarks and performance measures such as latency, bandwidth and host overhead and can be used for both traditional and GPU-enhanced nodes.https://mvapich.cse.ohio-state.edu/benchmarks/
parallel/20220522
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.https://www.gnu.org/software/parallel/
paraview/5.11.2
ParaView is an open-source, multi-platform data analysis and visualization application. This package includes the Catalyst in-situ library for versions 5.7 and greater, otherwise use the catalyst package.https://www.paraview.org
pbmpi/1.9
A Bayesian software for phylogenetic reconstruction using mixture modelshttps://github.com/bayesiancook/pbmpi
perl/5.38.0
Perl 5 is a highly capable, feature-rich programming language with over 27 years of development.https://www.perl.org
perl-list-moreutils/0.430
Provide the stuff missing in List::Utilhttps://metacpan.org/pod/List::MoreUtils
perl-uri/5.08
Uniform Resource Identifiers (absolute and relative)https://metacpan.org/pod/URI
petsc/3.20.1-complex
petsc/3.20.1-real
PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.https://petsc.org
pigz/2.7
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
pocl/5.0
Portable Computing Language (pocl) is an open source implementation of the OpenCL standard which can be easily adapted for new targets and devices, both for homogeneous CPU and heterogeneous GPUs/accelerators.https://portablecl.org
proj/9.2.1
PROJ is a generic coordinate transformation software, that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.https://proj.org/
psi4/1.8.2
Psi4 is an open-source suite of ab initio quantum chemistry programs designed for efficient, high-accuracy simulations of a variety of molecular properties.https://www.psicode.org/
py-mpi4py/3.1.4
This package provides Python bindings for the Message Passing Interface (MPI) standard. It is implemented on top of the MPI-1/MPI-2 specification and exposes an API which grounds on the standard MPI-2 C++ bindings.https://pypi.org/project/mpi4py/
py-nvitop/1.4.0
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.https://nvitop.readthedocs.io/
py-pip/23.1.2
The PyPA recommended tool for installing Python packages.https://pip.pypa.io/
py-reportseff/2.7.6
A python script for tabular display of slurm efficiency information.https://github.com/troycomi/reportseff
python/3.10.13
python/3.11.6
python/3.9.18
The Python programming language.https://www.python.org/
qt/5.15.11
Qt is a comprehensive cross-platform C++ application framework.https://qt.io
quantum-espresso/6.7
quantum-espresso/7.2
quantum-espresso/7.3.1
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.https://quantum-espresso.org
r/4.4.0
R is ‘GNU S’, a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.https://www.r-project.org
rclone/1.63.1
Rclone is a command line program to sync files and directories to and from various cloud storage providershttps://rclone.org
relion/3.1.3
relion/4.0.1
RELION (for REgularised LIkelihood OptimisatioN, pronounce rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).https://www2.mrc-lmb.cam.ac.uk/relion
repeatmasker/4.1.5
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.https://www.repeatmasker.org
repeatmodeler/2.0.4
RepeatModeler is a de-novo repeat family identification and modeling package.https://github.com/Dfam-consortium/RepeatModeler
revbayes/1.1.1
Bayesian phylogenetic inference using probabilistic graphical models and an interpreted language.https://revbayes.github.io
rust/1.70.0
The Rust programming language toolchain.https://www.rust-lang.org
samtools/1.17
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position formathttps://www.htslib.org
scala/2.13.1
Scala is a general-purpose programming language providing support for functional programming and a strong static type system. Designed to be concise, many of Scala’s design decisions were intended to address criticisms of Java.https://www.scala-lang.org/
scalasca/2.6.1
Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks - in particular those concerning communication and synchronization - and offers guidance in exploring their causes.https://www.scalasca.org
scorep/8.3
The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event tracing, and online analysis of HPC applications.https://www.vi-hps.org/projects/score-p
siesta/4.0.2
SIESTA performs electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids.https://departments.icmab.es/leem/siesta/
skopeo/0.1.40
skopeo is a command line utility that performs various operations on container images and image repositories.https://github.com/containers/skopeo
slepc/3.20.0
Scalable Library for Eigenvalue Problem Computations.https://slepc.upv.es
snakemake/7.22.0
Snakemake is an MIT-licensed workflow management system.https://snakemake.readthedocs.io/en/stable/
spack/0.21.2
Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, and many supercomputers. Spack is non-destructive: installing a new version of a package does not break existing installations, so many configurations of the same package can coexist.https://spack.io/
spark/3.1.1
Apache Spark is a fast and general engine for large-scale data processing.https://spark.apache.org
sqlite/3.43.2
SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.https://www.sqlite.org
squashfuse/0.5.0
squashfuse - Mount SquashFS archives using Filesystem in USErspace (FUSE)https://github.com/vasi/squashfuse
starccm/18.06.007
starccm/19.04.007
STAR-CCM+: Simcenter STAR-CCM+ is a multiphysics computational fluid dynamics (CFD) simulation software that enables CFD engineers to model the complexity and explore the possibilities of products operating under real-world conditions.https://plm.sw.siemens.com/en-US/simcenter/fluids-thermal-simulation/star-ccm/
subread/2.0.6
The Subread software package is a tool kit for processing next-gen sequencing data.https://subread.sourceforge.net/
subversion/1.14.2
Apache Subversion - an open source version control system.https://subversion.apache.org/
tcl/8.5.19
tcl/8.6.12
Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Open source and business-friendly, Tcl is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible.https://www.tcl.tk/
tk/8.6.11
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.https://www.tcl.tk
tkdiff/5.7
TkDiff is a graphical front end to the diff program. It provides a side-by-side view of the differences between two text files, along with several innovative features such as diff bookmarks, a graphical map of differences for quick navigation, and a facility for slicing diff regions to achieve exactly the merge output desired.https://tkdiff.sourceforge.io/
transabyss/2.0.1
de novo assembly of RNA-Seq data using ABySS
turbomole/7.6
turbomole/7.8.1
TURBOMOLE: Program Package for ab initio Electronic Structure Calculations.https://www.turbomole.org/
udunits/2.2.28
Automated units conversionhttps://www.unidata.ucar.edu/software/udunits
valgrind/3.20.0
An instrumentation framework for building dynamic analysis tools.https://valgrind.org/
vampir/10.4.1
Vampir and Score-P provide a performance tool framework with special focus on highly-parallel applications. Performance data is collected from multi-process (MPI, SHMEM), thread-parallel (OpenMP, Pthreads), as well as accelerator-based paradigms (CUDA, HIP, OpenCL, OpenACC).https://vampir.eu/
vampirserver/10.4.1
Vampir and Score-P provide a performance tool framework with special focus on highly-parallel applications. Performance data is collected from multi-process (MPI, SHMEM), thread-parallel (OpenMP, Pthreads), as well as accelerator-based paradigms (CUDA, HIP, OpenCL, OpenACC).https://vampir.eu/
vim/9.0.0045
Vim is a highly configurable text editor built to enable efficient text editing. It is an improved version of the vi editor distributed with most UNIX systems. Vim is often called a ‘programmer’s editor,’ and so useful for programming that many consider it an entire IDE. It’s not just for programmers, though. Vim is perfect for all kinds of text editing, from composing email to editing configuration files.https://www.vim.org
vmd/1.9.3
VMD provides user-editable materials which can be applied to molecular geometry.https://www.ks.uiuc.edu/Research/vmd/
xtb/6.6.0
Semiempirical extended tight binding program packagehttps://xtb-docs.readthedocs.org
xz/5.4.1
XZ Utils is free general-purpose data compression software with high compression ratio. XZ Utils were written for POSIX-like systems, but also work on some not-so-POSIX systems. XZ Utils are the successor to LZMA Utils.https://tukaani.org/xz/
zlib-ng/2.1.4
zlib replacement with optimizations for next generation systems.https://github.com/zlib-ng/zlib-ng
zsh/5.8
Zsh is a shell designed for interactive use, although it is also a powerful scripting language. Many of the useful features of bash, ksh, and tcsh were incorporated into zsh; many original features were added.https://www.zsh.org
zstd/1.5.5
Zstandard, or zstd as short version, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios.https://facebook.github.io/zstd/

Emmy Intel Modules (24.05)

Warning

This software revision is no longer updated but can still be loaded and used by running the following commands before any module commands:

export PREFERRED_SOFTWARE_STACK=nhr-lmod
source /etc/profile
To load these modules, you first need to load the Intel compiler and MPI modules:

module load intel-oneapi-compilers/2023.2.1
module load intel-oneapi-mpi/2021.10.0
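
As a quick end-to-end check, the sketch below combines these steps: it activates the retired 24.05 revision, loads the Intel compilers and MPI as shown above plus (optionally) MKL from the module list below, and compiles and runs a small MPI test program. The source file name, the MKL module choice, and the srun task count are placeholder assumptions for illustration, not prescriptions from this documentation:

export PREFERRED_SOFTWARE_STACK=nhr-lmod    # select the retired 24.05 revision
source /etc/profile
module load intel-oneapi-compilers/2023.2.1
module load intel-oneapi-mpi/2021.10.0
module load intel-oneapi-mkl/2023.2.0       # optional: BLAS/LAPACK/FFT routines
module list                                 # verify what is loaded
mpiicc -qmkl -o hello_mpi hello_mpi.c       # Intel MPI wrapper around the classic Intel C compiler, linking MKL
srun -n 4 ./hello_mpi                       # run with 4 MPI ranks inside a Slurm job or allocation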

List of Modules

Module Names | Description | Homepage
cdo/2.2.2
cdo/2.2.2-hdf5-1.10
CDO is a collection of command line operators to manipulate and analyse Climate and NWP model Data.https://code.mpimet.mpg.de/projects/cdo
eccodes/2.34.0
eccodes/2.34.0-hdf5-1.10
ecCodes is a package developed by ECMWF for processing meteorological data in GRIB (1/2), BUFR (3/4) and GTS header formats.https://software.ecmwf.int/wiki/display/ECC/ecCodes+Home
globalarrays/5.8.2
Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model.https://hpc.pnl.gov/globalarrays/
hdf5/1.10.7
hdf5/1.10.7-precise-fp
hdf5/1.12.2
hdf5/1.14.3
hdf5/1.14.3-precise-fp
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high volume and complex data.https://portal.hdfgroup.org
hpl/2.3
HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.https://www.netlib.org/benchmark/hpl/
intel-oneapi-mkl/2023.2.0
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL) is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html
mumps/5.5.1
MUMPS: a MUltifrontal Massively Parallel sparse direct Solverhttps://mumps-solver.org
nco/5.1.6
nco/5.1.6-hdf5-1.10
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formatshttps://nco.sourceforge.net/
ncview/2.1.9
Simple viewer for NetCDF files.https://cirrus.ucsd.edu/ncview/
netcdf-c/4.9.2
netcdf-c/4.9.2-hdf5-1.10
netcdf-c/4.9.2-hdf5-1.10-precise-fp
netcdf-c/4.9.2-hdf5-1.12-precise-fp
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.https://www.unidata.ucar.edu/software/netcdf
netcdf-fortran/4.5.3-hdf5-1.10
netcdf-fortran/4.5.3-hdf5-1.10-precise-fp
netcdf-fortran/4.6.1-hdf5-1.12-precise-fp
netcdf-fortran/4.6.1-mpi
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.https://www.unidata.ucar.edu/software/netcdf
osu-micro-benchmarks/7.3
The Ohio MicroBenchmark suite is a collection of independent MPI message passing performance microbenchmarks developed and written at The Ohio State University. It includes traditional benchmarks and performance measures such as latency, bandwidth and host overhead and can be used for both traditional and GPU-enhanced nodes.https://mvapich.cse.ohio-state.edu/benchmarks/
petsc/3.20.1-complex
petsc/3.20.1-real
PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.https://petsc.org
quantum-espresso/7.2
quantum-espresso/7.3.1
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.https://quantum-espresso.org
vasp/6.4.3
The Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.https://vasp.at

Grete Core Modules (24.05)

Warning

This software revision is no longer updated but can still be loaded and used by running the following commands before any module commands:

export PREFERRED_SOFTWARE_STACK=nhr-lmod
source /etc/profile
We recommend loading the appropriate compiler and MPI modules first:

module load gcc/11.4.0
module load openmpi/4.1.6

The software packages provided on each phase are optimized for the particular CPU and GPU architecture of that phase (e.g. AMD Rome + A100 or Intel Sapphire Rapids + H100).
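
To make these steps concrete, here is a minimal sketch of preparing a GPU build environment in this revision and running a small CUDA program. The cuda module version is taken from the list below; the source file name, the compile line, and the srun resource flags are placeholder assumptions for illustration:

export PREFERRED_SOFTWARE_STACK=nhr-lmod    # select the retired 24.05 revision
source /etc/profile
module load gcc/11.4.0
module load openmpi/4.1.6
module load cuda/12.2.1                     # CUDA toolkit, provides nvcc
nvcc -o saxpy saxpy.cu                      # compile a simple CUDA kernel
srun -n 1 --gpus 1 ./saxpy                  # run on a single GPU inside a Slurm job or allocation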

List of Modules

Module Names | Description | Homepage
abyss/2.3.5-cuda-12
ABySS is a de novo, parallel, paired-end sequence assembler that is designed for short reads. The single-processor version is useful for assembling genomes up to 100 Mbases in size.https://www.bcgsc.ca/platform/bioinfo/software/abyss
amdblis/4.1
AMD Optimized BLIS.https://www.amd.com/en/developer/aocl/blis.html
amdfftw/4.1-cuda-11
amdfftw/4.1-cuda-12
FFTW (AMD Optimized version) is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases thereof.https://www.amd.com/en/developer/aocl/fftw.html
amdlibflame/4.1
libFLAME (AMD Optimized version) is a portable library for dense matrix computations, providing much of the functionality present in Linear Algebra Package (LAPACK). It includes a compatibility layer, FLAPACK, which includes complete LAPACK implementation.https://www.amd.com/en/developer/aocl/blis.html#libflame
amdscalapack/4.1-cuda-11
amdscalapack/4.1-cuda-12
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It depends on external libraries including BLAS and LAPACK for Linear Algebra computations.https://www.amd.com/en/developer/aocl/scalapack.html
ansys/2023.1
ansys/2023.2
ansys/2024.1
Ansys offers a comprehensive software suite that spans the entire range of physics, providing access to virtually any field of engineering simulation that a design process requires.https://www.ansys.com/
aocc/4.1.0
The AOCC compiler system is a high performance, production quality code generation tool. The AOCC environment provides various options to developers when building and optimizing C, C++, and Fortran applications targeting 32-bit and 64-bit Linux platforms. The AOCC compiler system offers a high level of advanced optimizations, multi-threading and processor support that includes global optimization, vectorization, inter-procedural analyses, loop transformations, and code generation. AMD also provides highly optimized libraries, which extract the optimal performance from each x86 processor core when utilized. The AOCC Compiler Suite simplifies and accelerates development and tuning for x86 applications.https://www.amd.com/en/developer/aocc.html
apptainer/1.1.9
apptainer/1.2.5
Apptainer is an open source container platform designed to be simple, fast, and secure. Many container platforms are available, but Apptainer is designed for ease-of-use on shared systems and in high performance computing (HPC) environments.https://apptainer.org
aria2/1.36.0
An ultra fast download utilityhttps://aria2.github.io
autoconf/2.69
Autoconf – system configuration part of autotoolshttps://www.gnu.org/software/autoconf/
autoconf-archive/2023.02.20
The GNU Autoconf Archive is a collection of more than 500 macros for GNU Autoconf.https://www.gnu.org/software/autoconf-archive/
bedops/2.4.41
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale.https://bedops.readthedocs.io
binutils/2.41-gas
GNU binutils, which contain the linker, assembler, objdump and othershttps://www.gnu.org/software/binutils/
bison/3.8.2
Bison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser employing LALR(1) parser tables.https://www.gnu.org/software/bison/
blast-plus/2.14.1
Basic Local Alignment Search Tool.https://blast.ncbi.nlm.nih.gov/
boost/1.83.0-aocl-cuda-12
boost/1.83.0-cuda-12
Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.https://www.boost.org
bowtie/1.3.1
Bowtie is an ultrafast, memory-efficient short read aligner for short DNA sequences (reads) from next-gen sequencers.https://sourceforge.net/projects/bowtie-bio/
cdo/2.2.2-cuda-12
CDO is a collection of command line operators to manipulate and analyse Climate and NWP model Data.https://code.mpimet.mpg.de/projects/cdo
charmpp/6.10.2-smp
charmpp/7.0.0-smp
Charm++ is a parallel programming framework in C++ supported by an adaptive runtime system, which enhances user productivity and allows programs to run portably from small multicore computers (your laptop) to the largest supercomputers.https://charmplusplus.org
clblast/1.5.2
CLBlast is a modern, lightweight, performant and tunable OpenCL BLAS library written in C++11. It is designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors, including desktop and laptop GPUs, embedded GPUs, and other accelerators. CLBlast implements BLAS routines: basic linear algebra subprograms operating on vectors and matrices.https://cnugteren.github.io/clblast/clblast.html
clinfo/3.0.21.02.21
Print all known information about all available OpenCL platforms and devices in the system.https://github.com/Oblomov/clinfo
clpeak/1.1.2
Simple OpenCL performance benchmark tool.https://github.com/krrishnarraj/clpeak
cmake/3.27.7
A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.https://www.cmake.org
comsol/6.1
comsol/6.2
COMSOL Multiphysics is a finite element analyzer, solver, and simulation software package for various physics and engineering applications, especially coupled phenomena and multiphysics.https://www.comsol.com/
cp2k/2023.2-cuda-12
cp2k/2024.1-cuda-12
cp2k/2025.1-cuda-12
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systemshttps://www.cp2k.org
cpmd/4.3-cuda-12
The CPMD code is a parallelized plane wave / pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics.https://www.cpmd.org/wordpress/
crest/2.12
Conformer-Rotamer Ensemble Sampling Toolhttps://github.com/crest-lab/crest
cuda/11.8.0
cuda/12.2.1
CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).https://developer.nvidia.com/cuda-zone
cudnn/8.9.7.29-11-cuda-11
cudnn/8.9.7.29-12-cuda-12
NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networkshttps://developer.nvidia.com/cudnn
ddd/3.3.12
A graphical front-end for command-line debuggers such as GDB, DBX, WDB, Ladebug, JDB, XDB, the Perl debugger, the bash debugger bashdb, the GNU Make debugger remake, or the Python debugger pydb.https://www.gnu.org/software/ddd
diamond/2.1.7
DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data.https://ab.inf.uni-tuebingen.de/software/diamond
eccodes/2.34.0-cuda-12
ecCodes is a package developed by ECMWF for processing meteorological data in GRIB (1/2), BUFR (3/4) and GTS header formats.https://software.ecmwf.int/wiki/display/ECC/ecCodes+Home
exciting/oxygen-cuda-12
exciting is a full-potential all-electron density-functional-theory package implementing the families of linearized augmented planewave methods. It can be applied to all kinds of materials, irrespective of the atomic species involved, and also allows for exploring the physics of core electrons. A particular focus is on excited states within many-body perturbation theory.https://exciting-code.org/
ffmpeg/6.0
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.https://ffmpeg.org
fftw/3.3.10-cuda-11
fftw/3.3.10-cuda-12
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.https://www.fftw.org
fish/3.6.1
fish is a smart and user-friendly command line shell for OS X, Linux, and the rest of the family.https://fishshell.com/
fleur/5.1-cuda-12
FLEUR (Full-potential Linearised augmented plane wave in EURope) is a code family for calculating groundstate as well as excited-state properties of solids within the context of density functional theory (DFT).https://www.flapw.de/MaX-5.1
foam-extend/5.0-source
The Extend Project is a fork of the OpenFOAM open-source library for Computational Fluid Dynamics (CFD). This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://sourceforge.net/projects/foam-extend/
gatk/3.8.1
gatk/4.4.0.0
Genome Analysis Toolkit Variant Discovery in High-Throughput Sequencing Datahttps://gatk.broadinstitute.org/hc/en-us
gaussian/16-C.02
Gaussian is a computer program for computational chemistryhttps://gaussian.com/
gcc/11.4.0
gcc/11.4.0-cuda
gcc/12.3.0
gcc/12.3.0-cuda
gcc/13.2.0
gcc/14.2.0
gcc/9.5.0
The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Ada, and Go, as well as libraries for these languages.https://gcc.gnu.org
gdal/3.7.3
GDAL: Geospatial Data Abstraction Library.https://www.gdal.org/
gdb/8.1
GDB, the GNU Project debugger, allows you to see what is going on ‘inside’ another program while it executes – or what another program was doing at the moment it crashed.https://www.gnu.org/software/gdb
getmutils/1.0
Utilities collection for GETM (https://getm.eu/)
git/2.42.0
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.https://git-scm.com
git-lfs/3.3.0
Git LFS is a system for managing and versioning large files in association with a Git repository. Instead of storing the large files within the Git repository as blobs, Git LFS stores special ‘pointer files’ in the repository, while storing the actual file contents on a Git LFS server.https://git-lfs.github.com
globalarrays/5.8.2-cuda-12
Global Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model.https://hpc.pnl.gov/globalarrays/
gmake/4.4.1
GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program’s source files.https://www.gnu.org/software/make/
gmp/6.2.1
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.https://gmplib.org
gnuplot/5.4.3
Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don’t have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986.http://www.gnuplot.info
go/1.21.3
The golang compiler and build environmenthttps://go.dev
grads/2.2.3-hdf5-1.10
The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data. GrADS has two data models for handling gridded and station data. GrADS supports many data file formats, including binary (stream or sequential), GRIB (version 1 and 2), NetCDF, HDF (version 4 and 5), and BUFR (for station data).http://cola.gmu.edu/grads/grads.php
gromacs/2022.5-plumed-cuda
gromacs/2023-plumed-cuda
gromacs/2023.3-cuda
GROMACS is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.https://www.gromacs.org
gsl/2.7.1
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.https://www.gnu.org/software/gsl
hdf5/1.12.2-cuda-12
hdf5/1.14.3-cuda-12
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of data types, and is designed for flexible and efficient I/O and for high volume and complex data.https://portal.hdfgroup.org
hpl/2.3-cuda-12
HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.https://www.netlib.org/benchmark/hpl/
igv/2.12.3
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.https://software.broadinstitute.org/software/igv/home
imagemagick/7.1.1-11
ImageMagick is a software suite to create, edit, compose, or convert bitmap images.https://www.imagemagick.org
intel-oneapi-advisor/2023.2.0
Intel Advisor is a design and analysis tool for developing performant code. The tool supports C, C++, Fortran, SYCL, OpenMP, OpenCL code, and Python. It helps with the following: Performant CPU Code: Design your application for efficient threading, vectorization, and memory use. Efficient GPU Offload: Identify parts of the code that can be profitably offloaded. Optimize the code for compute and memory.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/advisor.html
intel-oneapi-compilers/2023.2.1
Intel oneAPI Compilers. Includes: icx, icpx, ifx, and ifort. Releases before 2024.0 include icc/icpchttps://software.intel.com/content/www/us/en/develop/tools/oneapi.html
intel-oneapi-dal/2023.2.0
Intel oneAPI Data Analytics Library (oneDAL) is a library that helps speed up big data analysis by providing highly optimized algorithmic building blocks for all stages of data analytics (preprocessing, transformation, analysis, modeling, validation, and decision making) in batch, online, and distributed processing modes of computation. The library optimizes data ingestion along with algorithmic computation to increase throughput and scalability.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html
intel-oneapi-dnn/2023.2.0
The Intel oneAPI Deep Neural Network Library (oneDNN) helps developers improve productivity and enhance the performance of their deep learning frameworks. It supports key data type formats, including 16 and 32-bit floating point, bfloat16, and 8-bit integers and implements rich operators, including convolution, matrix multiplication, pooling, batch normalization, activation functions, recurrent neural network (RNN) cells, and long short-term memory (LSTM) cells.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onednn.html
intel-oneapi-inspector/2023.2.0
Intel Inspector is a dynamic memory and threading error debugger for C, C++, and Fortran applications that run on Windows and Linux operating systems. Save money: locate the root cause of memory, threading, and persistence errors before you release. Save time: simplify the diagnosis of difficult errors by breaking into the debugger just before the error occurs. Save effort: use your normal debug or production build to catch and debug errors. Check all code, including third-party libraries with unavailable sources.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/inspector.html
intel-oneapi-mkl/2023.2.0-cuda-12
Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL) is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html
intel-oneapi-tbb/2021.10.0
Intel oneAPI Threading Building Blocks (oneTBB) is a flexible performance library that simplifies the work of adding parallelism to complex applications across accelerated architectures, even if you are not a threading expert.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onetbb.html
intel-oneapi-vtune/2023.2.0
Intel VTune Profiler is a profiler to optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more. CPU, GPU, and FPGA: Tune the entire application’s performance–not just the accelerated portion. Multilingual: Profile SYCL, C, C++, C#, Fortran, OpenCL code, Python, Google Go programming language, Java, .NET, Assembly, or any combination of languages. System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code. Power: Optimize performance while avoiding power and thermal-related throttling.https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html
iq-tree/2.2.2.7-cuda-12
IQ-TREE Efficient software for phylogenomic inferencehttp://www.iqtree.org
jags/4.3.0
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation not wholly unlike BUGShttps://mcmc-jags.sourceforge.net/
jq/1.6
jq is a lightweight and flexible command-line JSON processor.https://stedolan.github.io/jq/
julia/1.10.0
julia/1.9.4
The Julia Language: A fresh approach to technical computing. This package installs the x86_64-linux-gnu version provided by Julia Computing.https://julialang.org/
kraken2/2.1.2
Kraken2 is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.https://ccb.jhu.edu/software/kraken2/
lftp/4.9.2
LFTP is a sophisticated file transfer program supporting a number of network protocols (ftp, http, sftp, fish, torrent).https://lftp.yar.ru/
libaec/1.0.6
Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). It implements Golomb-Rice compression method under the BSD license and includes a free drop-in replacement for the SZIP library.https://gitlab.dkrz.de/k202009/libaec
libpng/1.6.39
libpng is the official PNG reference library.http://www.libpng.org/pub/png/libpng.html
libtiff/4.5.1
LibTIFF - Tag Image File Format (TIFF) Library and Utilities.http://www.simplesystems.org/libtiff/
libxml2/2.10.3
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), it is free software available under the MIT License.https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
likwid/5.2.2-cuda-12
Likwid is a simple to install and use toolsuite of command line applications for performance oriented programmers. It works for Intel and AMD processors on the Linux operating system. This version uses the perf_event backend which reduces the feature set but allows user installs. See https://github.com/RRZE-HPC/likwid/wiki/TutorialLikwidPerf#feature-limitations for information.https://hpc.fau.de/research/tools/likwid/
llvm/17.0.4-cuda-12
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Despite its name, LLVM has little to do with traditional virtual machines, though it does provide helpful libraries that can be used to build them. The name ‘LLVM’ itself is not an acronym; it is the full name of the project.https://llvm.org/
masurca/4.1.0-cuda-12
masurca/4.1.1-cuda-12
MaSuRCA is whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches.https://www.genome.umd.edu/masurca.html
mercurial/6.4.5
Mercurial is a free, distributed source control management tool.https://www.mercurial-scm.org
meson/1.2.2
Meson is a portable open source build system meant to be both extremely fast, and as user friendly as possible.https://mesonbuild.com/
mhm2/2.2.0.0-cuda-12
MetaHipMer (MHM) is a de novo metagenome short-read assembler, which is written in UPC++, CUDA and HIP, and runs efficiently on both single servers and on multinode supercomputers, where it can scale up to coassemble terabase-sized metagenomes.https://bitbucket.org/berkeleylab/mhm2/
micromamba/1.4.2
Mamba is a fast, robust, and cross-platform package manager (Miniconda alternative).https://mamba.readthedocs.io/
miniconda3/22.11.1
The minimalist bootstrap toolset for conda and Python3.https://docs.anaconda.com/miniconda/
miniforge3/4.8.3-4-Linux-x86_64
Miniforge3 is a minimal installer for conda specific to conda-forge.https://github.com/conda-forge/miniforge
molden/6.7-cuda-12
A package for displaying Molecular Density from various Ab Initio packageshttps://www.theochem.ru.nl/molden/
mono/6.12.0.122
Mono is a software platform designed to allow developers to easily create cross platform applications. It is an open source implementation of Microsoft’s .NET Framework based on the ECMA standards for C# and the Common Language Runtime.https://www.mono-project.com/
mpc/1.3.1
Gnu Mpc is a C library for the arithmetic of complex numbers with arbitrarily high precision and correct rounding of the result.https://www.multiprecision.org
mpfr/3.1.6
mpfr/4.2.0
The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.https://www.mpfr.org/
mpifileutils/0.11.1-cuda-12
mpiFileUtils is a suite of MPI-based tools to manage large datasets, which may vary from large directory trees to large files. High-performance computing users often generate large datasets with parallel applications that run with many processes (millions in some cases). However those users are then stuck with single-process tools like cp and rm to manage their datasets. This suite provides MPI-based tools to handle typical jobs like copy, remove, and compare for such datasets, providing speedups of up to 20-30x.https://github.com/hpc/mpifileutils
mumps/5.2.0-cuda-12
mumps/5.5.1-cuda-11
MUMPS: a MUltifrontal Massively Parallel sparse direct Solverhttps://mumps-solver.org
muscle5/5.1.0
MUSCLE is widely-used software for making multiple alignments of biological sequences.https://drive5.com/muscle5/
must/1.9.0-cuda-12
MUST detects usage errors of the Message Passing Interface (MPI) and reports them to the user. As MPI calls are complex and usage errors common, this functionality is extremely helpful for application developers that want to develop correct MPI applications. This includes errors that already manifest: segmentation faults or incorrect results as well as many errors that are not visible to the application developer or do not manifest on a certain system or MPI implementation.https://www.i12.rwth-aachen.de/go/id/nrbe
namd/2.14-smp
namd/3.0-smp
namd/3.0.1-smp
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.https://www.ks.uiuc.edu/Research/namd/
nco/5.1.6-cuda-12
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formatshttps://nco.sourceforge.net/
ncview/2.1.9-cuda-12
Simple viewer for NetCDF files.https://cirrus.ucsd.edu/ncview/
netcdf-c/4.9.2-cuda-12
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.https://www.unidata.ucar.edu/software/netcdf
netcdf-fortran/4.6.1-mpi
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.https://www.unidata.ucar.edu/software/netcdf
netgen/5.3.1-cuda-12
NETGEN is an automatic 3d tetrahedral mesh generator. It accepts input from constructive solid geometry (CSG) or boundary representation (BRep) from STL file format. The connection to a geometry kernel allows the handling of IGES and STEP files. NETGEN contains modules for mesh optimization and hierarchical mesh refinement.https://ngsolve.org/
netlib-lapack/3.11.0
LAPACK version 3.X is a comprehensive FORTRAN library that does linear algebra operations including matrix inversions, least-squares solutions to linear sets of equations, eigenvector analysis, singular value decomposition, etc. It is a very comprehensive and reputable package that has found extensive use in the scientific community.https://www.netlib.org/lapack/
netlib-scalapack/2.2.0-aocl
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machineshttps://www.netlib.org/scalapack/
nextflow/23.10.0
Data-driven computational pipelines.https://www.nextflow.io
ninja/1.11.1
Ninja is a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible.https://ninja-build.org/
nvhpc/23.9
The NVIDIA HPC SDK is a comprehensive suite of compilers, libraries and tools essential to maximizing developer productivity and the performance and portability of HPC applications. The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC directives, and CUDA. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming. Performance profiling and debugging tools simplify porting and optimization of HPC applications.https://developer.nvidia.com/hpc-sdk
ocl-icd/2.3.1
This package aims at creating an Open Source alternative to vendor specific OpenCL ICD loaders.https://github.com/OCL-dev/ocl-icd
octave/8.2.0
GNU Octave is a high-level language, primarily intended for numerical computations.https://www.gnu.org/software/octave/
openbabel/3.1.1-cuda-12
Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It’s an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.https://openbabel.org/docs/index.html
openblas/0.3.24
OpenBLAS: An optimized BLAS libraryhttps://www.openblas.net
opencl-c-headers/2022.01.04
OpenCL (Open Computing Language) C header fileshttps://www.khronos.org/registry/OpenCL/
opencl-clhpp/2.0.16
C++ headers for OpenCL developmenthttps://www.khronos.org/registry/OpenCL/
opencl-headers/3.0
Bundled OpenCL (Open Computing Language) header fileshttps://www.khronos.org/registry/OpenCL/
opencoarrays/2.10.1-cuda-12
OpenCoarrays is an open-source software project that produces an application binary interface (ABI) supporting coarray Fortran (CAF) compilers, an application programming interface (API) that supports users of non-CAF compilers, and an associated compiler wrapper and program launcher.http://www.opencoarrays.org/
openfoam/2306-cuda-12
openfoam/2306-source
OpenFOAM is a GPL-open-source C++ CFD-toolbox. This offering is supported by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark. OpenCFD Ltd has been developing and releasing OpenFOAM since its debut in 2004.https://www.openfoam.com/
openfoam-org/10-cuda-12
openfoam-org/10-source
openfoam-org/6-cuda-12
openfoam-org/6-source
openfoam-org/7-cuda-12
openfoam-org/7-source
openfoam-org/8-cuda-12
openfoam-org/8-source
OpenFOAM is a GPL-open-source C++ CFD-toolbox. The openfoam.org release is managed by the OpenFOAM Foundation Ltd as a licensee of the OPENFOAM trademark. This offering is not approved or endorsed by OpenCFD Ltd, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM trademark.https://www.openfoam.org/
openjdk/11.0.20.1_1
openjdk/17.0.8.1_1
The free and open-source java implementationhttps://jdk.java.net
openmpi/4.1.6-cuda-11
openmpi/4.1.6-cuda-12
An open source Message Passing Interface implementation.https://www.open-mpi.org
osu-micro-benchmarks/7.3-cuda-12
The Ohio MicroBenchmark suite is a collection of independent MPI message passing performance microbenchmarks developed and written at The Ohio State University. It includes traditional benchmarks and performance measures such as latency, bandwidth and host overhead and can be used for both traditional and GPU-enhanced nodes.https://mvapich.cse.ohio-state.edu/benchmarks/
parallel/20220522
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.https://www.gnu.org/software/parallel/
paraview/5.11.2-cuda-11
ParaView is an open-source, multi-platform data analysis and visualization application. This package includes the Catalyst in-situ library for versions 5.7 and greater, otherwise use the catalyst package.https://www.paraview.org
pbmpi/1.9-cuda-12
A Bayesian software for phylogenetic reconstruction using mixture modelshttps://github.com/bayesiancook/pbmpi
perl/5.38.0
Perl 5 is a highly capable, feature-rich programming language with over 27 years of development.https://www.perl.org
perl-list-moreutils/0.430
Provide the stuff missing in List::Utilhttps://metacpan.org/pod/List::MoreUtils
perl-uri/5.08
Uniform Resource Identifiers (absolute and relative)https://metacpan.org/pod/URI
petsc/3.20.1-complex
petsc/3.20.1-real
PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.https://petsc.org
pigz/2.7
A parallel implementation of gzip for modern multi-processor, multi-core machines.https://zlib.net/pigz/
pocl/5.0-cuda-12
Portable Computing Language (pocl) is an open source implementation of the OpenCL standard which can be easily adapted for new targets and devices, both for homogeneous CPU and heterogeneous GPUs/accelerators.https://portablecl.org
proj/9.2.1
PROJ is a generic coordinate transformation software, that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.https://proj.org/
psi4/1.8.2-cuda-12
Psi4 is an open-source suite of ab initio quantum chemistry programs designed for efficient, high-accuracy simulations of a variety of molecular properties.https://www.psicode.org/
py-mpi4py/3.1.4-cuda-12
This package provides Python bindings for the Message Passing Interface (MPI) standard. It is implemented on top of the MPI-1/MPI-2 specification and exposes an API which grounds on the standard MPI-2 C++ bindings.https://pypi.org/project/mpi4py/
py-nvitop/1.4.0
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.https://nvitop.readthedocs.io/
py-pip/23.1.2
The PyPA recommended tool for installing Python packages.https://pip.pypa.io/
py-reportseff/2.7.6
A python script for tabular display of slurm efficiency information.https://github.com/troycomi/reportseff
python/3.10.13
python/3.11.6
python/3.9.18
The Python programming language.https://www.python.org/
qt/5.15.11
Qt is a comprehensive cross-platform C++ application framework.https://qt.io
quantum-espresso/6.7-cuda-12
quantum-espresso/7.2-cuda-12
quantum-espresso/7.3.1-cuda-12
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.https://quantum-espresso.org
r/4.4.0
R is ‘GNU S’, a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.https://www.r-project.org
rclone/1.63.1
Rclone is a command line program to sync files and directories to and from various cloud storage providershttps://rclone.org
relion/4.0.1-cuda-12
RELION (for REgularised LIkelihood OptimisatioN, pronounce rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).https://www2.mrc-lmb.cam.ac.uk/relion
repeatmasker/4.1.5-cuda-12
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.https://www.repeatmasker.org
repeatmodeler/2.0.4-cuda-12
RepeatModeler is a de-novo repeat family identification and modeling package.https://github.com/Dfam-consortium/RepeatModeler
revbayes/1.1.1-cuda-12
Bayesian phylogenetic inference using probabilistic graphical models and an interpreted language.https://revbayes.github.io
rust/1.70.0
The Rust programming language toolchain.https://www.rust-lang.org
samtools/1.17
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position formathttps://www.htslib.org
scala/2.13.1
Scala is a general-purpose programming language providing support for functional programming and a strong static type system. Designed to be concise, many of Scala's design decisions aimed to address criticisms of Java.https://www.scala-lang.org/
scalasca/2.6.1-cuda-12
Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks - in particular those concerning communication and synchronization - and offers guidance in exploring their causes.https://www.scalasca.org
scorep/8.3-cuda-12
The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event tracing, and online analysis of HPC applications.https://www.vi-hps.org/projects/score-p
siesta/4.0.2-cuda-12
SIESTA performs electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids.https://departments.icmab.es/leem/siesta/
skopeo/0.1.40
skopeo is a command line utility that performs various operations on container images and image repositories.https://github.com/containers/skopeo
slepc/3.20.0-cuda-11
Scalable Library for Eigenvalue Problem Computations.https://slepc.upv.es
snakemake/7.22.0
Snakemake is an MIT-licensed workflow management system.https://snakemake.readthedocs.io/en/stable/
spack/0.21.2
Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, and many supercomputers. Spack is non-destructive: installing a new version of a package does not break existing installations, so many configurations of the same package can coexist.https://spack.io/
spark/3.1.1
Apache Spark is a fast and general engine for large-scale data processing.https://spark.apache.org
sqlite/3.43.2
SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.https://www.sqlite.org
squashfuse/0.5.0
squashfuse - Mount SquashFS archives using Filesystem in USErspace (FUSE)https://github.com/vasi/squashfuse
starccm/18.06.007
starccm/19.04.007
STAR-CCM+: Simcenter STAR-CCM+ is a multiphysics computational fluid dynamics (CFD) simulation software that enables CFD engineers to model the complexity and explore the possibilities of products operating under real-world conditions.https://plm.sw.siemens.com/en-US/simcenter/fluids-thermal-simulation/star-ccm/
subread/2.0.6
The Subread software package is a tool kit for processing next-gen sequencing data.https://subread.sourceforge.net/
subversion/1.14.2
Apache Subversion - an open source version control system.https://subversion.apache.org/
tcl/8.5.19
tcl/8.6.12
Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Open source and business-friendly, Tcl is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible.https://www.tcl.tk/
tk/8.6.11
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.https://www.tcl.tk
tkdiff/5.7
TkDiff is a graphical front end to the diff program. It provides a side-by-side view of the differences between two text files, along with several innovative features such as diff bookmarks, a graphical map of differences for quick navigation, and a facility for slicing diff regions to achieve exactly the merge output desired.https://tkdiff.sourceforge.io/
transabyss/2.0.1-cuda-12
de novo assembly of RNA-Seq data using ABySS
turbomole/7.6
turbomole/7.8.1
TURBOMOLE: Program Package for ab initio Electronic Structure Calculations.https://www.turbomole.org/
udunits/2.2.28
Automated units conversionhttps://www.unidata.ucar.edu/software/udunits
valgrind/3.20.0-cuda-12
An instrumentation framework for building dynamic analysis.https://valgrind.org/
vampir/10.4.1-cuda-12
Vampir and Score-P provide a performance tool framework with special focus on highly-parallel applications. Performance data is collected from multi-process (MPI, SHMEM), thread-parallel (OpenMP, Pthreads), as well as accelerator-based paradigms (CUDA, HIP, OpenCL, OpenACC).https://vampir.eu/
vampirserver/10.4.1
Vampir and Score-P provide a performance tool framework with special focus on highly-parallel applications. Performance data is collected from multi-process (MPI, SHMEM), thread-parallel (OpenMP, Pthreads), as well as accelerator-based paradigms (CUDA, HIP, OpenCL, OpenACC).https://vampir.eu/
vim/9.0.0045
Vim is a highly configurable text editor built to enable efficient text editing. It is an improved version of the vi editor distributed with most UNIX systems. Vim is often called a ‘programmer’s editor,’ and so useful for programming that many consider it an entire IDE. It’s not just for programmers, though. Vim is perfect for all kinds of text editing, from composing email to editing configuration files.https://www.vim.org
vmd/1.9.3-cuda-12
VMD provides user-editable materials which can be applied to molecular geometry.https://www.ks.uiuc.edu/Research/vmd/
xtb/6.6.0
Semiempirical extended tight binding program packagehttps://xtb-docs.readthedocs.org
xz/5.4.1
XZ Utils is free general-purpose data compression software with high compression ratio. XZ Utils were written for POSIX-like systems, but also work on some not-so-POSIX systems. XZ Utils are the successor to LZMA Utils.https://tukaani.org/xz/
zlib-ng/2.1.4
zlib replacement with optimizations for next generation systems.https://github.com/zlib-ng/zlib-ng
zsh/5.8
Zsh is a shell designed for interactive use, although it is also a powerful scripting language. Many of the useful features of bash, ksh, and tcsh were incorporated into zsh; many original features were added.https://www.zsh.org
zstd/1.5.5
Zstandard, or zstd as short version, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios.https://facebook.github.io/zstd/

SCC Modules (scc-lmod)

Warning

This software revision is no longer updated but can still be loaded and used by running the following command before any module commands:

export PREFERRED_SOFTWARE_STACK=scc-lmod
source /etc/profile

This software stack used to be the default on the SCC part of the HPC system until May 2025. This stack uses Lmod as its module system. For the purposes of setting the desired software stack (see Software Stacks), its short name is scc-lmod. You can learn more about how to use the module system at Module Basics.

To see the available software, run

module avail

The modules for this stack are built for several combinations of CPU architecture and connection fabric to support the various kinds of nodes in the cluster. The right module for the node is automatically selected during module load.

More information on this software stack can be found at GWDG Docs.

Adding Your Own Modules

See Using Your Own Module Files.

Spack

Spack is provided as the spack-user module to help build your own software.

Revisions

Several versions of the module system are provided, called revisions, so it is possible to use modules as they were in the past for compatibility.

module load rev/VERSION

where VERSION is the revision to use. For the revisions, VERSION takes the format YY.MM, representing the year (last two digits) and month the revision was made.
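
For example, to switch to the revision from December 2021 (one of the revisions shown in the listing below), you would run:

module load rev/21.12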

To see the available revisions, run the command module avail rev like

gwdu101:23 08:20:08 ~ > module avail rev

------------------------- Previous Software Revisions --------------------------
   rev/11.06    rev/20.12    rev/21.12    rev/23.12 (D)

-------------------------------- BIOINFORMATICS --------------------------------
   revbayes/1.0.11

  Where:
   D:  Default Module

Module defaults are chosen based on Find First Rules due to Name/Version/Version modules found in the module tree.
See https://lmod.readthedocs.io/en/latest/060_locating.html for details.

If the avail list is too long consider trying:

"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".

which shows the available revisions, as well as any other modules whose names begin with “rev”.

HLRN Modules (hlrn-tmod)

The old software stack of the NHR cluster, which is no longer the default. This software stack dates back to HLRN and was built in collaboration with ZIB (Zuse Institute Berlin) for their Lise cluster (another NHR center). For the purposes of setting the desired software stack (see Software Stacks), its short name is hlrn-tmod. It is currently in maintenance mode.

This stack uses Environment Modules, also known as Tmod, as its module system. You can learn more about how to use the module system at Module Basics.

Note

You can get a complete overview of the installed software with the command module avail. (Read more)

Note

The documentation here, while meant for the hlrn-tmod software stack NHR cluster, still applies reasonably well to the same software packages in the other software stacks.

Adding Your Own Modules

See Using Your Own Module Files.

Chemistry

  • exciting — a full-potential all-electron code, employing linearized augmented planewaves (LAPW) plus local orbitals (lo) as basis set.
  • GPAW — a density functional theory Python code based on the projector-augmented wave method.
  • GROMACS — a versatile package to perform molecular dynamics for systems with hundreds to millions of particles.
  • NAMD — a parallel, object-oriented molecular dynamics code designed for high-performance simulations of large biomolecular systems using force fields.
  • Octopus — a software package for density-functional theory (DFT), and time-dependent density functional theory (TDDFT)
  • Quantum ESPRESSO — an integrated suite of codes for electronic structure calculations and materials modeling at the nanoscale, based on DFT, plane waves, and pseudopotentials.
  • RELION — REgularised LIkelihood OptimisatioN is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy(cryo-EM)
  • VASP — a first-principles code for electronic structure calculations and molecular dynamics simulations in materials science and engineering.
  • Wannier90 — A program that calculates maximally-localised Wannier functions.
  • ASE – The Atomic Simulation Environment – a set of tools and Python modules for setting up, manipulating, running, visualizing, and analyzing atomistic simulations.
  • LAMMPS – A parallel, classical potential molecular dynamics code for solid-state materials, soft matter and coarse-grained or mesoscopic systems.
  • NWChem – A general computational chemistry code with capabilities from classical molecular dynamics to highly correlated electronic structure methods, designed to run on massively parallel machines.
  • MOLPRO – A comprehensive system of ab initio programs for advanced molecular electronic structure calculations.
  • PLUMED – A tool for trajectory analysis and plugin for molecular dynamics codes for free energy calculations in molecular systems
  • CPMD – A plane wave / pseudopotential implementation of DFT, designed for massively parallel ab-initio molecular dynamics.
  • BandUP – Band unfolding for plane wave based electronic structure calculations.

Data manipulation, tools and libraries

  • AEC library — Adaptive Entropy Coding library
  • CDO — The Climate Data Operators
  • ECCODES — ECMWF application programming interface
  • HDF5 Libraries and Binaries — HDF5 - hierarchical data format
  • libtiff — A software package containing a library for reading and writing _Tag Image File Format_(TIFF), and a small collection of tools for simple manipulations of TIFF images
  • NCO — The NetCdf Operators
  • netCDF — Network Common Data Form
  • Octave — GNU Octave, a high-level language primarily intended for numerical computations
  • pigz — A parallel implementation of gzip for modern multi-processor, multi-core machines
  • PROJ — Cartographic Projections Library
  • R — R - statistical computing and graphics
  • Szip — Szip, fast and lossless compression of scientific data
  • UDUNITS2 — Unidata UDUNITS2 Package, Conversion and manipulation of units
  • Boost – Boost C++ libraries
  • CGAL – The Computational Geometry Algorithms Library

Development tools, compilers, translators, languages, performance analysis

  • antlr — Another Tool for Language Recognition
  • Arm DDT — Parallel debugger, including MPI/OpenMP programs.
  • Charm++ — Parallel Programming Framework
  • Intel oneAPI Performance Tools — VTune, APS, Advisor, Inspector, Trace Analyzer and Collector
  • Julia — A high-level, high-performance, dynamic programming language
  • LIKWID Performance Tool Suite — Performance tools and library for GNU Linux operating system.
  • Patchelf — a simple utility for modifying existing ELF executables and libraries.
  • Valgrind instrumentation framework — an instrumentation framework for building dynamic analysis tools
  • VS Code — VS Code is an IDE that while not provided on the clusters, many users use on their own machines and connect into the clusters with.
  • CMake – CMake build environment
  • GCC – GNU Compiler Collection for C, C++, Fortran, Go, Objc, Objc++ and Lto
  • Intel oneAPI Performance Libraries – Integrated Performance Primitives (IPP), Collective Communications Library (CCL), Data Analytics Library (DAL), Deep Neural Network Library (DNNL), DPC++ Library (DPL), Math Kernel Library (MKL), Threading Building Blocks (TBB), Video Processing Library (VPL) (included in “intel” environment modules)

Engineering

Miscellaneous

  • libcurl — curl - a tool for transferring data from or to a server
  • libz — A Massively Spiffy Yet Delicately Unobtrusive Compression Library
  • nocache — nocache - minimize caching effects in lustre filesystems
  • texlive – LaTeX distribution, typesetting system
  • git – A fast, scalable, distributed revision control system

Numerics

  • BLAS — BLAS (Basic Linear Algebra Subprograms)
  • FFTW3 — A C-subroutine library for computing discrete Fourier transforms
  • GSL — The GNU Scientific Library (GSL)- a numerical library for C and C++ programmers
  • MUMPS — MUltifrontal Massively Parallel sparse direct Solver.
  • NFFT — Discrete Fourier transform (DFT) in one or more dimensions
  • ScaLAPACK — Scalable LAPACK
  • Scotch — Software package and libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning.
  • METIS – A set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices.
  • ParMETIS – An MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices.
  • PETSc – Portable, Extensible Toolkit for Scientific Computation: widely used parallel numerical software library for partial differential equations and sparse matrix computations.

Visualization

  • GraDS — An interactive desktop tool for easy access, manipulation, and visualization of earth science data
  • NCL
  • NcView — Ncview - a visual browser for netCDF formatted files.
  • pyfesom2 — Python library and tools for handling of FESOM2 ocean model output

Subsections of HLRN Modules (hlrn-tmod)

Devtools Compiler Debugger

  • antlr — Another Tool for Language Recognition
  • Arm DDT — Parallel debugger, including MPI/OpenMP programs.
  • Charm++ — Parallel Programming Framework
  • Intel oneAPI Performance Tools — VTune, APS, Advisor, Inspector, Trace Analyzer and Collector
  • Julia — A high-level, high-performance, dynamic programming language
  • LIKWID Performance Tool Suite — Performance tools and library for GNU Linux operating system.
  • Patchelf — a simple utility for modifying existing ELF executables and libraries.
  • Valgrind instrumentation framework — an instrumentation framework for building dynamic analysis tools
  • VS Code — VS Code is an IDE that while not provided on the clusters, many users use on their own machines and connect into the clusters with.
  • CMake – CMake build environment
  • GCC – GNU Compiler Collection for C, C++, Fortran, Go, Objc, Objc++ and Lto
  • Intel oneAPI Performance Libraries – Integrated Performance Primitives (IPP), Collective Communications Library (CCL), Data Analytics Library (DAL), Deep Neural Network Library (DNNL), DPC++ Library (DPL), Math Kernel Library (MKL), Threading Building Blocks (TBB), Video Processing Library (VPL) (included in “intel” environment modules)

Subsections of Devtools Compiler Debugger

antlr

Another Tool for Language Recognition

Description

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It’s widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees.

For HLRN-IV, only version 2.7.7 is installed. It is needed to install nco. The old downloads can be found here.

Versions

| Version | Compiler |
| --- | --- |
| 2.7.7 | intel/19.0.3 |

Usage

module load antlr 

enables the path to antlr.

Installation

The installation is carried out with autotools. The intel/19-compiler is used. The python interface is not built, since no python compatible with this compiler is available.

Arm DDT

Parallel debugger, including MPI/OpenMP programs.

Arm DDT (“DDT”) is a parallel debugger and part of the Arm Forge software suite. You can find more information at Arm DDT Get Started.

DDT is installed on HLRN-IV and is most easily used via a connection to a locally running GUI.

  1. Download and install the Arm Forge client for Mac/Windows/Linux for the same minor version as installed on the cluster (currently 20.1) at https://developer.arm.com/tools-and-software/server-and-hpc/downloads/arm-forge

  2. Open the installed program and adapt the settings for the HLRN installation: Click on Configure..., next click Add, then add a connection name: HLRN-IV Berlin or Göttingen, put the hostname of a login node which you can reach via your ssh setup, set the remote installation directory to /sw/tools/allinea/forge-20.1.3

  3. Adapt your job script to initiate a debugging session. We recommend using fewer than 8 nodes and, if possible, the testing queues. Also recompile your code with debugging information enabled (-g) and optimizations disabled (-O0) to avoid reordering of instructions. A complete sketch of such a job script is shown after this list.

    Load the module and add a workaround for recent slurm changes via

    module load forge/20.1.3
    export ALLINEA_DEBUG_SRUN_ARGS="%default% --oversubscribe"

    (see https://developer.arm.com/documentation/101136/2101/Appendix/Known-issues/SLURM-support?lang=en )

    in your job script and prefix your srun/mpirun call with

    ddt --connect srun myapplication.x
  4. Relaunch your local client; you should receive a “Reverse connection request”, which you accept. This starts your debugging session.
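
For reference, here is a minimal sketch of a complete debugging job script combining the steps above; the partition, node counts, and the application name myapplication.x are placeholders and need to be adapted to your case:

#!/bin/bash
#SBATCH -t 00:30:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=96
#SBATCH -p standard96:test

# code compiled beforehand with debugging information and without optimization,
# e.g.: mpicc -g -O0 mycode.c -o myapplication.x

module load forge/20.1.3
# workaround for recent slurm changes (see the link above)
export ALLINEA_DEBUG_SRUN_ARGS="%default% --oversubscribe"

# prefix the usual srun call so DDT can reverse-connect to your local GUI
ddt --connect srun ./myapplication.x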

You can also debug non-MPI programs as follows:

  1. Allocate nodes interactively (see Quickstart Guide)
  2. Locally launch the Forge GUI, select remote host, but manual program launch
  3. Press the help button in the “Waiting for you to start the job” dialog, this will show you the command to start your code on the node

Charm++

Parallel Programming Framework

General Information

Charm++ is a Parallel Programming Framework with adaptive runtime technique. Read more on the Charm++ home page.

For manual and user-guide visit the Charm++ documentation page.

Modules

| Version | Installation Path | Module | Compiler | Comment |
| --- | --- | --- | --- | --- |
| 6.8.2 | /sw/libraries/charm++/6.8.2/skl | charm++/6.8.2 | gcc_8.2-openmpi_3.1.2 | |
| 6.9.0 | /sw/tools/charm++/6.9.0/skl | charm++/6.9.0 | openmpi.3.1.5-gcc.9.2.0 | B |

Usage

Select the version and set the environment by loading the module file:

module load charm++/<version>

This sets the appropriate paths for using Charm++ Parallel Framework.

Compile the example code using Charm++, then request the parallel resources, and run the test

cd $EXAMPLE_PATH/charm++/jacobi2d-2d-decomposition
make
srun -p standard96:test -N2 --ntasks 10 --pty --interactive bash
./charmrun jacobi2d 10 10 +p10

Intel oneAPI Performance Tools

VTune, APS, Advisor, Inspector, Trace Analyzer and Collector
(for x86 based hardware: best for Intel, some AMD)

| Tool | Module | Use case |
| --- | --- | --- |
| VTune (Application Performance Snapshot, Profiler) | vtune/ | high level - initial overview; low level - detailed performance analysis (e.g. hotspots, bottlenecks) |
| Advisor | advisor/ | low level - threading and vectorization aid (e.g. roofline analysis) |
| Inspector | inspector/ | low level - memory & threading error checking (e.g. find leaks or data races) |
| Trace Analyzer and Collector | itac/ | low level - identify MPI load imbalances and communication hotspots |

Subsections of Intel oneAPI Performance Tools

VTune Profiler

Quick performance evaluation with VTune APS or detailed hotspot, memory, or threading analysis with VTune profiler.

First load the environment module:

module load vtune/XXXX

Intro:
Presentation
Getting started

Manuals:
Intel_APS.pdf
VTune_features_for_HPC.pdf

vtune -help

Run VTune via the command line interface

Run your application with VTune wrapper as follows (Intel docs):

mpirun -np 4 aps -collect hotspots advanced-hotspots ./path-to_your/app.exe args_of_your_app

# after completion, explore the results:
aps-report aps_result_*
mpirun -np 4 vtune -collect hotspots -result-dir vtune_hotspot ./path-to_your/app.exe args_of_your_app
 
# after completion, explore the results:
vtune -report summary -r vtune_*

Login with x-window support (ssh -X) and then start

vtune-gui

Login to the supercomputer with local port forwarding and start your VTune server on an exclusive compute node (1h job):

ssh -L 127.0.0.1:55055:127.0.0.1:55055 glogin.hlrn.de
salloc -p standard96:test -t 01:00:00
ssh -L 127.0.0.1:55055:127.0.0.1:55055 $SLURM_NODELIST
module add intel/19.0.5 impi/2019.9 vtune/2022
vtune-backend --web-port=55055 --enable-server-profiling &

Open 127.0.0.1:55055 in your browser (allow security exception, set initial password).

In 1st “Welcome” VTune tab (run MPI parallel Performance Snapshot):

  • Click: Configure Analysis

    • Set application: /path-to-your-application/program.exe
    • Check: Use app. dir. as work dir.
    • In case of MPI parallelism, expand “Advanced”: keep defaults but paste the following wrapper script and check “Trace MPI”:
    #!/bin/bash
    
    # Run VTune collector (here with 4 MPI ranks)
    mpirun -np 4 "$@"
  • Under HOW, run: Performance Snapshot. (After completion/result finalization a 2nd result tab opens automatically.)

In 2nd “r0…” VTune tab (explore Performance Snapshot results):

  • Here you find several analysis results e.g. the HPC Perf. Characterization.
  • Under Performance Snapshot - depending on the snapshot outcome - VTune suggests (see % in the figure below) more detailed follow-up analysis types:

(Figure: VTune Profiler - suggested follow-up analyses)

  • For example select/run a Hotspot analysis:

In 3rd “r0…” VTune tab (Hotspot analysis):

  • Expand sub-tab Top-down Tree
    • In Function Stack expand the “_start” function and expand further down to the “main” function (first with an entry in the source file column)
    • In the source file column double-click on “filename.c” of the “main” function
  • In the new sub-tab “filename.c” scroll down to the line with maximal CPU Time: Total to find hotspots in the main function

To quit the debug session press “Exit” in the VTune “Hamburger Menu” (upper left symbol of three horizontal bars). Then close the browser page. Exit your compute node via CTRL+D and kill your interactive job:

squeue -l -u $USER
scancel <your-job-id>

VTune Profiler (extended workshop version)

Quick performance evaluation with VTune APS or detailed hotspot, memory, or threading analysis with VTune profiler.

First load the environment module:

module load vtune/XXXX

Intro:
Presentation
Getting started

Manuals:
Intel_APS.pdf
VTune_features_for_HPC.pdf

vtune -help

Run VTune via the command line interface

Run your application with VTune wrapper as follows (Intel docs):

mpirun -np 4 aps -collect hotspots advanced-hotspots ./path-to_your/app.exe args_of_your_app

# after completion, explore the results:
aps-report aps_result_*
mpirun -np 4 vtune -collect hotspots -result-dir vtune_hotspot ./path-to_your/app.exe args_of_your_app
 
# after completion, explore the results:
vtune -report summary -r vtune_*

Login with x-window support (ssh -X) and then start

vtune-gui

Login to the supercomputer with local port forwarding and start your VTune server on an exclusive compute node (1h job):

ssh -L 127.0.0.1:55055:127.0.0.1:55055 glogin.hlrn.de
salloc -p standard96:test -t 01:00:00
ssh -L 127.0.0.1:55055:127.0.0.1:55055 $SLURM_NODELIST
module add intel/19.0.5 impi/2019.9 vtune/2022
vtune-backend --web-port=55055 --enable-server-profiling &

Open 127.0.0.1:55055 in your browser (allow security exception, set initial password).

In 1st “Welcome” VTune tab (run MPI parallel Performance Snapshot):

  • Click: Configure Analysis
    • Set application: /path-to-your-application/program.exe
    • Check: Use app. dir. as work dir.
    • In case of MPI parallelism, expand “Advanced”: keep defaults but paste the following wrapper script and check “Trace MPI”:

    #!/bin/bash
    
    # Run VTune collector (here with 4 MPI ranks)
    mpirun -np 4 "$@"
  • Under HOW, run: Performance Snapshot.
    (After completion/result finalization a 2nd result tab opens automatically.)

In 2nd “r0…” VTune tab (explore Performance Snapshot results):


  • Here you find several analysis results e.g. the HPC Perf. Characterization.
  • Under Performance Snapshot - depending on the snapshot outcome - VTune suggests (see % in the figure below) more detailed follow-up analysis types:

(Figure: VTune Profiler - suggested follow-up analyses)

  • For example select/run a Hotspot analysis:

In 3rd “r0…” VTune tab (Hotspot analysis):


  • Expand sub-tab Top-down Tree
    • In Function Stack expand the “_start” function and expand further down to the “main” function (first with an entry in the source file column)
    • In the source file column double-click on “filename.c” of the “main” function
  • In the new sub-tab “filename.c” scroll down to the line with maximal CPU Time: Total to find hotspots in the main function

To quit the debug session press “Exit” in the VTune “Hamburger Menu” (upper left symbol of three horizontal bars). Then close the browser page. Exit your compute node via CTRL+D and kill your interactive job:

squeue -l -u $USER
scancel <your-job-id>

Example

Performance optimization of a simple MPI parallel program
First, download the example C code:
/*
 Matthias Laeuter, Zuse Institute Berlin
 v00 24.03.2023
 Lewin Stein, Zuse Institute Berlin
 v01 26.04.2023

 Compile example: mpiicc -cc=icx -std=c99 prog.c
*/
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <mpi.h>
#include <math.h>

#define DIMENSION 1

int main(int argc, char *argv[])
{
  const int rep = 10;
  int size;
  int rank;
  int xbegin, xend;

  MPI_Init(NULL,NULL);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // read input data
  int dofnum[DIMENSION];
  if( argc < 2 )
    {
      for(int i=0; i<DIMENSION; i++) dofnum[i] = 10000000;
    }
  else
    {
      assert( argc == DIMENSION + 1 );
      for(int i=0; i<DIMENSION; i++) dofnum[i] = atoi(argv[i+1]);  
    }
  const double dx = 1. / (double) dofnum[0];

  // domain decomposition
  xbegin = (dofnum[0] / size) * rank;
  xend = (dofnum[0] / size) * (rank+1);
  if( rank+1 == size ) xend = dofnum[0];

  // memory allocation
  const int ldofsize = xend-xbegin;
  double* xvec = malloc(ldofsize * sizeof(double));
  for(int d=0; d<ldofsize; d++) xvec[d] = 0.0;

  // integration
  size_t mem = 0;
  size_t flop = 0;
  const double start = MPI_Wtime();
  double gsum;
  for(int r=0; r<rep; r++) {
    for(int i=xbegin; i<xend; i++) {
      const double x = dx * i;
      const int loci = i-xbegin;
      xvec[loci] = sin(x);
    }

    double sum = 0.;
    for(int d=0; d<ldofsize; d++)
      sum += xvec[d] * dx;
    MPI_Reduce(&sum,&gsum,1,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD);
  }
  const double reptime = MPI_Wtime() - start;
  const double deltatime = reptime / rep;

  mem += ldofsize * sizeof(double);
  flop += 2*ldofsize;

  // memory release
  free( xvec );

  // profiling
  double maxtime = 0.0;
  MPI_Reduce(&deltatime,&maxtime,1,MPI_DOUBLE,MPI_MAX,0,MPI_COMM_WORLD);
  const double locgb = ( (double) mem ) / 1024. / 1024. / 1024.;
  double gb = 0.0;
  MPI_Reduce(&locgb,&gb,1,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD);
  const double locgflop = ( (double) flop ) / 1024. / 1024. / 1024.;
  double gflop = 0.0;
  MPI_Reduce(&locgflop,&gflop,1,MPI_DOUBLE,MPI_SUM,0,MPI_COMM_WORLD);

  // output
  if( rank == 0 ) {
    printf("ranks=%i\n",size);
    printf("int_0^1sinX=%.10f\n",1.0-cos(1.0));
    printf("integral=%.10f dofnum=%i ",gsum,dofnum[0]);
    printf("nt=%i T=%0.4e GB=%0.4e Gflops=%0.4e\n",size,maxtime,gb,gflop/maxtime);
  }

  MPI_Finalize();

  return 0;
}
# gnu compilation (the -lm flag links the math library)
ml -impi -intel gcc/9.3.0 openmpi/gcc.9/4.1.4
mpicc -lm -g -O0 integration.c -o prog.bin
mpicc -lm -g -O2 integration.c -o progAVX.bin
 
# intel compilation
ml -openmpi -gcc intel/2022.2.1 impi/2021.7.1
mpiicc -cc=icx -std=c99 -g -O0 integration.c -o iprog.bin
mpiicc -cc=icx -std=c99 -g -O2 integration.c -o iprogAVX.bin
mpiicc -cc=icx -std=c99 -g -O2 -xHost integration.c -o iprogAVXxHost.bin

Test run with MPI_Wtime() timings:

| export & module load (environment) | executed (on standard96) | reference (min. s) | reference (max. s) |
| --- | --- | --- | --- |
| gcc/9.3.0 openmpi/gcc.9/4.1.4 | mpirun -n 4 ./prog.bin | 0.072 | 0.075 |
| | mpirun -n 4 ./progAVX.bin | 0.040 | 0.043 |
| intel/2022.2.1 impi/2021.7.1 | mpirun -n 4 ./iprog.bin | 0.034 | 0.039 |
| intel/2022.2.1 impi/2021.7.1 I_MPI_DEBUG=3 | mpirun -n 4 ./iprog.bin | 0.034 | 0.039 |
| | mpirun -n 4 ./iprogAVX.bin | 0.008 | 0.011 |
| intel/2022.2.1 impi/2021.7.1 I_MPI_DEBUG=3 SLURM_CPU_BIND=none I_MPI_PIN_DOMAIN=core | I_MPI_PIN_ORDER=scatter mpirun -n 4 ./iprogAVX.bin | 0.009 | 0.011 |
| | I_MPI_PIN_ORDER=compact mpirun -n 4 ./iprogAVX.bin | 0.009 | 0.009 (0.011) |
| | I_MPI_PIN_ORDER=compact mpirun -n 4 ./iprogAVXxHost.bin | 0.0058 | 0.0060 (0.0067) |
| intel/2022.2.1 impi/2021.7.1 I_MPI_DEBUG=3 | mpirun -n 4 ./iprogAVXxHost.bin | 0.0052 | 0.0066 |

The process-to-core pinning during MPI startup is printed because of I_MPI_DEBUG level ≥3. During runtime, the core load can be watched live with htop.

VTune’s Performance Snapshot:

| export & module load (environment) | executed (on standard96) | vectorization | threading (e.g. OpenMP) | memory bound |
| --- | --- | --- | --- | --- |
| intel/2022.2.1 impi/2021.7.1 vtune/2022 | mpirun -n 4 ./iprog.bin | 0.0% | 1.0% | 33.5% |
| | mpirun -n 4 ./iprogAVX.bin | 97.3% | 0.9% | 44.4% |
| | mpirun -n 4 ./iprogAVXxHost.bin | 99.9% | 0.8% | 49.0% |
| intel/2022.2.1 impi/2021.7.1 vtune/2022 SLURM_CPU_BIND=none I_MPI_PIN_DOMAIN=core | I_MPI_PIN_ORDER=compact mpirun -n 4 ./iprogAVXxHost.bin | 99.9% | 0.4% | 6.3% |

Julia

A high-level, high-performance, dynamic programming language

Description

Julia is a high-level, high-performance, dynamic programming language. While it is a general-purpose language and can be used to write any application, many of its features are well suited for numerical analysis and computational science. (Wikipedia)

Read more on the Julia home page

Licensing Terms and Conditions

Julia is distributed under the MIT license. Julia is free for everyone to use and all source code is publicly available on GitHub.

Julia at HLRN

Modules

Currently only version 1.7.2 is installed, so before starting Julia, load the corresponding module:

$ module load julia

Running Julia on the frontends

This is possible, but resources and runtime are limited. Be friendly to other users and work on the (shared) compute nodes!

Running Julia on the compute nodes

Allocate capacity in the batch system, and log onto the related node:

$ salloc -N 1 -p large96:shared
$ squeue --job <jobID>

The output of salloc shows your job ID. With squeue you see the node you are going to use. Login with X11-forwarding:

$ ssh -X <nodename>

Load a module file and work interactively as usual. When ready, free the resources:

$ scancel <jobID>

You may also use srun:

$ srun -v -p large96:shared --pty --interactive bash

Do not forget to free the resources when ready.

Julia packages

Packages can be installed via Julia’s package manager into your local depot. However, for some HPC-relevant packages, the following points need to be observed to make them run correctly on the HLRN-IV systems:

MPI.jl

impi/2018.5 (supported)

By default, MPI.jl will download and link against its own MPICH implementation. On the HLRN-IV systems, we advise using the Intel MPI implementation, as we have found some serious problems with the Open MPI implementation in conjunction with multithreading.

Therefore, the Julia module already sets some environment variables under the assumption that the impi/2018.5 module is used (both for MPI.jl versions 0.19 or earlier and for newer versions using the MPIPreferences system).

To add the MPI.jl package to your depot follow these steps:

$ module load impi/2018.5
$ module load julia 
$ julia -e 'using Pkg; Pkg.add("MPIPreferences"); using MPIPreferences; MPIPreferences.use_system_binary(); Pkg.add("MPI")'  

You can test that the correct version is used via

$ julia -e 'using MPI; println(MPI.MPI_LIBRARY_VERSION_STRING)'

The result should be “Intel(R) MPI Library 2018 Update 5 for Linux* OS”

As Intel MPI comes with its own pinning policy, please add “export SLURM_CPU_BIND=none” to your batch scripts.
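
A minimal batch script sketch tying these pieces together might look as follows; the rank count and the script name hello_mpi.jl are placeholders:

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH --ntasks-per-node=4
#SBATCH -p standard96:test

module load impi/2018.5
module load julia

# Intel MPI applies its own pinning policy, so disable Slurm's CPU binding
export SLURM_CPU_BIND=none

mpirun -np 4 julia hello_mpi.jl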

Other MPI implementations (unsupported)

There is no direct dependency on impi/2018.5 in Julia’s module file, so if needed you can adjust the environment to a different configuration before building and loading the MPI.jl package. Please refer to the MPI.jl documentation for details.

HDF5.jl

For the HDF5.jl package, one of the HDF5 modules provided by the HLRN-IV systems should also be used. After loading an HDF5 module, copy the HDF5_ROOT environment variable to JULIA_HDF5_PATH:

$ export JULIA_HDF5_PATH=$HDF5_ROOT
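
A short sketch of the full sequence, assuming a suitable HDF5 module is available (the exact module name and version depend on the system; check module avail hdf5):

$ module load hdf5
$ module load julia
$ export JULIA_HDF5_PATH=$HDF5_ROOT
$ julia -e 'using Pkg; Pkg.add("HDF5")'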

LIKWID Performance Tool Suite

Performance tools and library for GNU Linux operating system.

Description

LIKWID is an easy-to-use yet powerful command line performance tool suite for the GNU/Linux operating system. While LIKWID initially supported only x86 processors, it has been ported to ARM (including Fujitsu A64FX) and POWER8/9 architectures as well as to NVIDIA GPGPUs. Read more on the LIKWID developers’ page.

| Version | Installation Path | modulefile | compiler | sites | comment |
| --- | --- | --- | --- | --- | --- |
| 5.2.0 | /sw/tools/likwid/5.2.0 | likwid/5.2.0 | gcc/9.3.0 | Emmy, Lise | using accessdaemon (requires “likwid” group) |
| 5.2.1 | /sw/tools/likwid/5.2.1 | likwid/5.2.1 | gcc | Emmy, Lise | using perf_event API |

For the user guide visit hpc.fau.de.

Prerequisites

To use LIKWID, please send a request to nhr-support@gwdg.de to be included in the likwid user-group.

Modules

Selecting the version and loading the environment

Load the modulefile

$ module load likwid/5.2.0

This sets the appropriate paths for using LIKWID tool suite.

Example usage

blogin6:~ $ srun --pty -N1 -pstandard96:test /bin/bash -ls
bcn1021:~ $ module add likwid/5.2.0
Module for likwid 5.2.0 loaded.
...
bcn1021:~ $ likwid-perfctr -g CLOCK /bin/sleep 1
--------------------------------------------------------------------------------
CPU name:       Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz
CPU type:       Intel Cascadelake SP processor
CPU clock:      2.29 GHz
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Group 1: CLOCK
+-----------------------+---------+------------+-------------+
|         Event         | Counter | HWThread 0 | HWThread 96 |
+-----------------------+---------+------------+-------------+
|   INSTR_RETIRED_ANY   |  FIXC0  |       3665 |        8094 |
| CPU_CLK_UNHALTED_CORE |  FIXC1  |      21830 |       43400 |
|  CPU_CLK_UNHALTED_REF |  FIXC2  |      21528 |       47656 |
|     PWR_PKG_ENERGY    |   PWR0  |    47.1188 |           0 |
|      UNCORE_CLOCK     | UBOXFIX | 2403448642 |           0 |
+-----------------------+---------+------------+-------------+
...

Patchelf

a simple utility for modifying existing ELF executables and libraries.

Description

PatchELF is a simple utility for modifying existing ELF executables and libraries. In particular, it can do the following:

  • Change the dynamic loader (“ELF interpreter”) of executables
  • Change the RPATH of executables and libraries
  • Shrink the RPATH of executables and libraries

Read more in the The github repository or the patchelf man page

Modules

module load patchelf
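
A few typical invocations are sketched below; the binary name myprog and the RPATH value are placeholders:

# show the current RPATH of a binary
patchelf --print-rpath ./myprog
# set a new RPATH
patchelf --set-rpath /path/to/lib ./myprog
# remove RPATH entries that are not actually needed
patchelf --shrink-rpath ./myprog
# change the dynamic loader ("ELF interpreter")
patchelf --set-interpreter /lib64/ld-linux-x86-64.so.2 ./myprog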

Valgrind instrumentation framework

an instrumentation framework for building dynamic analysis tools

Description

The Valgrind distribution currently includes six production-quality tools: a memory error detector, two thread error detectors, a cache and branch-prediction profiler, a call-graph generating cache and branch-prediction profiler, and a heap profiler. It also includes three experimental tools: a stack/global array overrun detector, a second heap profiler that examines how heap blocks are used, and a SimPoint basic block vector generator

Read more on Valgrind home page

| Version | Installation Path | modulefile | compiler | comment |
| --- | --- | --- | --- | --- |
| 3.14 | /sw/tools/valgrind/3.14.0/skl/gcc.8.2.0 | valgrind/3.14.0 | gcc.8.2-openmpi.3.1.2 | |
| 3.15 | /sw/tools/valgrind/3.15.0/skl/openmpi.3.1.5-gcc.9.2.0 | valgrind/3.15.0 | gcc.9.2-openmpi.3.1.5 | B |

For User Manual visit the Valgrind User Manual

Prerequisites

No group membership or license is needed. Valgrind can be used by all HLRN users by default.

Modules

Selecting the version and loading the environment

Load the modulefile

$ module load valgrind/<version>

This sets the appropriate paths for using Valgrind framework.

Example usage

$ valgrind ls -l
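
Further illustrative examples with your own program (the name ./myprog.x is a placeholder):

# memory error detection with detailed leak information
$ valgrind --leak-check=full ./myprog.x
# cache and branch-prediction profiling with the cachegrind tool
$ valgrind --tool=cachegrind ./myprog.x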

VS Code

VS Code is an IDE that while not provided on the clusters, many users use on their own machines and connect into the clusters with.

Description

Visual Studio Code (VS Code) and the completely free software version VSCodium (their relationship is like Chromium to Chrome) are commonly used IDEs that many users run on their own machines for development and that are capable of SSH-ing into other machines for remote operation, editing, and debugging. While neither is provided on the clusters, many users edit files, run their codes, and debug their codes on the clusters via VS Code or VSCodium running on their own machines. This page is here to point out how to do certain things and avoid certain pitfalls.

Modules

None. Users run it on their own machines.

Connecting to Singularity/Apptainer Containers

The following is a guide on how to actively develop in an Apptainer/Singularity container. See Apptainer for more information on the apptainer module. Both Singularity and Apptainer are largely compatible with each other, and in fact you run the container with the singularity command regardless of which module you use.

module load SINGULARITY_OR_APPTAINER/VERSION

This guide was contributed by our GPU users Anwai Archit and Arne Nix who kindly provided this documentation. It is lightly edited to fit the format of this page and fix a few typos. Any place you see “singularity”, you could replace it with “apptainer” if you use the apptainer module instead. It was written for the Grete GPU nodes of Emmy, but can be easily translated to other partitions/clusters (see GPU Usage for more information). Obviously, rename any directories and files as makes sense for your user name, the SIF container file you use, and the names of your files and directories.

Starting a Singularity Container

First we need to setup a singularity container and submit it to run on a GPU node. For me this is done by the following SBATCH script:

#SBATCH --job-name=anix_dev-0               # Name of the job
#SBATCH --ntasks=1                          # Number of tasks
#SBATCH --cpus-per-task=2                   # Number of CPU cores per task
#SBATCH --nodes=1                           # Ensure that all cores are on one machine
#SBATCH --time=0-01:00                      # Runtime in D-HH:MM
#SBATCH --mem-per-cpu=3000                  # Memory pool for all cores (see also --mem-per-cpu)
#SBATCH --output=logs/anix_dev-0.%j.out     # File to which STDOUT will be written
#SBATCH --error=logs/anix_dev-0.%j.err      # File to which STDERR will be written
#SBATCH --mail-type=ALL                     # Type of email notification- BEGIN,END,FAIL,ALL
#SBATCH --mail-user=None                    # Email to which notifications will be sent
#SBATCH -p gpu                              # Partition to submit to
#SBATCH -G 1                                # Number of requested GPUs
 
 
export SRCDIR=$HOME/src
export WORKDIR=$LOCAL_TMPDIR/$USER/$SLURM_JOB_ID
mkdir -p $WORKDIR
mkdir -p $WORKDIR/tmp_home
 
module load singularity
module load cuda/11.2
scontrol show job $SLURM_JOB_ID  # print some info
 
singularity instance start --nv --env-file xaug/.env --no-home --bind  $WORKDIR/tmp_home:$HOME,$HOME/.vscode-server:$HOME/.vscode-server,$SRCDIR:/src,$WORKDIR:/work xaug_image.sif anix_dev-0 /src/xaug/run_dev.sh
sleep infinity

Important here are four things:

  1. We need to load cuda and singularity to have it available to our container.
  2. We need to bind $HOME/.vscode-server to the same place in the container.
  3. We need to remember the name of our container. In this case: anix_dev-0
  4. We need to keep the script running in order to not lose the node. This is achieved by sleep infinity.

SSH Config to Connect to the Container

We want to connect to the container via ssh. For this, set up the following configuration in ~/.ssh/config on your local machine.

Host hlrn
    User <your_username>
    HostName glogin.hlrn.de
    IdentityFile ~/.ssh/<your_key>
 
Host hlrn-*
    User <your_username>
    IdentityFile ~/.ssh/<your_key>
    Port 22
    ProxyCommand ssh $(echo %h | cut -d- -f1) nc $(echo %h | cut -d- -f2) %p
 
Host hlrn-*-singularity
    User <your_username>
    IdentityFile ~/.ssh/<your_key>
    RequestTTY force
    Port 22
    ProxyCommand ssh $(echo %h | cut -d- -f1) nc $(echo %h | cut -d- -f2) %p
    RemoteCommand module load singularity; singularity shell --env HTTP_PROXY="https://www-cache.gwdg.de:3128",HTTPS_PROXY="https://www-cache.gwdg.de:3128" instance://<container_name>

This enables three different connections from your local machine:

  1. Connection to the login node: ssh hlrn
  2. Connection to a compute node that we obtained through the scheduler, e.g. ssh hlrn-ggpu02
  3. Connection to the singularity container running on a compute node, e.g. ssh hlrn-ggpu02-singularity

Connecting VS-Code to the Container

This mostly follows the tutorial here. Then add the following lines to your VS Code settings (settings.json):

"remote.SSH.enableRemoteCommand": true,
"remote.SSH.useLocalServer": true,

Now remote connections should be possible. Before we can connect to the individual cluster nodes, we first need to initialize the vscode-server on the login nodes. For this we press Ctrl+Shift+P, enter Remote-SSH: Connect to Host and select hlrn. This should (after typing in the passphrase of your private key) connect our VS Code to the login node. At the same time, the vscode-server is installed in your home directory on the cluster. Additionally, you should go into the extensions and install all extensions (e.g. Python) that you need on the cluster. These two steps cannot be done on the compute nodes, so it is important to do them on the login node beforehand. Finally, we can close the connection to the login node and connect to the compute node that has the singularity container running on it. This works in the same way as the connection to the login node, but instead of hlrn, we select hlrn-<your_node>-singularity.

Subsections of Legacy Applications

Abaqus

Warning

Our Abaqus license will run out on April 30th, 2024. You will only be able to resume working with Abaqus products if you can bring your own license (see How to bring your own license). Alternatively, you might consider using other Finite Element Analysis (FEA) tools such as Mechanical or LS-DYNA from Ansys.

A Finite Element Analysis Package for Engineering Application

To see our provided versions type: module avail abaqus

ABAQUS 2019 is the default. ABAQUS 2018 is the first version with multi-node support. ABAQUS 2016 is the last version including Abaqus/CFD.

Documentation

Note

To access the official documentation (starting with version 2017) you can register for free at: https://www.3ds.com/support/

Conditions for Usage and Licensing

Access to and usage of the software is regionally limited:

  • Only users from Berlin (account names “be*”) can use the ZIB license on NHR@ZIB systems. This license is strictly limited to teaching and academic research for non-industry funded projects only. Usually, there are always sufficient licenses for Abaqus/Standard and Abaqus/Explicit command-line based jobs. You can check this yourself (just in case):
    # on NHR@ZIB systems
    lmutil lmstat -S -c 1700@10.241.101.140 | grep -e "ABAQUSLM:" -e "Users of abaqus" -e "Users of parallel" -e "Users of cae"
  • Users from other german states can use the software installed on HLRN but have to use their own license from their own license server (see How to bring your own license).

Example Jobscripts

The input file of the test case (Large Displacement Analysis of a linear beam in a plane) is: c2.inp

Distributed Memory Parallel Processing

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=2 
#SBATCH --ntasks-per-node=48
#SBATCH -p standard96:test
#SBATCH --mail-type=ALL
#SBATCH --job-name=abaqus.c2
 
module load abaqus/2020
 
# host list:
echo "SLURM_NODELIST:  $SLURM_NODELIST"
create_abaqus_hostlist_for_slurm
# This command will create the file abaqus_v6.env for you.
# If abaqus_v6.env exists already in the case folder, it will append the line with the hostlist.
 
### ABAQUS parallel execution
abq2019 analysis job=c2 cpus=${SLURM_NTASKS} standard_parallel=all mp_mode=mpi interactive double
 
echo '#################### ABAQUS finished ############'

SLURM logs to: slurm-<your job id>.out

The log of the solver is written to: c2.msg

Warning

The small number of elements in this example does not allow using 2x96 cores; hence, 2x48 are utilized here. Typically, if there is sufficient memory per core, we recommend using all physical cores per node (in the case of standard96: #SBATCH --ntasks-per-node=96). Please refer to Compute node partitions to see the number of cores on your selected partition and machine (Lise, Emmy).

Single Node Processing

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=1  ## 2016 and 2017 do not run on more than one node
#SBATCH --ntasks-per-node=96
#SBATCH -p standard96:test
#SBATCH --job-name=abaqus.c2
 
module load abaqus/2016
 
# host list:
echo "SLURM_NODELIST:  $SLURM_NODELIST"
create_abaqus_hostlist_for_slurm
# This command will create the file abaqus_v6.env for you.
# If abaqus_v6.env exists already in the case folder, it will append the line with the hostlist.
 
### ABAQUS parallel execution
abq2016 analysis job=c2 cpus=${SLURM_NTASKS} standard_parallel=all mp_mode=mpi interactive double
 
echo '#################### ABAQUS finished ############'

If you cannot set up your case input files *.inp by other means you may start a CAE GUI as a last resort on our compute nodes. But be warned: to keep fast/small OS images on the compute node there is a minimal set of graphic drivers/libs only; X-window interactions involve high latency. If you comply with our license terms (discussed above) you can use one of our four CAE licenses. In this case, please always add

#SBATCH -L cae

to your job script. This ensures that the SLURM scheduler starts your job only if a CAE license is available.

srun -p standard96:test -L cae --x11 --pty bash
 
# wait for node allocation (a single node is the default), then run the following on the compute node
 
module load abaqus/2022
abaqus cae -mesa

AEC library

Adaptive Entropy Coding library

Description

Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). The library achieves best results for low-entropy data as often encountered in space imaging instrument data or numerical model output from weather or climate simulations. The library replaces the Szip library, whose usage was limited by copyright. Read more.

The library is used by the HDF5 library and the ecCodes tools.

Modules

Loading the module defines PATH to a binary aec for data compression of single files. See aec --help for details.

LD_RUN_PATH, LIBRARY_PATH and similar shell variables are defined to support linking the aec library. See details on available versions with module avail aec.
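
As an illustration, linking your own code against the library might look like this (the program name is a placeholder):

module load aec
# compile and link a small test program against libaec
gcc -O2 -o aec_test aec_test.c -laec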

Installation

After unpacking, autotools have to be enabled. The aec library is built for the intel and gnu compilers.

Subsections of AEC library

Install AEC GNU

module load gcc/9.3.0

export COMPILER=gcc.9.3.0
export CC=gcc
export CXX=g++
export FC=gfortran

#export SYS=OS_15.3

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/aec/1.0.6/skl/$COMPILER
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export CFLAGS="  -O3 -fPIC"
export CXXFLAGS="-O3 -fPIC"
export FCFLAGS=" -O3 -fPIC"
export LDFLAGS="-O3 -fPIC"
../libaec-v1.0.6/configure --prefix=$PREFIX --libdir=$PREFIX/lib64

echo "Press ENTER to run make";read aaa
make -j2
echo "Press ENTER to run make check";read aaa
make check 2>&1 | tee check.out
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Install AEC Intel

module load intel/2022.2

export COMPILER=intel.22
export CC=icc
export CXX=icpc
export F77=ifort
export FC=ifort
export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -xskylake-avx512 -mtune=skylake-avx512 -mcmodel=medium -fno-alias -align -fp-model precise -shared-intel"

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/aec/1.0.6/skl/$COMPILER
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export LD_RUN_PATH=$LIBRARY_PATH
export CFLAGS="  $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export LDFLAGS="-O3 -fPIC"
../libaec-v1.0.6/configure --prefix=$PREFIX --libdir=$PREFIX/lib64

echo "Press ENTER to run make";read aaa
make -j2
echo "Press ENTER to run make check";read aaa
make check 2>&1 | tee check.out
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

BLAS

BLAS (Basic Linear Algebra Subprograms)

Description

The BLAS (Basic Linear Algebra Subprograms) are routines that provide standard building blocks for performing basic vector and matrix operations.

For more information visit BLAS home page.

BLAS is currently available in different modules in HLRN:

  • Intel MKL (via intel/mkl module)
  • OpenBLAS

Both provide highly optimized BLAS routines. Additionally, there is a (slow) reference LAPACK in /usr/lib64.

Modules

| Version | Installation Path | modulefile | compiler |
| --- | --- | --- | --- |
| 0.3.7 | /sw/numerics/openblas/0.3.7/skl | openblas/gcc.7/0.3.7 | gcc/7.5.0 |
| 0.3.7 | /sw/numerics/openblas/0.3.7/skl | openblas/gcc.8/0.3.7 | gcc/8.3.0 |
| 0.3.7 | /sw/numerics/openblas/0.3.7/skl | openblas/gcc.9/0.3.7 | gcc/9.2.0 |
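
As an illustration, a program can be linked against OpenBLAS after loading the matching compiler and library modules (the program name is a placeholder):

module load gcc/9.2.0
module load openblas/gcc.9/0.3.7
# link a small test program against OpenBLAS
gcc -O2 -o dgemm_test dgemm_test.c -lopenblas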

CDO

The Climate Data Operators

General Information

Vendor: MPI Hamburg
Installation Path: /sw/dataformats/cdo/<version>

| Version | build date | compiler | remark |
| --- | --- | --- | --- |
| 1.9.6 | 04/2019 | intel-19 | |
| 1.9.8 | 08/2020 | intel-18 | |
| 2.0.5 | 04/2022 | gcc-8 | experimental |
| 2.2.1 | 07/2023 | gcc-9 | aec support, cdi |

Please find the manual in the installation directory and visit the vendor's website for more information.

Usage of the CDO Tools at HLRN

Manuals

Please find the manual in the installation directory and visit the vendor's website for more information.

Module-files and environment variables

To activate the package, issue

module load cdo

from the command line or put this line into your login shell.

To see the versions of the linked libraries issue

cdo --version

See the supported operations with

cdo --help
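
As a small illustration, two common operations on a (hypothetical) NetCDF file:

# print a summary of the file contents
cdo sinfo input.nc
# compute monthly means and write them to a new file
cdo monmean input.nc monthly_means.nc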

Installation hints

cdo is installed from source and linked with netcdf, eccodes and proj. The install script run_configure is found in the installation directory.

Chemistry

  • CP2K — A package for atomistic simulations of solid state, liquid, molecular, and biological systems offering a wide range of computational methods with the mixed Gaussian and plane waves approaches.

  • exciting — a full-potential all-electron code, employing linearized augmented planewaves (LAPW) plus local orbitals (lo) as basis set.

  • GPAW — a density functional theory Python code based on the projector-augmented wave method.

  • GROMACS — a versatile package to perform molecular dynamics for systems with hundreds to millions of particles.

  • NAMD — a parallel, object-oriented molecular dynamics code designed for high-performance simulations of large biomolecular systems using force fields.

  • Octopus — a software package for density-functional theory (DFT), and time-dependent density functional theory (TDDFT)

  • Quantum ESPRESSO — an integrated suite of codes for electronic structure calculations and materials modeling at the nanoscale, based on DFT, plane waves, and pseudopotentials.

  • RELION — REgularised LIkelihood OptimisatioN is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy(cryo-EM)

  • VASP — a first-principles code for electronic structure calculations and molecular dynamics simulations in materials science and engineering.

  • Wannier90 — A program that calculates maximally-localised Wannier functions.

  • ASE – The Atomic Simulation Environment – a set of tools and Python modules for setting up, manipulating, running, visualizing, and analyzing atomistic simulations.

  • LAMMPS – A parallel, classical potential molecular dynamics code for solid-state materials, soft matter and coarse-grained or mesoscopic systems.

  • NWChem – A general computational chemistry code with capabilities from classical molecular dynamics to highly correlated electronic structure methods, designed to run on massively parallel machines.

  • MOLPRO – A comprehensive system of ab initio programs for advanced molecular electronic structure calculations.

  • PLUMED – A tool for trajectory analysis and plugin for molecular dynamics codes for free energy calculations in molecular systems

  • CPMD – A plane wave / pseudopotential implementation of DFT, designed for massively parallel ab-initio molecular dynamics.

  • BandUP – Band unfolding for plane wave based electronic structure calculations.

Subsections of Chemistry

CP2K

Description

CP2K is a package for atomistic simulations of solid state, liquid, molecular, and biological systems offering a wide range of computational methods with the mixed Gaussian and plane waves approaches.

More information about CP2K and the documentation are found on https://www.cp2k.org/

Availability

CP2K is freely available for all users under the GNU General Public License (GPL).

Modules

CP2K is an MPI-parallel application. Use mpirun when launching CP2K.

| CP2K Version | Modulefile | Requirement | Support | CPU / GPU | Lise/Emmy |
| --- | --- | --- | --- | --- | --- |
| 2022.2 | cp2k/2022.2 | intel/2021.2 (Lise), intel/2022.2 (Emmy) | libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori and libbqb | ✅ / ❌ | ✅ / ✅ |
| 2023.1 | cp2k/2023.1 | intel/2021.2 (Lise), intel/2022.2 (Emmy) | Lise: libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori and libbqb. Emmy: libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl and sirius. | ✅ / ❌ | ✅ / ✅ |
| 2023.1 | cp2k/2023.1 | openmpi/gcc.11/4.1.4, cuda/11.8 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc | ❌ / ✅ | ✅ / ❌ |
| 2023.2 | cp2k/2023.2 | intel/2021.2, impi/2021.7.1 | libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori and libbqb | ✅ / ❌ | ✅ / ❌ |
| 2023.2 | cp2k/2023.2 | openmpi/gcc.11/4.1.4, cuda/11.8 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc | ❌ / ✅ | ✅ / ❌ |

Remark: cp2k needs special attention when running on GPUs.

  1. You need to check if, for your problem, a considerable acceleration is expected. E.g., for the following test cases, a performance degradation has been reported: https://www.cp2k.org/performance:piz-daint-h2o-64, https://www.cp2k.org/performance:piz-daint-h2o-64-ri-mp2, https://www.cp2k.org/performance:piz-daint-lih-hfx, https://www.cp2k.org/performance:piz-daint-fayalite-fist

  2. GPU pinning is required (see the example of a job script below). Don’t forget to make the script that takes care of the GPU pinning executable. In the example, this is achieved with: chmod +x gpu_bind.sh

Using cp2k as a library

Starting from version 2023.2, cp2k has been compiled with the option that allows it to be used as a library: libcp2k.a can be found inside $CP2K_LIB_DIR. The header libcp2k.h is located in $CP2K_HEADER_DIR, and the module files (.mod), possibly needed by Fortran users, are in $CP2K_MOD_DIR.

For more details, please refer to the documentation.
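
A rough sketch of compiling and linking a Fortran driver against libcp2k (my_driver.f90 is a placeholder; in practice, further libraries from the CP2K build, e.g. MPI, MKL, libint and libxc, have to be appended to the link line):

module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
# compile a driver that uses the libcp2k interface (module files are in $CP2K_MOD_DIR)
mpiifort -I$CP2K_MOD_DIR -c my_driver.f90
# link the static CP2K library; the remaining libraries of the CP2K build must follow
mpiifort -o my_driver my_driver.o $CP2K_LIB_DIR/libcp2k.a <further libraries>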

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
 
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
 
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
srun cp2k.psmp input > output
#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
 
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} 
 
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
 
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
mpirun cp2k.psmp input > output
#!/bin/bash
#SBATCH --partition=gpu-a100 
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --job-name=cp2k
 
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}   
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
module load gcc/11.3.0 openmpi/gcc.11/4.1.4 cuda/11.8 cp2k/2023.2
 
# gpu_bind.sh (see the following script) should be placed inside the same directory where cp2k will be executed
# Don't forget to make gpu_bind.sh executable by running: chmod +x gpu_bind.sh
mpirun --bind-to core --map-by numa:PE=${SLURM_CPUS_PER_TASK} ./gpu_bind.sh cp2k.psmp input > output
#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
 
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
 
module load intel/2022.2 impi/2021.6 cp2k/2023.1
srun cp2k.psmp input > output
#!/bin/bash
export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
$@

Depending on the problem size, it may happen that the code stops with a segmentation fault due to insufficient stack size or due to threads exceeding their stack space. To circumvent this, we recommend inserting in the jobscript:

export OMP_STACKSIZE=512M
ulimit -s unlimited

Exciting

Description

exciting is an ab initio code that implements density-functional theory (DFT), capable of reaching micro-Hartree precision. As its name suggests, exciting has a strong focus on excited-state properties. Among its features are:

  • G0W0 approximation;
  • Solution to the Bethe-Salpeter equation (BSE), to compute optical properties;
  • Time-dependent DFT (TDDFT) in both frequency and time domains;
  • Density-functional perturbation theory for lattice vibrations.

exciting is an open-source code, released under the GPL license.

More information is found on the official website: https://exciting-code.org/

Modules

exciting is currently available only on Lise. The standard species files deployed with exciting are located in $EXCITING_SPECIES. If you wish to use a different set, please refer to the manual.

The most recent compiled version is neon, and it has been built with the intel-oneapi compiler (v. 2021.2) and linked to Intel MKL (including FFTW). N.B.: exciting fluorine is also available.

The exciting module depends on impi/2021.7.1.

Example Jobscripts

#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=exciting
  
module load impi/2021.7.1
# Load exciting neon
# If you want to use fluorine, replace with exciting/009-fluorine
module load exciting/010-neon
  
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
   
# Adjust the maximum stack size of OpenMP threads
export OMP_STACKSIZE=512m
  
# Do not use the CPU binding provided by slurm
export SLURM_CPU_BIND=none
   
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
   
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
  
mpirun exciting

GPAW

Description

GPAW is a density functional theory Python code based on the projector-augmented wave method. Plane waves, real-space uniform grids, multi-grid methods and the finite-difference approximation, or atom-centered basis functions can be used for the description of wave functions. The code relies on the Atomic Simulation Environment (ASE).

GPAW documentation and other material can be found on the GPAW website.

The GPAW project is licensed under GNU GPLv3.

Prerequisites

GPAW needs Python3 and ASE for proper execution. At HLRN, corresponding environment modules (anaconda3 and ase, respectively) must be loaded first. For its MPI-parallel execution GPAW was linked against Intel-MPI 2019, so one of the impi/2019.* environment modules must also be loaded to provide the mpirun job starter.

Only members of the gpaw user group have access to GPAW installations provided by HLRN. To have their user ID included in this group, users can send a message to their consultant or to HLRN support.

Modules

The environment modules shown in the table below are available to include GPAW in the user’s shell environment. To see what is installed and what is the current default version of GPAW, an overview can be obtained by saying module avail gpaw.

| GPAW version | GPAW modulefile | GPAW requirements |
| --- | --- | --- |
| 20.1.0 | gpaw/20.1.0 (Lise only) | anaconda3/2019.10, ase/3.19.1, impi/2019.* |

When a gpaw module has been loaded successfully, the command gpaw info can be used to show supported features of this GPAW installation.

Job Script Examples

  1. For Intel Cascade Lake compute nodes – simple case of a GPAW job with 192 MPI tasks distributed over 2 nodes running 96 tasks each (Berlin only)
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 2
#SBATCH --tasks-per-node 96
 
module load anaconda3/2019.10
module load ase/3.19.1
module load impi/2019.9
module load gpaw/20.1.0
 
export SLURM_CPU_BIND=none
 
mpirun gpaw python myscript.py

GROMACS

Description

GROMACS is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers and fluid dynamics.

Read more on the GROMACS home page.

Strengths

  • GROMACS provides extremely high performance compared to all other programs.
  • GROMACS can make simultaneous use of both CPU and GPU available in a system. There are options to statically and dynamically balance the load between the different resources.
  • GROMACS is user-friendly, with topologies and parameter files written in clear text format.
  • Both run input files and trajectories are independent of hardware endian-ness, and can thus be read by any version of GROMACS.
  • GROMACS comes with a large selection of flexible tools for trajectory analysis.
  • GROMACS can be run in parallel, using the standard MPI communication protocol.
  • GROMACS contains several state-of-the-art algorithms.
  • GROMACS is Free Software, available under the GNU Lesser General Public License (LGPL).

Weaknesses

  • To reach very high simulation speed, GROMACS performs little additional analysis during the run.
  • Sometimes it is challenging to obtain non-standard information about the simulated system.
  • Different versions sometimes differ in default parameters/methods. Reproducing simulations of an older version with a newer version can be difficult.
  • Additional tools and utilities provided by GROMACS are sometimes not of the highest quality.

GPU support

GROMACS automatically uses any available GPUs. To achieve the best performance GROMACS uses both GPUs and CPUs in a reasonable balance.

QuickStart

Environment modules

The following versions have been installed:

Modules for running on CPUs

| Version | Installation Path | modulefile | compiler | comment |
| --- | --- | --- | --- | --- |
| 2018.4 | /sw/chem/gromacs/2018.4/skl/impi | gromacs/2018.4 | intelmpi | |
| 2018.4 | /sw/chem/gromacs/2018.4/skl/impi-plumed | gromacs/2018.4-plumed | intelmpi | with plumed |
| 2019.6 | /sw/chem/gromacs/2019.6/skl/impi | gromacs/2019.6 | intelmpi | |
| 2019.6 | /sw/chem/gromacs/2019.6/skl/impi-plumed | gromacs/2019.6-plumed | intelmpi | with plumed |
| 2021.2 | /sw/chem/gromacs/2021.2/skl/impi | gromacs/2021.2 | intelmpi | |
| 2021.2 | /sw/chem/gromacs/2021.2/skl/impi-plumed | gromacs/2021.2-plumed | intelmpi | with plumed |
| 2022.5 | /sw/chem/gromacs/2022.5/skl/impi | gromacs/2022.5 | intelmpi | |
| 2022.5 | /sw/chem/gromacs/2022.5/skl/impi-plumed | gromacs/2022.5-plumed | intelmpi | with plumed |

Modules for running on GPUs

| Version | Installation Path | modulefile | compiler | comment |
| --- | --- | --- | --- | --- |
| 2023.0 | /sw/chem/gromacs/2023.0/a100/tmpi_gcc | gromacs/2023.0_tmpi | | |

Release notes can be found here.

These modules can be loaded by using a module load command. Note that the Intel MPI module file should be loaded first:

module load impi/2019.5 gromacs/2019.6

This provides access to the binary gmx_mpi, which can be used to run simulations with sub-commands such as gmx_mpi mdrun.

In order to run simulations, an MPI runner should be used:

mpirun gmx_mpi mdrun MDRUNARGUMENTS

In order to load the GPU-enabled version (available only on the bgn nodes), the following module files should be loaded first:

module load gcc/11.3.0 intel/2023.0.0 cuda/11.8 gromacs/2023.0_tmpi

Submission script examples

Simple CPU job script

A simple case of a GROMACS job using a total of 640 CPU cores for 12 hours. The requested number of cores in this example does not include all available cores on the allocated nodes. The job will execute 92 ranks on 3 nodes plus 91 ranks on 4 nodes. You can use this example if you know the exact number of ranks you want to use.

#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -n 640
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load gromacs/2019.6
 
mpirun gmx_mpi mdrun MDRUNARGUMENTS

Whole node CPU job script

In case you want to use all cores on the allocated nodes, there are other batch system options to request the number of nodes and the number of tasks. The example below will result in running 672 ranks.

#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 7
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load gromacs/2019.6
 
mpirun gmx_mpi mdrun MDRUNARGUMENTS

GPU job script

The following script uses four thread-MPI ranks, one of which is dedicated to the long-range PME calculation. With the -gputasks 0001 keyword, the first 3 ranks offload their short-range non-bonded calculations to the GPU with ID 0, and the 4th (PME) rank offloads its calculations to the GPU with ID 1.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=gpu-a100
#SBATCH --ntasks=72
 
export SLURM_CPU_BIND=none
 
module load gcc/11.3.0 intel/2023.0.0 cuda/11.8
module load gromacs/2023.0_tmpi
 
export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true
 
OMP_NUM_THREADS=9
 
gmx mdrun -ntomp 9 -ntmpi 4 -nb gpu -pme gpu -npme 1 -gputasks 0001 OTHER MDRUNARGUMENTS

Whole node GPU job script

To set up a whole-node GPU job, use the -gputasks keyword.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=gpu-a100
#SBATCH --ntasks=72
 
export SLURM_CPU_BIND=none
 
module load gcc/11.3.0 intel/2023.0.0 cuda/11.8
module load gromacs/2023.0_tmpi
 
export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true
 
OMP_NUM_THREADS=9
 
gmx mdrun -ntomp 9 -ntmpi 16 -gputasks 0000111122223333 MDRUNARGUMENTS

Note: The settings of the thread-MPI ranks and OpenMP threads are chosen to achieve optimal performance. The number of ranks should be a multiple of the number of sockets, and the number of cores per node should be a multiple of the number of threads per rank.

Gromacs-Plumed

PLUMED is an open-source, community-developed library that provides a wide range of different methods, such as enhanced-sampling algorithms, free-energy methods and tools to analyze the vast amounts of data produced by molecular dynamics (MD) simulations. PLUMED works together with some of the most popular MD engines.

The gromacs/20XX.X-plumed modules are versions that have been patched with PLUMED’s modifications; these versions are able to run metadynamics simulations.

Analyzing results

GROMACS Tools

GROMACS contains many tools for analysing your results. They can read trajectories (XTC, TNG or TRR format) as well as a coordinate file (GRO, PDB, TPR) and write plots in the XVG format. A list of commands with a short description, organised by topic, can be found at the official website.

VMD

VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting. It is free of charge and includes source code.

Python

The Python packages MDAnalysis and MDTraj can read and write trajectory and coordinate files of GROMACS, and both provide a variety of useful analysis functions. Both packages integrate well with Python’s data-science packages like NumPy, SciPy and Pandas, and with plotting libraries such as Matplotlib.

Usage tips

System preparation

Your tpr file (portable binary run input file) contains your initial structure, molecular topology and all of the simulation parameters. Tpr files are portable and can be copied from one computer to another, but you should always use the same version of mdrun and grompp. Mdrun is able to use tpr files that have been created with an older version of grompp, but this can cause unexpected results in your simulation.

Running simulations

Simulations often take longer than the maximum walltime. Running mdrun with the -maxh option tells the program the requested walltime, and GROMACS will finish the simulation when reaching 99% of this walltime. At that point, mdrun creates a new checkpoint file and properly closes all output files. Using this method, the simulation can easily be restarted from the checkpoint file.

mpirun gmx_mpi mdrun MDRUNARGUMENTS -maxh 24

Restarting simulations

In order to restart a simulation from a checkpoint file, you can use the same mdrun command as in the original simulation and add -cpi filename.cpt, where filename is the name of your most recent checkpoint file.

mpirun gmx_mpi mdrun MDRUNARGUMENTS -cpi filename.cpt

More detailed information can be found here.

Performance

GROMACS prints information about statistics and performance at the end of the md.log file, which usually also contains helpful tips to further improve the performance. The performance of the simulation is usually given in ns/day (the number of nanoseconds of MD trajectory simulated within a day).

More information about the performance of simulations and how to improve it can be found here.

Special Performance Instructions for Emmy at GWDG

Turbo-boost has been mostly disabled on Emmy at GWDG (partitions medium40, large40, standard96, large96, and huge96) in order to save energy. However, this has a particularly strong performance impact on GROMACS in the range of 20-40%. Therefore, we recommend that GROMACS jobs be submitted requesting turbo-boost to be enabled with the --constraint=turbo_on option given to srun or sbatch.
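
For example, the option can be added directly to the header of the job script:

#SBATCH --constraint=turbo_on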

References

  1. GROMACS User-Guide
  2. PLUMED Home

NAMD

Description

NAMD is a parallel, object-oriented molecular dynamics code designed for high-performance simulations of large biomolecular systems using force fields. The code was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign.

NAMD current documentation and other material can be found on the NAMD website.

Prerequisites

NAMD is distributed free of charge for non-commercial purposes only. Users need to agree to the NAMD license. This includes proper citation of the code in publications.

Only members of the namd user group have access to NAMD executables provided by HLRN. To have their user ID included in this group, users can send a message to their consultant or to NHR support.

Modules

The environment modules shown in the table below are available to include NAMD executables in the directory search path. To see what is installed and what is the current default version of NAMD at HLRN, a corresponding overview can be obtained by saying module avail namd.

NAMD is a parallel application. It is recommended to use mpirun as the job starter for NAMD at HLRN. An MPI module providing the mpirun command needs to be loaded ahead of the NAMD module.

| NAMD version | NAMD modulefile | NAMD requirements |
| --- | --- | --- |
| 2.13 | namd/2.13 | impi/* (any version) |

File I/O Considerations

During run time, only a few files are involved in NAMD’s I/O activities. As long as standard MD runs are carried out, this is unlikely to impose stress on the Lustre file system ($WORK), provided that one condition is met: file metadata operations (file stat, create, open, close, rename) should not occur at too short time intervals. First and foremost, this applies to the management of NAMD restart files. Instead of having a new set of restart files created several times per second, the NAMD input parameter restartfreq should be chosen such that they are written only every 5 minutes or at even longer intervals. For NAMD replica-exchange runs the situation can be more severe. Here we have already observed jobs where heavy metadata I/O on the individual “colvars.state” files located in every replica’s subdirectory overloaded our Lustre metadata servers, resulting in a severe slowdown of the entire Lustre file system. Users are advised to set the corresponding NAMD input parameters such that each replica performs metadata I/O on these files at intervals not shorter than really needed or, where affordable, such that these files are written only at the end of the run.

Job Script Examples

  1. For Intel Skylake compute nodes (Göttingen only) – simple case of a NAMD job using a total of 200 CPU cores distributed over 5 nodes running 40 tasks each
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p medium40
#SBATCH -N 5
#SBATCH --tasks-per-node 40
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load namd/2.13
 
mpirun namd2 inputfile > outputfile
  2. For Intel Cascade Lake compute nodes – simple case of a NAMD job using a total of 960 CPU cores distributed over 10 nodes running 96 tasks each
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 10
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load namd/2.13
 
mpirun namd2 inputfile > outputfile
  3. A set of input files for a small and short replica-exchange simulation is included with the NAMD installation. A description can be found in the NAMD User’s Guide. The following job script executes this replica-exchange simulation on 2 nodes using 8 replicas (24 tasks per replica)
#!/bin/bash
#SBATCH -t 0:20:00
#SBATCH -p standard96
#SBATCH -N 2
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load namd/2.13
 
cp -r /sw/chem/namd/2.13/skl/lib/replica .
cd replica/example/
mkdir output
(cd output; mkdir 0 1 2 3 4 5 6 7)
 
mpirun namd2 +replicas 8 job0.conf +stdout output/%d/job0.%d.log

Octopus

Description

Octopus is an ab initio program that describes electrons quantum-mechanically within density-functional theory (DFT) and in its time-dependent form (TDDFT) for problems with a time evolution. Nuclei are described classically as point particles. Electron-nucleus interaction is described within the pseudopotential approximation.

Octopus is free software, released under the GPL license.

More information about the program and its usage can be found on https://www.octopus-code.org/

Modules

Octopus is currently available only on Lise. The standard pseudopotentials deployed with Octopus are located in $OCTOPUS_ROOT/share/octopus/pseudopotentials/PSF/. If you wish to use a different set, please refer to the manual.

The most recent compiled version is 12.1, and it has been built with the intel-oneapi compiler (v. 2021.2) and linked to Intel MKL (including FFTW).

The octopus module depends on intel/2021.2 and impi/2021.7.1.

Example Jobscripts

Assuming that your input file inp is located in the directory where you are submitting the jobscript, and that the output is written to out, an example jobscript is given below.

#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=octopus
 
module load intel/2021.2 impi/2021.7.1 octopus/12.1
 
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
  
# Adjust the maximum stack size of OpenMP threads
export OMP_STACKSIZE=512m
 
# Do not use the CPU binding provided by slurm
export SLURM_CPU_BIND=none
  
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
  
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
 
mpirun octopus

Please check carefully for your use case the best parallelization strategy in terms of, e.g., the number of MPI processes and OpenMP threads. Note that the variables ParStates, ParDomains and ParKPoints defined in the input file also impact the parallelization performance.

Quantum ESPRESSO

Description

Quantum ESPRESSO (QE) is an integrated suite of codes for electronic structure calculations and materials modeling at the nanoscale, based on DFT, plane waves, and pseudopotentials. QE is an open initiative, in collaboration with many groups world-wide, coordinated by the Quantum ESPRESSO Foundation.

Documentation and other material can be found on the QE website.

Prerequisites

QE is free software, released under the GNU General Public License (v2). Scientific work done using the QE code should contain citations of corresponding QE references.

Only members of the q_espr user group have access to QE executables provided by HLRN. To have their user ID included in this group, users can send a message to their consultant or to NHR support.

Modules

The environment modules shown in the table below are available to include executables of the QE distribution in the directory search path. To see what is installed and what is the current default version of QE at HLRN, a corresponding overview can be obtained by saying module avail qe.

QE is a hybrid MPI/OpenMP parallel application. It is recommended to use mpirun as the job starter for QE at HLRN. An MPI module providing the mpirun command needs to be loaded ahead of the QE module.

| QE version | QE modulefile | QE requirements |
| --- | --- | --- |
| 6.4.1 | qe/6.4.1 | impi/* (any version) |

Job Script Examples

  1. For Intel Cascade Lake compute nodes – plain MPI case (no OpenMP threading) of a QE job using a total of 1152 CPU cores distributed over 12 nodes, 96 tasks each. Here 3 pools (nk=3) are created for k-point parallelization (384 tasks per k-point), 3D-FFT is performed using 8 task groups (48 tasks each, nt=8).
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 12
#SBATCH --tasks-per-node 96
  
module load impi/2018.5
module load qe/6.4.1
 
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=1
 
mpirun pw.x -nk 3 -nt 8 -i inputfile > outputfile

RELION

REgularised LIkelihood OptimisatioN is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).

Description

Read more on RELION home page.

For manual and user-guide visit the RELION tutorial page.

Modules

| Version | Installation Path | modulefile | compiler | comment |
| --- | --- | --- | --- | --- |
| 3.0 | /sw/chem/relion/3.0/skl/gcc/openmpi | relion/3.0-gcc | gcc.8.2-openmpi.3.1.2 | GUI enabled |
| 3.0-beta | /sw/chem/relion/3.0beta/skl | relion/3.0-beta | intel-impi-2018 | |

To use RELION at HLRN:

#Load the modulefile
module load relion/3.0-gcc
#Launching GUI
relion_maingui

VASP

Description

The Vienna Ab initio Simulation Package (VASP) is a first-principles code for electronic structure calculations and molecular dynamics simulations in materials science and engineering. It is based on plane wave basis sets combined with the projector-augmented wave method or pseudopotentials. VASP is maintained by the Computational Materials Physics Group at the University of Vienna.

More information is available on the VASP website and from the VASP wiki.

Usage Conditions

Access to VASP executables is restricted to users satisfying the following criteria. The user must

  • be member of a research group owning a VASP license,
  • be registered in Vienna as a VASP user of this research group,
  • employ VASP only for work on projects of this research group.

Only members of the groups vasp5_2 or vasp6 have access to VASP executables. To have their user ID included in these groups, users can ask their consultant or submit a support request. It is recommended that users make sure that they already got registered in Vienna beforehand as this will be verified. Users whose research group did not upgrade its VASP license to version 6.x cannot become member of the vasp6 group.

Modules

VASP is an MPI-parallel application. We recommend to use mpirun as the job starter for VASP. The environment module providing the mpirun command associated with a particular VASP installation needs to be loaded ahead of the environment module for VASP.

| VASP Version | User Group | VASP Modulefile | MPI Requirement | CPU / GPU | Lise / Emmy |
| --- | --- | --- | --- | --- | --- |
| 5.4.4 with patch 16052018 | vasp5_2 | vasp/5.4.4.p1 | impi/2019.5 | ✅ / ❌ | ✅ / ✅ |
| 6.4.1 | vasp6 | vasp/6.4.1 | impi/2021.7.1 | ✅ / ❌ | ✅ / ❌ |
| 6.4.1 | vasp6 | vasp/6.4.1 | nvhpc-hpcx/23.1 | ❌ / ✅ | ✅ / ❌ |
| 6.4.2 | vasp6 | vasp/6.4.2 | impi/2021.7.1 | ✅ / ❌ | ✅ / ❌ |

N.B.: VASP version 6.x has been compiled with support for OpenMP, HDF5, and Wannier90. The CPU versions additionally support Libxc, and version 6.4.2 also includes the DFTD4 van der Waals functional.

Executables

Our installations of VASP comprise the regular executables (vasp_std, vasp_gam, vasp_ncl) and, optionally, community driven modifications to VASP as shown in the table below. They are available in the directory added to the PATH environment variable by one of the vasp environment modules.

| Executable | Description |
| --- | --- |
| vasp_std | multiple k-points (formerly vasp_cd) |
| vasp_gam | Gamma-point only (formerly vasp_gamma_cd) |
| vasp_ncl | non-collinear calculations, spin-orbit coupling (formerly vasp) |
| vaspsol_[std\|gam] | as vasp_std / vasp_gam, with the VASPsol modification |
| vasptst_[std\|gam] | as vasp_std / vasp_gam, with the VTST modification |
| vasptstsol_[std\|gam] | as vasp_std / vasp_gam, with the VTST and VASPsol modifications |

N.B.: The VTST script collection is not available from the vasp environment modules. Instead, it is provided by the vtstscripts environment module(s).

Example Jobscripts

#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes 2
#SBATCH --tasks-per-node 40
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load vasp/5.4.4.p1
 
mpirun vasp_std
#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes 2
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load vasp/5.4.4.p1
 
mpirun vasp_std

The following job script exemplifies how to run vasp 6.4.1 making use of OpenMP threads. Here, we have 2 OpenMP threads and 48 MPI tasks per node (the product of these 2 numbers should ideally be equal to the number of CPU cores per node).

In many cases, running VASP with parallelization over MPI alone already yields good performance. However, certain application cases can benefit from hybrid parallelization over MPI and OpenMP. A detailed discussion is found here. If you opt for hybrid parallelization, please pay attention to process pinning, as shown in the example below.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=48
#SBATCH --cpus-per-task=2
#SBATCH --partition=standard96
 
export SLURM_CPU_BIND=none
 
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
 
# Adjust the maximum stack size of OpenMP threads
export OMP_STACKSIZE=512m
 
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
 
module load impi/2021.7.1
module load vasp/6.4.1  
 
mpirun vasp_std

In the following example, we show a job script that will run on the Nvidia A100 GPU nodes (Berlin). Per default, VASP will use one GPU per MPI task. If you plan to use 4 GPUs per node, you need to set 4 MPI tasks per node. Then, set the number of OpenMP threads to 18 to speed up your calculation. This, however, also requires proper process pinning.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --partition=gpu-a100
 
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
 
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
# Avoid hcoll as MPI collective algorithm
export OMPI_MCA_coll="^hcoll"
 
# You may need to adjust this limit, depending on the case
export OMP_STACKSIZE=512m
 
module load nvhpc-hpcx/23.1
module load vasp/6.4.1 
 
# Carefully adjust ppr:2, if you don't use 4 MPI processes per node
mpirun --bind-to core --map-by ppr:2:socket:PE=${SLURM_CPUS_PER_TASK} vasp_std

Wannier90

Description

Version 2.1.0 of Wannier90 is available on Lise and Emmy. For documentation, please visit: https://www.wannier.org/

Prerequisites

Intel MPI: 2019 or newer.

Modules

The module wannier90/2.1.0 makes the following executables available for use: wannier90.x and postw90.x. Also, the library libwannier.a (inside $WANNIER90_ROOT) is available and can be linked into a code of your choice.

As stated in the documentation, wannier90.x calculates the maximally-localised Wannier functions and is a serial executable. postw90.x can take the Wannier functions computed by wannier90.x and calculate several properties. postw90.x can be executed in parallel through MPI.
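
A minimal usage sketch (mysystem is a placeholder for your seed name, i.e. the input file mysystem.win; impi/2019.5 stands for any Intel MPI module fulfilling the prerequisite above):

module load impi/2019.5 wannier90/2.1.0
# serial computation of the maximally-localised Wannier functions
wannier90.x mysystem
# parallel post-processing of the Wannier functions
mpirun postw90.x mysystem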

Data Manipulation

  • AEC library — Adaptive Entropy Coding library
  • CDO — The Climate Data Operators
  • ECCODES — ECMWF application programming interface
  • HDF5 Libraries and Binaries — HDF5 - hierarchical data format
  • libtiff — A software package containing a library for reading and writing the Tag Image File Format (TIFF), and a small collection of tools for simple manipulations of TIFF images
  • NCO — The NetCdf Operators
  • netCDF — Network Common Data Form
  • Octave — GNU Octave, a high-level language for numerical computations
  • pigz — A parallel implementation of gzip for modern multi-processor, multi-core machines
  • PROJ — Cartographic Projections Library
  • R — R - statistical computing and graphics
  • Szip — Szip, fast and lossless compression of scientific data
  • UDUNITS2 — Unidata UDUNITS2 Package, Conversion and manipulation of units
  • Boost – Boost C++ libraries
  • CGAL – The Computational Geometry Algorithms Library

Subsections of Data Manipulation

AEC library

Adaptive Entropy Coding library

Description

Libaec provides fast lossless compression of 1 up to 32 bit wide signed or unsigned integers (samples). The library achieves best results for low entropy data as often encountered in space imaging instrument data or numerical model output from weather or climate simulations. The library replaces the Szip library, whose usage was limited by copyright. Read more.

The library is used by the HDF5 library and the ecCodes tools.

Modules

Loading the module adds the path to the aec binary to PATH, which can be used for data compression of single files. See aec --help for details.

LD_RUN_PATH, LIBRARY_PATH and similar shell variables are defined to support linking the aec library. See the available versions with module avail aec.
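
A small linking sketch (mytool.c is a placeholder; it assumes the module also provides the include path for the libaec headers):

module load aec
# link libaec and embed the library path in the binary (rpath)
gcc -o mytool mytool.c -laec -Wl,-rpath=$LD_RUN_PATH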

Installation

After unpacking, autotools have to be enabled. The aec library is built for the intel and gnu compilers.

Subsections of AEC library

Install AEC GNU

module load gcc/9.3.0

export COMPILER=gcc.9.3.0
export CC=gcc
export CXX=g++
export FC=gfortran

#export SYS=OS_15.3

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/aec/1.0.6/skl/$COMPILER
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export CFLAGS="  -O3 -fPIC"
export CXXFLAGS="-O3 -fPIC"
export FCFLAGS=" -O3 -fPIC"
export LDFLAGS="-O3 -fPIC"
../libaec-v1.0.6/configure --prefix=$PREFIX --libdir=$PREFIX/lib64

echo "Press ENTER to run make";read aaa
make -j2
echo "Press ENTER to run make check";read aaa
make check 2>&1 | tee check.out
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Install AEC Intel

module load intel/2022.2

export COMPILER=intel.22
export CC=icc
export CXX=icpc
export F77=ifort
export FC=ifort
export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -xskylake-avx512 -mtune=skylake-avx512 -mcmodel=medium -fno-alias -align -fp-model precise -shared-intel"

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/aec/1.0.6/skl/$COMPILER
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export LD_RUN_PATH=$LIBRARY_PATH
export CFLAGS="  $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export LDFLAGS="-O3 -fPIC"
../libaec-v1.0.6/configure --prefix=$PREFIX --libdir=$PREFIX/lib64

echo "Press ENTER to run make";read aaa
make -j2
echo "Press ENTER to run make check";read aaa
make check 2>&1 | tee check.out
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

CDO

The Climate Data Operators

General Information

Vendor: MPI Hamburg
Installation Path: /sw/dataformats/cdo/< version >

| Version | build date | compiler | remark |
| --- | --- | --- | --- |
| 1.9.6 | 04/2019 | intel-19 | |
| 1.9.8 | 08/2020 | intel-18 | |
| 2.0.5 | 04/2022 | gcc-8 | experimental |
| 2.2.1 | 07/2023 | gcc-9 | aec support, cdi |

Please find the manual in the installation directory and visit the vendor’s website for more information.

Usage of the CDO Tools at HLRN

Manuals

Please find the manual in the installation directory and visit the vendor’s website for more information.

Module-files and environment variables

To activate the package, issue

module load cdo

from the command line or put this line into your login shell.

To see the versions of the linked libraries issue

cdo --version

See the supported operations with

cdo --help
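
A typical small example (file names are placeholders; sinfov and monmean are two of the many available operators):

# show a summary of the variables contained in a file
cdo sinfov input.nc
# compute monthly means from a NetCDF time series
cdo monmean input.nc monthly_means.nc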

Installation hints

cdo is installed from source and linked with netcdf, eccodes and proj. The install script run_configure is found in the installation directory.

ECCODES

ECMWF application programming interface

Description

ecCodes is a package developed by ECMWF which provides an application programming interface and a set of tools for decoding and encoding messages in the following formats:

  • WMO FM-92 GRIB edition 1 and edition 2
  • WMO FM-94 BUFR edition 3 and edition 4
  • WMO GTS abbreviated header (only decoding).

A useful set of command line tools provide quick access to the messages. C, Fortran 90 and Python interfaces provide access to the main ecCodes functionality.

ecCodes is an evolution of GRIB-API. It is designed to provide the user with a simple set of functions to access data from several formats with a key/value approach.

read more

Versions

| Version | Build Date | Installation Path | modulefile | compiler | libraries |
| --- | --- | --- | --- | --- | --- |
| 2.12.0 | 12. apr. 2019 | /sw/dataformats/eccodes | yes | intel.18, intel.19 | hdf5 1.10.5, netcdf 4.6.3 |
| 2.31.0 | 21. jul. 2023 | /sw/dataformats/eccodes | yes | intel.22, gcc.9 | hdf5 1.12.2, netcdf 4.9.1 |

Documentation

  • Detailed description is found at the ECCODES home page,
  • use the option --help to see an overview of relevant options

Usage at HLRN

Modulefiles and environmental variables

  • load a module to activate the path to the binaries and set some other environment variables. Use module show for more details.
  • link the eccodes-library for processing GRIB and BUFR formatted data

Program start

The following binaries are provided:

bufr_compare bufr_filter codes_bufr_filter grib2ppm grib_filter grib_ls gts_copy metar_compare metar_ls bufr_compare_dir bufr_get codes_count grib_compare grib_get grib_merge gts_dump metar_copy tigge_accumulations bufr_copy bufr_index_build codes_info grib_copy grib_get_data grib_set gts_filter metar_dump tigge_check bufr_count bufr_ls codes_parser grib_count grib_histogram grib_to_netcdf gts_get metar_filter tigge_name bufr_dump bufr_set codes_split_file grib_dump grib_index_build gts_compare gts_ls metar_get tigge_split

HLRN specific installation

Environment variables set by the module:

  • PATH: path to the binaries
  • PKG_CONFIG_PATH: for usage with pkg-config
  • LD_RUN_PATH: for setting rpath
  • ECCODES_DIR, ECCODES_VERSION, ECCODES_INCLUDE, ECCODES_LIB, ECCODES_INCLUDE_DIR, ECCODES_LIB_DIR: for linking eccodes into other software. Note: these variables are recommended by ECMWF, but not standard, so do not expect them elsewhere.

run_cmake script for installation with cmake
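
A small sketch for linking user code against ecCodes (mygrib.c is a placeholder; it assumes that pkg-config finds the eccodes package via the PKG_CONFIG_PATH set by the module):

module load eccodes
# compile and link a small GRIB-decoding program via pkg-config
gcc -o mygrib mygrib.c $(pkg-config --cflags --libs eccodes)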

Subsections of ECCODES

Subsections of Installing ECCODES

Install ecCodes with intel compilers

module load intel/2022.2
export CC=icc
export FC=ifort

module load cmake/3.26.4
module load netcdf/intel/4.9.1
 
mkdir build ; cd build
echo `pwd`
parentdir="$(dirname "$(pwd)")"
export PREFIX=$parentdir
echo "builing for "$PREFIX
echo "Press ENTER to run cmake";read aaa

export HDF5=/sw/dataformats/hdf5/intel.22/1.12.2/skl
#export HDF5_LIBRARIES=$HDF5/lib 
export HDF5_LIBRARIES="$HDF5/lib/libhdf5.a $HDF5/lib/libhdf5_hl.a"
export HDF5_INCLUDE_DIRS=$HDF5/include
export PATH=$HDF5:$PATH

export NETCDF=`nc-config --prefix`

FFLAGS="-O3 -fPIC -xCORE-AVX512 -qopt-zmm-usage=high"
CFLAGS="-O3 -fPIC -xCORE-AVX512 -qopt-zmm-usage=high" 

export CC=icc
export FC=ifort
export CXX=icpc

cmake \
  -DCMAKE_C_COMPILER="$CC" -DCMAKE_Fortran_COMPILER="$FC" \
  -DCMAKE_C_FLAGS="$CFLAGS" -DCMAKE_Fortran_FLAGS="$FFLAGS" \
  -DCMAKE_CXX_COMPILER="$CXX" -DCMAKE_CXX_FLAGS="$CFLAGS" \
  -DBUILD_SHARED_LIBS=BOTH \
  -DENABLE_MEMFS=ON \
  -DENABLE_PNG=ON \
  -DENABLE_JPG=ON \
  -DENABLE_AEC=ON -DAEC_DIR=/sw/dataformats/aec/1.0.6/skl/intel.19 \
  -DENABLE_FORTRAN=ON \
  -DENABLE_NETCDF=ON \
  -DENABLE_ECCODES_OMP_THREADS=ON \
  -DNETCDF_PATH=$NETCDF \
  -DENABLE_INSTALL_ECCODES_DEFINITIONS=ON \
  -DENABLE_INSTALL_ECCODES_SAMPLES=ON \
  -DCMAKE_INSTALL_PREFIX="$PREFIX" ../src

echo "Press ENTER to run make";read aaa
make -j8
echo "Press ENTER to run ctest";read aaa
ctest
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Install ecCodes with gcc

module load gcc/9.3.0

module load cmake/3.26.4
module load netcdf/gcc.9/4.9.2
 
export COMPILER=gcc.9.3.0

rm -r build; mkdir build ; cd build
echo `pwd`
parentdir="$(dirname "$(pwd)")"
export PREFIX=$parentdir
echo "builing for "$PREFIX
echo "Press ENTER to run cmake";read aaa

export HDF5=/sw/dataformats/hdf5/$COMPILER/1.12.2/skl
#export HDF5_LIBRARIES=$HDF5/lib 
export HDF5_LIBRARIES="$HDF5/lib/libhdf5.a $HDF5/lib/libhdf5_hl.a"
export HDF5_INCLUDE_DIRS=$HDF5/include
export PATH=$HDF5:$PATH

export NETCDF=`nc-config --prefix`

FFLAGS="-O3 -fPIC -march=skylake-avx512 -Wl,-rpath=$LD_RUN_PATH"
CFLAGS="-O3 -fPIC -march=skylake-avx512 -Wl,-rpath=$LD_RUN_PATH" 

export CC=gcc
export CXX=g++
export FC=gfortran

cmake \
  -DCMAKE_C_COMPILER="$CC" -DCMAKE_Fortran_COMPILER="$FC" \
  -DCMAKE_C_FLAGS="$CFLAGS" -DCMAKE_Fortran_FLAGS="$FFLAGS" \
  -DCMAKE_CXX_COMPILER="$CXX" -DCMAKE_CXX_FLAGS="$CFLAGS" \
  -DBUILD_SHARED_LIBS=BOTH \
  -DENABLE_MEMFS=ON \
  -DENABLE_PNG=ON \
  -DENABLE_JPG=ON \
  -DENABLE_AEC=ON -DAEC_DIR=/sw/dataformats/aec/1.0.6/skl/intel.19 \
  -DENABLE_FORTRAN=ON \
  -DENABLE_NETCDF=ON \
  -DENABLE_ECCODES_OMP_THREADS=ON \
  -DNETCDF_PATH=$NETCDF \
  -DENABLE_INSTALL_ECCODES_DEFINITIONS=ON \
  -DENABLE_INSTALL_ECCODES_SAMPLES=ON \
  -DECCODES_INSTALL_EXTRA_TOOLS=ON \
  -DENABLE_ECCODES_OMP_THREADS=ON \
  -DCMAKE_INSTALL_PREFIX="$PREFIX" ../src

echo "Press ENTER to run make";read aaa
make -j8
echo "Press ENTER to run ctest";read aaa
ctest
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

HDF5 Libraries and Binaries

HDF5 - hierarchical data format

Documentation

HDF5 is a data model, library, and file format for storing and managing data. It can represent very complex data objects and a wide variety of metadata. For a documentation visit the HDF group support portal

Installed versions

We cannot support all combinations of hdf5, compiler and mpi. If none of the installed version works, please contact support for installation of missing versions.

Serial HDF-5

| Version | Compiler | Module | API |
| --- | --- | --- | --- |
| 1.10.5 | intel/18.0.6 | hdf5/intel/1.10.5 | v110 |
| 1.10.5 | gcc/7.5.0 | hdf5/gcc.7/1.10.5 | v110 |
| 1.10.5 | gcc/8.3.0 | hdf5/gcc.8/1.10.5 | v110 |
| 1.10.6 | gcc/8.3.0 | hdf5/gcc.8/1.10.6 | v110 |
| 1.10.7 | gcc/9.3.0 | hdf5/gcc.9/1.10.7 | v110 |
| 1.12.1 | intel/19.0.5 | hdf5/intel/1.12.1 | v112 |
| 1.12.2 | gcc/8.3.0 | hdf5/gcc.8/1.12.2 | v112 |
| 1.12.2 | gcc/9.3.0 | hdf5/gcc.9/1.12.2 | v112 |
| 1.12.2 | intel/2022.2 | hdf5/intel/1.12.2 | v112 |

Parallel HDF-5

| Version | Compiler, MPI | Module | API |
| --- | --- | --- | --- |
| 1.10.5 | intel/18.0.6, impi/2018.5 | hdf5-parallel/impi/intel/1.10.5 | v110 |
| 1.12.0 | intel/19.1.2, impi/2019.8 | hdf5-parallel/impi/intel/1.12.0 | v112 |
| 1.10.6 | intel/18.0.6, openmpi/intel/3.1.6 | hdf5-parallel/ompi/intel/1.10.6 | v110 |
| 1.10.5 | gcc/8.3.0, openmpi/gcc.9/3.1.5 | hdf5-parallel/ompi/gcc.8/1.10.5 | v110 |
| 1.10.6 | gcc/9.2.0, openmpi/gcc.9/3.1.5 | hdf5-parallel/ompi/gcc.9/1.10.6 | v110 |
| 1.12.1 | gcc/8.3.0, openmpi/gcc.9/3.1.5 | hdf5-parallel/ompi/gcc.8/1.12.0 | v112 |

The libraries are threadsafe and can be used in omp-parallel codes. To see configure-details on the installed libraries, load a hdf5-module and issue cat $HDF5DIR/libhdf5.settings!

Modules and Prerequisites

  • module avail hdf5 shows all available hdf5 versions
  • module show <modulename> shows environment variables set when loading the module
  • module help <modulename> shows some details on compilers and MPI

Loading a module adds the path to some hdf5 binaries to $PATH. For convenience, some other environment variables are extended or defined, which satisfy the needs of several software packages for linking the hdf5 libraries:

PATH, LD_RUN_PATH

C_INCLUDE_PATH, CPLUS_INCLUDE_PATH,

HDF5DIR, HDF5INCLUDE, HDF5LIB

HDF5_HOME, HDF5_ROOT

The module files do not require any prerequisites. All dynamically linked libraries are found, since RPATH is defined in libraries and binaries. For an example, consider

  • readelf -a $HDF5DIR/libhdf5_hl.so | grep RPATH

Linking the hdf5-libraries in user programs

Loading an hdf5 module does not link the libraries automatically when compiling other software. hdf5 does not provide a standard way to detect the path to the libraries, such as pkg-config, so you have to configure Makefiles or install scripts by hand. The environment variables exported by the modules allow a short and flexible notation.

hdf5 modules do not have any effect during the runtime of your binaries. If the hdf5 libraries are linked dynamically, it is recommended to set an RPATH in the binary; otherwise the hdf5 libraries (and others) will not be found at runtime. This can be done with the compiler option -Wl,-rpath=$LD_RUN_PATH, which passes the rpath to the linker.
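
A small sketch of such a manual link line (myprog.c is a placeholder; it assumes HDF5INCLUDE and HDF5LIB point to the include and library directories, as set by the module):

module load hdf5/gcc.9/1.12.2
# compile and link against the serial HDF5 library, embedding the rpath
gcc -o myprog myprog.c -I$HDF5INCLUDE -L$HDF5LIB -lhdf5 -lhdf5_hl -Wl,-rpath=$LD_RUN_PATH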

hdf5 compiler wrapper

The compiler wrappers h5cc and h5fc (parallel: h5pcc and h5pfc) can be used to build programs that link the hdf5 libraries. These wrappers set the rpath for all libraries that hdf5 links dynamically.
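
For example (the source file names are placeholders):

# serial C program
h5cc -o h5demo h5demo.c
# parallel Fortran program (requires a parallel hdf5 module and its MPI module)
h5pfc -o h5pdemo h5pdemo.f90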

Installing the hdf5-libraries at HLRN

Read more

Subsections of HDF5 Libraries and Binaries

HDF5 Installation

This page describes how it was done, not how it should be done.

Remark

HDF5 is stored in the sw area, which is based on GPFS, but it usually works on files in the Lustre-based scratch directories. Both file system types may have different models for treating file locking, which is essential for parallel usage of HDF5. Hence, it is strongly recommended to build hdf5 on the file system where it will be used. All versions of HDF5 at HLRN are built on a Lustre file system but are stored in the sw area.

In the HDF5 forum, the reasons for failures of the parallel tests are discussed. For Open MPI it is claimed that the MPI standard is not fully supported. It seems that the errors in the parallel tests are gone with Open MPI 5; however, only a prerelease of that Open MPI version is available.

Prerequisites

  • the szip library. In recent versions, the szip library is replaced by the aec library, which emulates szip.
  • for parallel builds intel-MPI or openMPI.
  • all other ingredients (zlib) are part of the system.

Install libaec

  • enable autotools for configure in order to avoid cmake ( read more )
  • install libaec to the same path where HDF5 will reside (script). The preinstalled libaec can also be used.

configure flags for version 1.10.

  • --with-pic
  • --enable-production
  • --enable-unsupported --enable-threadsafe
  • --enable-fortran --enable-fortran2003 --enable-cxx

configure flags for version 1.12.

  • --with-pic
  • --enable-production --enable-optimization=high
  • --enable-direct-vfd --enable-preadwrite
  • --enable-unsupported --enable-threadsafe
  • --enable-fortran --enable-fortran2003 --enable-cxx
  • --enable-file-locking --enable-recursive-rw-locks

Script for installing parallel HDF-5 1.12.2 with Intel MPI (impi)
Script for installing parallel HDF-5 1.12.2 with Open MPI (ompi)

Subsections of HDF5 Installation

Enable autotools

After unpacking the archive file, go to the new directory and issue the commands:

libtoolize --force
aclocal
autoheader
automake --force-missing --add-missing
autoconf

read more

HDF5 1.12.2 configuration

Features:

                     Parallel HDF5: yes
  Parallel Filtered Dataset Writes: yes
                Large Parallel I/O: yes
                High-level library: yes
Dimension scales w/ new references: no
                  Build HDF5 Tests: yes
                  Build HDF5 Tools: yes
                      Threadsafety: yes
               Default API mapping: v112
    With deprecated public symbols: yes
            I/O filters (external): deflate(zlib),szip(encoder)
                               MPE: 
                     Map (H5M) API: no
                        Direct VFD: yes
                        Mirror VFD: yes
                (Read-Only) S3 VFD: no
              (Read-Only) HDFS VFD: no
                           dmalloc: no
    Packages w/ extra debug output: none
                       API tracing: no
              Using memory checker: no
   Memory allocation sanity checks: no
            Function stack tracing: no
                  Use file locking: yes
         Strict file format checks: no
      Optimization instrumentation: no
      
Press ENTER to run make

HDF5 1.12.2 parallel (impi) installation

export VER=1.12.2
rm -r hdf5-$VER
zcat hdf5-$VER.tar.gz | tar -xvf -

echo "unpacked hdf5-"$VER
echo "Press ENTER to continue";read aaa

module load intel/2022.2
module load impi/2021.6

export COMPILER=intel.22
export CC=mpiicc
export CXX=mpiicpc
export F77=mpiifort
export FC=mpiifort

export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -xskylake-avx512 -mtune=skylake-avx512 -mcmodel=medium -fargument-noalias-global -fp-model precise -shared-inte
l"
export COMPOPT="-fPIC -O2"

export SLURM_CPU_BIND=none   
export I_MPI_HYDRA_TOPOLIB=ipl
export I_MPI_HYDRA_BRANCH_COUNT=-1
export I_MPI_EXTRA_FILESYSTEM=1
export I_MPI_EXTRA_FILESYSTEM_FORCE=lustre

export BUILDDIR=hdf5-$VER
cd $BUILDDIR

parentdir="$(dirname "$(pwd)")"
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/hdf5-parallel/impi.21/$COMPILER/$VER/skl
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa
export SZIP=$PREFIX
export LD_RUN_PATH=$PREFIX/lib:$LIBRARY_PATH

export CFLAGS="  $COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -assume nostd_value -align array64byte -Wl,-rpath=$LD_RUN_PATH"
#export LDLAGS="-O3 -fPIC"

#Set to true to make sure taht it is really switched off
export HDF5_USE_FILE_LOCKING=FALSE

../hdf5-$VER/configure --prefix=$PREFIX --with-szlib=$SZIP --with-pic \
         --enable-build-mode=production \
         --enable-parallel \
         --disable-file-locking \
         --enable-direct-vfd \
#         --enable-file-locking --enable-recursive-rw-locks
#         --enable-direct-vfd --enable-mirror-vfd \
#          --enable-optimization=high \
#         --enable-threadsafe --enable-unsupported \

echo "Press ENTER to run make";read aaa
make -j8 | tee comp.out
echo "Press ENTER to run make check";read aaa
make -i check 2>&1 | tee check.out
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

HDF5 1.12.2 parallel (ompi) installation

module load intel/2022.2
module load openmpi/intel/4.1.4

export VER=1.12.2

export COMPILER=intel.22
export CC=mpicc
export CXX=mpicxx
export F77=mpifort
export FC=mpifort
export SLURM_CPU_BIND=none   
#export I_MPI_HYDRA_TOPOLIB=ipl
#export I_MPI_HYDRA_BRANCH_COUNT=-1
#export I_MPI_FABRICS=ofi
#export I_MPI_SHM_CELL_BWD_SIZE=2048000
 
export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -xskylake-avx512 -mtune=skylake-avx512 -mcmodel=medium -fargument-noalias-global -fp-model precise -shared-intel"

export BUILDDIR=build_hdf5_ompi_$COMPILER
mkdir $BUILDDIR
cd $BUILDDIR

parentdir="$(dirname "$(pwd)")"
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/hdf5-parallel/ompi/$COMPILER/$VER/skl
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa
export SZIP=$PREFIX
export LD_RUN_PATH=$PREFIX/lib:$LIBRARY_PATH

export CFLAGS="  $COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -align array64byte -Wl,-rpath=$LD_RUN_PATH"
#export LDLAGS="-O3 -fPIC"

#Set to true to make sure that it is really switched off
export HDF5_USE_FILE_LOCKING="FALSE"

../hdf5-$VER/configure --prefix=$PREFIX --with-szlib=$SZIP --with-pic \
         --enable-build-mode=production \
         --enable-optimization=high \
         --enable-parallel \
         --enable-threadsafe --enable-unsupported \
         --enable-direct-vfd --enable-mirror-vfd \
         --enable-fortran --enable-cxx \
         --disable-file-locking

#         --enable-file-locking --enable-recursive-rw-locks \

echo "Press ENTER to run make";read aaa
make -j8 | tee comp.out
echo "Press ENTER to run make check";read aaa
make check 2>&1 | tee check.out
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Libaec Installation

This script intends to install libaec together with HDF5. For a separate installation read more here.

Put this script in the directory where the libaec tar file is unpacked.

Adjust the path (PREFIX) where the library is to be installed.

Adjust compiler and optimisation flags.

module load intel/2022.2
module load openmpi/intel/4.1.4

export COMPILER=intel.22
export CC=mpicc
export CXX=mpicxx
export F77=mpifort
export FC=mpifort
export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -mcmodel=medium -fargument-noalias-global -align -fp-model precise -shared-intel"

#export I_MPI_HYDRA_TOPOLIB=ipl
#export I_MPI_HYDRA_BRANCH_COUNT=-1

export BUILDDIR=build_aec_$COMPILER
mkdir $BUILDDIR
cd $BUILDDIR

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/hdf5-parallel/ompi/$COMPILER/1.12.2/skl
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export LD_RUN_PATH=$LIBRARY_PATH
export CFLAGS="  $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export LDFLAGS="-O3 -fPIC"

../libaec-v1.0.6/configure --prefix=$PREFIX 

echo "Press ENTER to run make";read aaa
make -j2
echo "Press ENTER to run make check";read aaa
make check > check.out 2>&1 
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Libtiff

A software package containing a library for reading and writing the Tag Image File Format (TIFF), and a small collection of tools for simple manipulation of TIFF images.

Description

This software provides support for the Tag Image File Format (TIFF), a widely used format for storing image data.

Read more on libtiff home page. For documentation visit the libtiff documentation page.

Modules

| Version | Installation Path | modulefile | compiler |
|---|---|---|---|
| 4.0.10 | /sw/libraries/libtiff/4.0.10/skl/gcc-8.2 | libtiff/4.0.10 | gcc.8.2-openmpi.3.1.2 |

NCO

The NetCdf Operators

Description

The netCDF Operators, or NCO, are a suite of file operators which facilitate manipulation and analysis of self-describing data stored in the (freely available) netCDF and HDF formats.

Vendor: NCO
Installation Path: /sw/dataformats/nco/

| Version | Build Date |
|---|---|
| 4.7.7 | 2018 |
| 4.9.1 (default) | April 2019 |
| 5.1.7 | July 2023 |
  • Man pages are available for the binaries (see below).
  • Info pages can be viewed for the keyword nco. Issue info nco from the command line.
  • Manuals in several formats can be downloaded from the NCO home page.

Binaries

The following binaries are delivered: ncap2 ncatted ncbo ncclimo ncdiff ncea ncecat nces ncflint ncks ncpdq ncra ncrcat ncremap ncrename ncwa

Usage of the NCO tools at HLRN

Module-files and environment variables

To activate the package, issue

module load nco

from the command line.
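For illustration, a minimal session could look like the following sketch (the file names are placeholders; ncks and ncra are among the binaries listed above):

module load nco
# print the metadata (dimensions, variables, global attributes) of a file
ncks -M input.nc
# average the records (e.g. time steps) of several input files into one output file
ncra input_*.nc time_mean.nc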

Installation hints

4.9.1

  • built with intel/19.0.3; due to a compiler error, full optimisation does not work
  • prerequisites: hdf5, netcdf, antlr, gsl, udunits2
  • passed through most checks, except expected failures. The omp-related checks did not work, possibly due to a missing omp environment. Since nco is mostly IO-limited, this should not matter. The omp capabilities are not tested yet.
  • install script

netCDF

Network Common Data Form

Documentation

NetCDF is a suite of libraries for system- and machine-independent generation, access and exchange of array-oriented scientific data. The NetCDF libraries contain the interfaces for C, FORTRAN77, FORTRAN90 and C++. The libraries come with some binaries to access and reformat netCDF-formatted data.

Visit the unidata netcdf web page for detailed documentation.

Versions

The NetCDF library is available for several compilers. It is linked dynamically to the threadsafe hdf5 and szip libraries built with the same compiler.

| Version | compiler | hdf5 | Serial / Parallel | Remarks |
|---|---|---|---|---|
| 4.7.3 | gcc/4.5.8 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | gcc/7 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | gcc/8.3.0 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | intel/18.0.6 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | intel/18.0.6, impi/2018.5 | 1.10.5 (impi) | parallel | fortran-4.5.2, cxx-4.3.1 |
| 4.7.4 | gcc/8.3.0 | 1.12.0 | serial | fortran-4.5.3, cxx-4.3.1 |
| 4.7.4 | gcc/9.2.0, ompi/3.1.5 | 1.10.6 (ompi) | parallel | fortran-4.5.3, cxx-4.3.1 |
| 4.7.4 | intel/18.0.6, ompi/3.1.6 | 1.10.6 (ompi) | parallel | fortran-4.5.3, cxx-4.3.1 |
| 4.7.4 | intel/19.0.5 | 1.12.0 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.8.1 | intel/19.0.5 | 1.12.1 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.9.1 | intel/2022.2 | 1.12.2 | serial | fortran-4.6.0, cxx-4.3.1 |
| 4.9.1 | intel/2022.2, ompi/4.1.4 | 1.12.2 (ompi) | parallel | fortran-4.6.0, cxx-4.3.1 |
| 4.9.2 | gcc/9.3.0 | 1.12.2 | serial | fortran-4.6.1, cxx-4.3.1 |

Since the Intel compilers on Lise and Emmy differ, installing netCDF built with the Intel compilers is double work. Hence, the installation is done on one complex only. Please contact support if this does not meet your needs.

Modulefiles and environmental variables

Loading a NetCDF modulefile extends PATH with the path to the NetCDF binaries. This includes nc-config and nf-config, which can be used to gain information on the path to include files and libraries in compiler scripts and makefiles. Use

  • module show netcdf/<version>
  • module help netcdf/<version>

to investigate details on exported environment variables.

LD_RUN_PATH is extended when a netcdf module is loaded. It can be used to define the rpath in the binaries.

The netcdf-modules are not needed to run programs linked to a netcdf library!

Example to compile smp-programs using the netcdf library

Here we demonstrate the usage of environment variables to find the netCDF include files and to link the netCDF libraries.

Read more

Installing netcdf at HLRN

We discuss the installation of serial and parallel netcdf at HLRN

Read more

Subsections of netCDF

Install netCDF

Prerequisites

  • the installed HDF5 library. NetCDF inherits its parallelisation from the HDF5 library

General structure

The netcdf package consists of a C, FORTRAN and C++ part. The FORTRAN and C++ libraries are wrappers to call the C based library.

Basic installation procedure

  • define the compiler environment
  • define compiler flags for optimisation
  • define the path to dependency libraries
  • run configure to set up the Makefile. The configure script needs several flags.
  • run make
  • run make checks
  • run make install
  • clean up.

The installation procedure is carried out within a Lustre filesystem. This is mandatory for HDF5, since the file system properties are checked during configure. For netCDF it may not be necessary, but not all details of configure are known.

To be continued …

Linking with the netcdf library

Problem

After linking a program with a netcdf library some users report error messages like this:

./nctest: error while loading shared libraries: libnetcdff.so.7: cannot open shared object file: No such file or directory

One may find this error message by searching the web, so one is in good company. Here we present methods to use the tools available at HLRN to produce running binaries. First of all, an important statement:

Neither the netcdf nor the hdf5 module has any effect on your binary during runtime. It is useless to load them before running a program.

The following example program can be used for testing only. Don’t ask for a sophisticated science background. The example uses FORTRAN-90 notation.

download the example

To compile the code, the netcdf data and function types must be included by a use statement. The compiler needs to know where the Fortran modules are located. Instead of digging into installation details, we use the result of the script nf-config, which is part of the netcdf suite. To make it available, load a netcdf module:

module load gcc/8.3.0
module load netcdf/gcc.8/4.7.3

Note that the compiler version and the netcdf version must match, otherwise the Fortran netcdf module cannot be included. We can check this with

nf-config --fc

fortran .mod files from different major gfortran versions are incompatible.

Dynamic linking with the fortran netcdf library

Serial programs

Now you can compile

gfortran  -c -I`nf-config --includedir` test.f90

Instead of digging even more into the installation details, we use the result of nf-config in the link step:

gfortran -o nctest *.o `nf-config --flibs` -lm

To understand the details, just issue the commands nf-config --includedir and nf-config --flibs separately.
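For example, one can inspect what these helper scripts return (the exact output depends on the loaded netcdf module):

# directory containing the Fortran netcdf module files and headers
nf-config --includedir
# linker flags for the Fortran netcdf library and its dependencies
nf-config --flibs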

Now you may unload both the compiler and the netcdf module. The binary runs anyway, since the paths to the netcdf and hdf5 libraries are stored in the binary. This information was taken from the environment variable LD_RUN_PATH, which is set when a netcdf module is loaded. Note that the path to the hdf5 libraries linked to the netcdf library is inherited by our binary. The effect of LD_RUN_PATH is broken if the linker option rpath is used for other purposes. Say your example program links another library located in /mybla. Check out

gfortran -o nctest -Wl,-rpath=/mybla *.o `nf-config --flibs` -lm  
ldd nctest

The netcdf libraries as well as the compiler-specific libraries are no longer found. Here we notice that the compiler module also uses LD_RUN_PATH to set the path to the compiler-specific libraries. So handle LD_RUN_PATH with care, so as not to disturb this important feature! The correct way would be:

gfortran -o nctest -Wl,-rpath="/mybla:$LD_RUN_PATH" *.o `nf-config --flibs` -lm  

Hint: you may use readelf to see more details on the internal binary structure. Watch out for the library sections.
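A minimal sketch of such an inspection (the output depends on how nctest was linked):

# show the required shared libraries and the embedded runpath of the binary
readelf -d nctest | grep -E 'NEEDED|RUNPATH|RPATH'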

MPI programs

Now we need to use the compiler wrappers to include mpi.h and link the MPI libraries automatically. Here, rpath is used internally, which breaks the automatic propagation of the paths where the netcdf library resides. This also concerns the location of compiler-specific libraries like libgcc_s.so.1 or libquadmath.so.0. Problems may arise from this, since libquadmath.so.0 coming with gcc.9 contains more symbols than the system version. (This allows the netcdf library built with gcc.8 to be used together with gcc.9.) Hence, we have to use the rpath option to provide the information on the library path and propagate it to the linker. Load the module files:

module load gcc/9.2.0
module load netcdf/gcc.8/4.7.3
module load openmpi/gcc.9/3.1.5

and check LD_RUN_PATH:

echo $LD_RUN_PATH
/sw/comm/openmpi/3.1.5/skl/gcc/lib:/sw/dataformats/netcdf/gcc.8.3.0/4.7.3/skl/lib:/sw/compiler/gcc/9.2.0/skl/lib64/

Compile:

mpifort  -c -I`nf-config --includedir` test.f90
mpifort -o nctest *.o `nf-config --flibs` -Wl,-rpath=$LD_RUN_PATH -lm

To check the true content of the binary, unload the modules first

module unload openmpi/gcc.9/3.1.5
module unload gcc/9.2.0
module unload netcdf/gcc.8/4.7.3
ldd nctest

With readelf -a more details can be explored:

readelf -a nctest | less

….

Dynamic section at offset 0x2d58 contains 37 entries:

  Tag        Type                         Name/Value
0x0000000000000001 (NEEDED)             Shared library: [libnetcdff.so.7]
0x0000000000000001 (NEEDED)             Shared library: [libnetcdf.so.15]
0x0000000000000001 (NEEDED)             Shared library: [libgfortran.so.5]
0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED)             Shared library: [libmpi_usempif08.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libmpi_usempi_ignore_tkr.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libmpi_mpifh.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libmpi.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED)             Shared library: [libquadmath.so.0]
0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
0x000000000000001d (RUNPATH)            Library runpath: [/sw/comm/openmpi/3.1.5/skl/gcc/lib:/sw/dataformats/netcdf/gcc.8.3.0/4.7.3/skl/lib:/sw/compiler/gcc/9.2.0/skl/lib64/:/sw/tools/hwloc/1.1.13/skl/lib]
... 

Static linking with the fortran netcdf library

There is no simple way to find all required libraries, but information on the libraries can be gathered from the file libnetcdff.la made by libtool for dynamic linking.

NETCDF_LIB=`nc-config --libdir`
cat $NETCDF_LIB/libnetcdff.la | grep dependency_libs 

We configure the paths to the relevant libraries by hand:

NETCDF_LIB=`nc-config --libdir`
HDF5_LIB=/sw/dataformats/hdf5/1.8.20/skl/gcc.8.2.0.hlrn/lib
SZIP_LIB=/sw/dataformats/szip/2.1/skl/gcc.8.2.0.hlrn/lib

LIBS="$NETCDF_LIB/libnetcdff.a $NETCDF_LIB/libnetcdf.a $HDF5_LIB/libhdf5_hl.a  $HDF5_LIB/libhdf5.a $SZIP_LIB/libsz.a -lpthread -lcurl -lz -lm -ldl"

Now compile and link

gfortran -fPIC -c -I`nf-config --includedir` -Bstatic test.f90
gfortran -o nctest *.o $LIBS

and have fun.

Subsections of Linking with the netcdf library

Example netcdf program

program test_netcdf
! -----------------------------------------------------------------------
! The drunken divers testcase
! some lines are stolen from unidata - netcdf testprograms
! -----------------------------------------------------------------------

  use netcdf
  
  implicit none
  
  character*256     :: outputfile='divers_birth_day_drink.nc'
 
  integer :: stdout=6

!  include 'netcdf.inc'
  integer           :: ncid, k
  integer           :: latdim, londim, depthdim, timedim
  integer           :: vardims(4)
  integer           :: latid, lonid, depthid, timeid, ndepth=5
  integer           :: varid
  real              :: depth(5), drinks(1,1,5,1), degeast, degnorth
  real*8            :: rdays
  character (len = *), parameter :: varunit = "glasses"
  character (len = *), parameter :: varname = "number of drinks"
  character (len = *), parameter :: varshort = "drinks" 
  character (len = *), parameter :: units = "units"
  character (len = *), parameter :: long_name = "long_name"
  character (len = *), parameter :: lat_name = "latitude"
  character (len = *), parameter :: lon_name = "longitude"
  character (len = *), parameter :: lat_units = "degrees_north"
  character (len = *), parameter :: lon_units = "degrees_east"
  character (len = *), parameter :: depth_units = "m"
  character (len = *), parameter :: time_units = "days since 2000-01-01 00:00:00"
  character (len = *), parameter :: origin = "time_origin"
  character (len = *), parameter :: origin_val = "1-jan-2000 00:00:00" 
! -----------------------------------------------------------------------
!   define where and when the diver dives and 
!   in which depth he has how much birthday drinks
! -----------------------------------------------------------------------

    degnorth = 57.02
    degeast  = 20.3
    rdays    = 10.0
    do k=1, 5
      depth(k)  = float(k)*float(k)
      drinks(1,1,k,1) = depth(k)
    enddo
 
! -----------------------------------------------------------------------
!   create the file
! -----------------------------------------------------------------------
    
    call check( nf90_create(outputfile, nf90_clobber, ncid))
    write(stdout,*) 'file ',trim(outputfile),' has been created '
! -----------------------------------------------------------------------
!   define axis
! -----------------------------------------------------------------------

    call check( nf90_def_dim(ncid, 'longitude', 1, londim))
    call check( nf90_def_dim(ncid, 'latitude' , 1, latdim))
    call check( nf90_def_dim(ncid, 'depth' ,    ndepth, depthdim))
    call check( nf90_def_dim(ncid, 'time'     , nf90_unlimited, timedim))
    call check( nf90_def_var(ncid, lon_name, nf90_real, londim, lonid))
    call check( nf90_def_var(ncid, lat_name, nf90_real, latdim, latid))
    call check( nf90_def_var(ncid, 'depth',     nf90_real, depthdim, depthid))
    call check( nf90_def_var(ncid, 'time',      nf90_real, timedim, timeid))

    call check( nf90_put_att(ncid, latid, units, lat_units) )
    call check( nf90_put_att(ncid, lonid, units, lon_units) )
    call check( nf90_put_att(ncid, depthid, units, depth_units))
    call check( nf90_put_att(ncid, timeid,  units, time_units))
    call check( nf90_put_att(ncid, timeid,  origin, origin_val))
 
    vardims(1) = londim
    vardims(2) = latdim
    vardims(3) = depthdim
    vardims(4) = timedim
    
! -----------------------------------------------------------------------
!   define variables
! -----------------------------------------------------------------------
    call check( nf90_def_var(ncid, trim(varshort), nf90_real, vardims, varid))
    call check( nf90_put_att(ncid, varid, units ,trim(varunit)))
    call check( nf90_put_att(ncid, varid, long_name, trim(varname)))

    call check( nf90_enddef(ncid))
! -----------------------------------------------------------------------
!   now write something
! -----------------------------------------------------------------------

    call check( nf90_put_var(ncid, latid, degnorth))
    call check( nf90_put_var(ncid, lonid, degeast))
    call check( nf90_put_var(ncid, depthid, depth))
    
    call check( nf90_put_var(ncid, timeid, rdays))
    
    call check( nf90_put_var(ncid, varid, drinks))
!-----------------------------------------------------------------------
!   ready
!-----------------------------------------------------------------------
    call check( nf90_close(ncid))

contains
  subroutine check(status)
    integer, intent ( in) :: status
    
    if(status /= nf90_noerr) then 
      print *, trim(nf90_strerror(status))
      stop "stopped"
    end if
  end subroutine check  
end program test_netcdf

pigz

A parallel implementation of gzip for modern multi-processor, multi-core machine

Description

pigz is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.

Read more on the pigz home page. For the user manual, visit the pigz documentation.

Modules

| Version | Installation Path | modulefile | compiler | comment |
|---|---|---|---|---|
| 2.4 | /sw/tools/pigz/2.4/skl | pigz/2.4 | gcc.8.2.0 | |
| 2.4 | /sw/tools/pigz/2.4/skl | pigz/2.4 | gcc.9.2.0 | B |
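A minimal usage sketch (the module name is taken from the table above; the file name and thread count are placeholders):

module load pigz/2.4
# compress a file with 8 threads; the result is largefile.dat.gz
pigz -p 8 largefile.dat
# decompress it again
pigz -d largefile.dat.gz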

PROJ

Cartographic Projections Library

General Information

Vendor: USGS
Installation Path: /sw/dataformats/proj/<version>

| Version | build date | compiler | remark |
|---|---|---|---|
| 6.2.1 | 04/2019 | intel-18 | |
| 7.1.0 | 08/2020 | gcc-8 | |
| 9.2.1 | 04/2022 | gcc-9 | default |

A library is delivered for map projections of gridded data, used for example in CDO.

Additional binaries are available: cs2cs, geod, invgeod, invproj, nad2bin, proj

When a proj-module is loaded man-pages for cs2cs, geod, proj, pj_init are available.

Versions, modulefiles and environment variables

Type

module avail proj

for a list of available versions.

The module sets the path to the binaries.
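As a small sketch of the command line tools, cs2cs converts coordinates between reference systems (the coordinate values are arbitrary examples):

module load proj
# convert geographic coordinates (longitude latitude, WGS84) to UTM zone 32N
echo "9.94 51.53" | cs2cs +proj=longlat +datum=WGS84 +to +proj=utm +zone=32 +datum=WGS84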

R

R - statistical computing and graphics

Description

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity". …

Read more on the R project home page

| Version | Build Date | Installation Path | modulefile | compiler |
|---|---|---|---|---|
| R 3.5.1 (gcc) | 06-oct-2018 | /sw/viz/R/3.5.1 | R/3.5.1 | gcc/8.2.0.hlrn |
| R 3.6.2 (gcc) | 05-feb-2020 | /sw/viz/R/3.6.2 | R/3.6.2 | gcc/7.5.0 |
| R 4.0.2 (gcc) | 18-aug-2020 | /sw/viz/R/4.0.2 | R/4.0.2 | gcc/8.3.0 |
| rstudio 0.98.1102 | 01-Aug-2014 | /sw/viz/R/rstudio_1.1.453 | | |

For a manual consult the R home page.

Prerequisites

For the installation of R-packages by users with the help of rstudio or Rscript, the appropriate compiler module must be loaded in addition to the R-module.

R at HLRN

Modules

Before starting R, load a modulefile

module load R/version

This provides access to the script R that sets up an environment and starts the R binary. The corresponding man and info pages become available.

Info pages: R-admin, R-data, R-exts, R-intro, R-lang, R-admin, R-FAQ, R-ints

As programming environment, rstudio version 1.1.453 is installed and available when a module file for R is loaded. rstudio starts the version of R specified with the module file.

Running R on the frontends

This is possible, but resources and runtime are limited. Be friendly to other users and work on the shared compute nodes!

Running R on the compute nodes

Allocate capacity in the batch system, and log onto the related node:

$ salloc -N 1 -p large96:shared
$ squeue --job <jobID>

The output of salloc shows your job ID. With squeue you see the node you are going to use. Login with X11-forwarding:

$ ssh -X <nodename>

Load a module file and work interactively as usual. When ready, free the resources:

$ scancel <jobID>

You may also use srun:

$ srun -v -p large96:shared --pty --interactive bash

Do not forget to free the resources when ready.

R packages

List of installed R packages

The following packages are installed by default when a new version of R is built. Please contact support to extend this list.

  • Package list for 3.5.1

Users may request package installation via support or install in their HOME directory.

Building R-packages - users approach

Users may install their own packages in their HOME directory from the rstudio GUI or using Rscript. R packages must be built with the same compiler as R itself was built with, see the table above. This happens automatically when Rscript is used and the appropriate compiler module is loaded.
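A minimal sketch of such a user installation into HOME, assuming R 4.0.2 from the table above (the package name, library path and repository are just examples):

# load the compiler R was built with (see table above) and the matching R module
module load gcc/8.3.0
module load R/4.0.2
# create a personal library and install a package into it
mkdir -p ~/R/library
Rscript -e 'install.packages("ncdf4", repos="https://cran.uni-muenster.de/", lib="~/R/library")'
# make the personal library visible to R in later sessions
export R_LIBS_USER=~/R/library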

Building R-packages - administrators approach

R administrators may use rstudio or Rscript for installation. For installing packages in /sw/viz/R it is suggested to use Rscript, for example:

$ Rscript -e 'install.packages("'$package'",repos="'$REPOSITORY'",INSTALL_opts="--html")'

Using INSTALL_opts="--html" keeps the documentation of installed packages up to date!

This rapidly becomes work intensive when installing a huge bundle of packages or even the package set for a new R release. For convenience, we maintain a list of default packages and scripts to install them all. These are located in the installation directory:

  • install_packages,
  • install_cran
  • install_github
  • install_bioc
  • remove_package,
  • sync_wiki

These scripts also collect the workarounds needed to install stubborn packages whose developers do not support all Rscript options.

Read more

Documentation

Subsections of R

Installing R and R-packages

Installing a new R release and a bundle of default user packages is streamlined with some scripts. This text is for administrators of the R installation. Other users may find some hints here, if their installation to $HOME fails.

Installing R

  • Make a new directory. It is strongly suggested to follow the form /sw/viz/R/R-4.0.2/skl. Copy the installation scripts into this directory. Do not forget that other members of the sw-group may need to install here too.

    $ cd /sw/viz/R/R-4.0.2/skl
    $ cp ../../scripts/* .
    $ chmod g+rwX *
    
  • Edit the script get_R and update the version number. Run get_R. This downloads the requested version of R, inflates and unpacks the tar file and renames the source directory to build. Note: if you download by other means, please rename the R directory to build if you want to use the script install_R.

    $ ./get_R
    
  • Check or edit the script install_R. You may change there:

    • the default compiler to be used for building R and R packages. This compiler needs to be compatible with the compiler used for external packages like netcdf, openmpi, magick etc. If you change the default compiler, please change the other scripts too - see below.
    • the options for running configure
    • compiler options, especially the degree of optimisation and rpath. These options will be part of the default settings used by Rscript when building packages.
  • Run install_R. The script will stop several times requesting ENTER to continue. This helps to see whether the configuration, compilation and all checks finished with reasonable output. Note that the script changes permissions in the end to allow other members of the sw-group to install there too.

    $ ./install_R
    
  • Produce a module file. It should be sufficient to copy that of the previous version and change the version number in the file.

Single R-package installation

Before a package is installed, the correct compiler module must be loaded. Otherwise the system compiler is used. Since C++ is used a lot, this may result in an inconsistent package library. It is not necessary to load an R module file when the scripts described below are used.

Package installation may be done with the help of Rscript or directly from R or rstudio. We recommend using our scripts instead, since

  • they load the default compiler compatible with R
  • contain fixes and workarounds needed to install some packages
  • set the permissions for the installed files, which is often forgotten
  • help with bookkeeping of package lists.

The scripts produce a one-line R command that is executed immediately by Rscript. It mainly employs the R functions install.packages (not to be confused with our script of the same name!) or githubinstall.

We support three different repositories:

  • cran ( https://cran.uni-muenster.de/ ): the main repository for releases of R packages

    $ ./install_cran <packagename>
    
  • github: the detailed search for the package is done with githubinstall, used from within R.

    $ ./install_github <packagename>
    
  • BiocManager ( https://www.bioconductor.org/install ): a specific repository providing tools for the analysis and comprehension of high-throughput genomic data.

    $ ./install_bioc <packagename> <version>
    

    The version is optional.

If the installation is successful, the scripts ask whether the package should be added to the list of packages that will be installed by default with a new R version. In this case, one of the package lists in /sw/viz/R/package_list is updated and a copy of the previous list is stored in /sw/viz/R/package_list_save. These lists can be edited by hand.

To remove a package, you may use

$ ./remove_package <packagename>

Automatic removal from the list of default packages is not implemented yet.

Default R-packages

To install the bundle of predefined R-packages, use the script install_packages. It

  • loads the appropriate compiler module
  • employs install_cran, install_github and install_bioc to install the packages
  • sets the correct permissions for the installed files and libraries
  • documents success or failure of the installation
$ ./install_packages

The lists are located in /sw/viz/R/package_list. They can be maintained with any editor. They are the result of user requests in the past. They are over-complete, since many packages are installed in any case as prerequisites for other packages. Success or failure is reported in packages_success.log and packages_failed.log.

Using install_packages is coffee consuming. The script needs several hours to install the full list of packages. Some prerequisites are installed several times.

Known issues

The installation of some packages fails and is reported in packages_failed.log. Repeating the installation of the single packages helps in all cases; the reason is unknown. Search in /sw/viz/R/package_list to find which install source is the appropriate one. Using github instead of cran may result in a long-lasting version update of all prerequisites; however, no package failure has been reported in that case.

Rmpi is linked with openmpi. The intel compilers and impi are not tested yet.

Some packages do not set rpath properly, do not accept libraries at non-default places, or do not consider the flags configure.vars and configure.args used by the R function install.packages. In this case, there are limited means to propagate information on include paths or rpath to the package installation process. One way to do this is to put the appropriate compiler flags in ~/.R/Makevars. The scripts do not overwrite this file, but save a copy first that is restored at the end of the package installation.

Szip

Szip, fast and lossless compression of scientific data

Documentation

Szip is a freeware, portable, general purpose lossless compression program. It offers high speed and good compression, but also has high memory demands.

The Szip library is now replaced by the aec library.

read more

Using Szip compression in HDF5: Szip is a stand-alone library that is configured as an optional filter in HDF5. Depending on which Szip library is used (encoder enabled or decode-only), an HDF5 application can create, write, and read datasets compressed with Szip compression, or can only read datasets compressed with Szip.

Applications use Szip by setting Szip as an optional filter when a dataset is created. If the Szip encoder is enabled with the HDF5 library, data is automatically compressed and decompressed with Szip during I/O. If only the decoder is present, the HDF5 library cannot create and write Szip-compressed datasets, but it automatically decompresses Szip-compressed data when data is read.

read more
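As an illustration, an existing HDF5 file can be repacked with Szip compression using the h5repack tool from the HDF5 suite, provided the HDF5 installation was built with the Szip encoder (file names are placeholders):

# recompress all datasets with Szip, entropy coding, 8 pixels per block
h5repack -f SZIP=8,EC input.h5 output_szip.h5
# inspect the filter settings of the result
h5dump -p -H output_szip.h5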

Download the code from HDF Group.

Versions

Version 2.1.1 is installed for all relevant compilers. Find the library in /sw/dataformats/szip/2.1.1/skl. Note the license restriction.

License

Szip may be used for scientific purposes in conjunction with HDF data handling. read more.

Modules

There is no module file yet.

Building

The szip libraries are built with autotools. High optimisation is enabled; all tests are passed. Please see the file run_configure in the build directory.

UDUNITS2

Unidata UDUNITS2 Package, Conversion and manipulation of units

Description

Conversion of unit specifications between formatted and binary forms, arithmetic manipulation of units, and conversion of values between compatible scales of measurement. The Udunits2 package supports units of physical quantities (e.g., meters, seconds). Specifically, it supports conversion between string and binary representations of units, arithmetic manipulation of units, and conversion of numeric values between compatible units. Udunits2 is used by several other packages at HLRN.

Vendor: Unidata
Installation Path: /sw/dataformats/udunits/

| Version | compiler |
|---|---|
| 2.2.26 | gcc-7 |
| 2.2.26 | gcc-8 |
| 2.2.26 | intel |
| 2.2.28 | gcc-9 |
| 2.2.28 | intel.22 |
  • The udunits home page.
  • If an udunits module is loaded an info page is available for the keywords udunits2, udunits2lib and udunits2prog.

Modules

To activate the package, issue module load udunits from the command line or put this line into your login shell. For more versions see module avail udunits

Examples

To activate udunits type

module load udunits/2.1.24_intel

Direct calls of the udunits binary will be of minor importance. After loading the module, one may try

udunits
You have: J
You want: cal   
    <cal> = <J>*0.238846
    <cal> = <J>/4.1868

ECCODES

ECMWF application programming interface

Description

ecCodes is a package developed by ECMWF which provides an application programming interface and a set of tools for decoding and encoding messages in the following formats:

  • WMO FM-92 GRIB edition 1 and edition 2
  • WMO FM-94 BUFR edition 3 and edition 4
  • WMO GTS abbreviated header (only decoding).

A useful set of command line tools provide quick access to the messages. C, Fortran 90 and Python interfaces provide access to the main ecCodes functionality.

ecCodes is an evolution of GRIB-API. It is designed to provide the user with a simple set of functions to access data from several formats with a key/value approach.

read more

Versions

| Version | Build Date | Installation Path | modulefile | compiler | libraries |
|---|---|---|---|---|---|
| 2.12.0 | 12 apr 2019 | /sw/dataformats/eccodes | yes | intel.18, intel.19 | hdf5 1.10.5, netcdf 4.6.3 |
| 2.31.0 | 21 jul 2023 | /sw/dataformats/eccodes | yes | intel.22, gcc.9 | hdf5 1.12.2, netcdf 4.9.1 |

Documentation

  • Detailed description is found at the ECCODES home page,
  • use option --help to see an overview of relevant options

Usage at HLRN

Modulefiles and environmental variables

  • load a module to activate the path to the binaries and set some other environment variables. Use module show for more details.
  • link the eccodes-library for processing GRIB and BUFR formatted data

Program start

The following binaries are provided:

bufr_compare bufr_filter codes_bufr_filter grib2ppm grib_filter grib_ls gts_copy metar_compare metar_ls bufr_compare_dir bufr_get codes_count grib_compare grib_get grib_merge gts_dump metar_copy tigge_accumulations bufr_copy bufr_index_build codes_info grib_copy grib_get_data grib_set gts_filter metar_dump tigge_check bufr_count bufr_ls codes_parser grib_count grib_histogram grib_to_netcdf gts_get metar_filter tigge_name bufr_dump bufr_set codes_split_file grib_dump grib_index_build gts_compare gts_ls metar_get tigge_split

HLRN specific installation

The module sets the following environment variables:

  • PATH: path to the binaries
  • PKG_CONFIG_PATH: for usage with pkg-config
  • LD_RUN_PATH: for setting rpath
  • ECCODES_DIR, ECCODES_VERSION, ECCODES_INCLUDE, ECCODES_LIB, ECCODES_INCLUDE_DIR, ECCODES_LIB_DIR: for linking eccodes into other software

Note: these variables are recommended by ECMWF, but not standard, so do not expect them elsewhere.
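For example, after loading an eccodes module these variables can be used to compile and link a program against the library (the source file name is a placeholder):

# compile and link a C program against ecCodes using the module's environment variables
gcc -I$ECCODES_INCLUDE_DIR my_grib_tool.c -L$ECCODES_LIB_DIR -leccodes -Wl,-rpath=$LD_RUN_PATH -o my_grib_tool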

run_cmake script for installation with cmake

Subsections of ECCODES

Subsections of Installing ECCODES

Install ecCodes with intel compilers

module load intel/2022.2
export CC=icc
export FC=ifort

module load cmake/3.26.4
module load netcdf/intel/4.9.1
 
mkdir build ; cd build
echo `pwd`
parentdir="$(dirname "$(pwd)")"
export PREFIX=$parentdir
echo "building for "$PREFIX
echo "Press ENTER to run cmake";read aaa

export HDF5=/sw/dataformats/hdf5/intel.22/1.12.2/skl
#export HDF5_LIBRARIES=$HDF5/lib 
export HDF5_LIBRARIES="$HDF5/lib/libhdf5.a $HDF5/lib/libhdf5_hl.a"
export HDF5_INCLUDE_DIRS=$HDF5/include
export PATH=$HDF5:$PATH

export NETCDF=`nc-config --prefix`

FFLAGS="-O3 -fPIC -xCORE-AVX512 -qopt-zmm-usage=high"
CFLAGS="-O3 -fPIC -xCORE-AVX512 -qopt-zmm-usage=high" 

export CC=icc
export FC=ifort
export CXX=icpc

cmake \
  -DCMAKE_C_COMPILER="$CC" -DCMAKE_Fortran_COMPILER="$FC" \
  -DCMAKE_C_FLAGS="$CFLAGS" -DCMAKE_Fortran_FLAGS="$FFLAGS" \
  -DCMAKE_CXX_COMPILER="$CXX" -DCMAKE_CXX_FLAGS="$CFLAGS" \
  -DBUILD_SHARED_LIBS=BOTH \
  -DENABLE_MEMFS=ON \
  -DENABLE_PNG=ON \
  -DENABLE_JPG=ON \
  -DENABLE_AEC=ON -DAEC_DIR=/sw/dataformats/aec/1.0.6/skl/intel.19 \
  -DENABLE_FORTRAN=ON \
  -DENABLE_NETCDF=ON \
  -DENABLE_ECCODES_OMP_THREADS=ON \
  -DNETCDF_PATH=$NETCDF \
  -DENABLE_INSTALL_ECCODES_DEFINITIONS=ON \
  -DENABLE_INSTALL_ECCODES_SAMPLES=ON \
  -DCMAKE_INSTALL_PREFIX="$PREFIX" ../src

echo "Press ENTER to run make";read aaa
make -j8
echo "Press ENTER to run ctest";read aaa
ctest
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Install ecCodes with gcc

module load gcc/9.3.0

module load cmake/3.26.4
module load netcdf/gcc.9/4.9.2
 
export COMPILER=gcc.9.3.0

rm -r build; mkdir build ; cd build
echo `pwd`
parentdir="$(dirname "$(pwd)")"
export PREFIX=$parentdir
echo "building for "$PREFIX
echo "Press ENTER to run cmake";read aaa

export HDF5=/sw/dataformats/hdf5/$COMPILER/1.12.2/skl
#export HDF5_LIBRARIES=$HDF5/lib 
export HDF5_LIBRARIES="$HDF5/lib/libhdf5.a $HDF5/lib/libhdf5_hl.a"
export HDF5_INCLUDE_DIRS=$HDF5/include
export PATH=$HDF5:$PATH

export NETCDF=`nc-config --prefix`

FFLAGS="-O3 -fPIC -march=skylake-avx512 -Wl,-rpath=$LD_RUN_PATH"
CFLAGS="-O3 -fPIC -march=skylake-avx512 -Wl,-rpath=$LD_RUN_PATH" 

export CC=gcc
export CXX=g++
export FC=gfortran

cmake \
  -DCMAKE_C_COMPILER="$CC" -DCMAKE_Fortran_COMPILER="$FC" \
  -DCMAKE_C_FLAGS="$CFLAGS" -DCMAKE_Fortran_FLAGS="$FFLAGS" \
  -DCMAKE_CXX_COMPILER="$CXX" -DCMAKE_CXX_FLAGS="$CFLAGS" \
  -DBUILD_SHARED_LIBS=BOTH \
  -DENABLE_MEMFS=ON \
  -DENABLE_PNG=ON \
  -DENABLE_JPG=ON \
  -DENABLE_AEC=ON -DAEC_DIR=/sw/dataformats/aec/1.0.6/skl/intel.19 \
  -DENABLE_FORTRAN=ON \
  -DENABLE_NETCDF=ON \
  -DENABLE_ECCODES_OMP_THREADS=ON \
  -DNETCDF_PATH=$NETCDF \
  -DENABLE_INSTALL_ECCODES_DEFINITIONS=ON \
  -DENABLE_INSTALL_ECCODES_SAMPLES=ON \
  -DECCODES_INSTALL_EXTRA_TOOLS=ON \
  -DCMAKE_INSTALL_PREFIX="$PREFIX" ../src

echo "Press ENTER to run make";read aaa
make -j8
echo "Press ENTER to run ctest";read aaa
ctest
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Subsections of Engineering

Abaqus

Warning

Our Abaqus license ran out on April 30th, 2024. You will only be able to resume working with Abaqus products if you can bring your own license (see How to bring your own license). Alternatively, you might consider using other Finite Element Analysis (FEA) tools such as Mechanical or LS-DYNA from Ansys.

A Finite Element Analysis Package for Engineering Application

To see our provided versions type: module avail abaqus

ABAQUS 2019 is the default. ABAQUS 2018 is the first version with multi-node support. ABAQUS 2016 is the last version including Abaqus/CFD.

Conditions for Usage and Licensing

Access to and usage of the software is regionally limited:

  • Only users from Berlin (account names “be*”) can use the ZIB license on NHR@ZIB systems. This license is strictly limited to teaching and academic research for non-industry funded projects only. Usually, there are always sufficient licenses for Abaqus/Standard and Abaqus/Explicit command-line based jobs. You can check this yourself (just in case):
    # on NHR@ZIB systems
    lmutil lmstat -S -c 1700@10.241.101.140 | grep -e "ABAQUSLM:" -e "Users of abaqus" -e "Users of parallel" -e "Users of cae"
  • Users from other German states can use the software installed on HLRN but have to use their own license from their own license server (see How to bring your own license).

Example Jobscripts

The input file of the test case (Large Displacement Analysis of a linear beam in a plane) is: c2.inp

Distributed Memory Parallel Processing

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=2 
#SBATCH --ntasks-per-node=48
#SBATCH -p standard96:test
#SBATCH --mail-type=ALL
#SBATCH --job-name=abaqus.c2
 
module load abaqus/2020
 
# host list:
echo "SLURM_NODELIST:  $SLURM_NODELIST"
create_abaqus_hostlist_for_slurm
# This command will create the file abaqus_v6.env for you.
# If abaqus_v6.env exists already in the case folder, it will append the line with the hostlist.
 
### ABAQUS parallel execution
abq2019 analysis job=c2 cpus=${SLURM_NTASKS} standard_parallel=all mp_mode=mpi interactive double
 
echo '#################### ABAQUS finished ############'

SLURM logs to: slurm-<your job id>.out

The log of the solver is written to: c2.msg

Warning

The small number of elements in this example does not allow using 2x96 cores. Hence, 2x48 are utilized here. But typically, if there is sufficient memory per core, we recommend using all physical cores per node (such as, in the case of standard96: #SBATCH --ntasks-per-node=96). Please refer to Compute node partitions to see the number of cores on your selected partition and machine (Lise, Emmy).

Single Node Processing

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=1  ## 2016 and 2017 do not run on more than one node
#SBATCH --ntasks-per-node=96
#SBATCH -p standard96:test
#SBATCH --job-name=abaqus.c2
 
module load abaqus/2016
 
# host list:
echo "SLURM_NODELIST:  $SLURM_NODELIST"
create_abaqus_hostlist_for_slurm
# This command will create the file abaqus_v6.env for you.
# If abaqus_v6.env exists already in the case folder, it will append the line with the hostlist.
 
### ABAQUS parallel execution
abq2016 analysis job=c2 cpus=${SLURM_NTASKS} standard_parallel=all mp_mode=mpi interactive double
 
echo '#################### ABAQUS finished ############'

If you cannot set up your case input files *.inp by other means, you may start a CAE GUI as a last resort on our compute nodes. But be warned: to keep the OS images on the compute nodes fast and small, only a minimal set of graphics drivers/libs is installed, and X-window interactions involve high latency. If you comply with our license terms (discussed above), you can use one of our four CAE licenses. In this case, please always add

#SBATCH -L cae

to your job script. This ensures that the SLURM scheduler starts your job only if a CAE license is available.

srun -p standard96:test -L cae --x11 --pty bash
 
# wait for node allocation (a single node is the default), then run the following on the compute node
 
module load abaqus/2022
abaqus cae -mesa

How to bring your own license

For some preinstalled software products HLRN offers a limited number of licenses only. In case you do not want to queue for a certain license you can bring your own. This requires the following steps:

  1. Your license server must be accessible from 130.73.234.140 (Berlin) or 134.76.1.14 (Göttingen). If necessary, ask your local admin to allow external access.
  2. Send a mail to nhr-support@gwdg.de including the name (FQDN), IP and ports of your license server. Please configure the ports of your license server statically (see remark). Please let us know if you will use Berlin, Göttingen or both systems. In case your license server is inside a subnet, we also need to know the name/IP of its access point.
  3. We will setup IP forwarding rules, such that your job script can call your license server from our compute and post-processing nodes.

Remark

FLEXlm license servers usually use two TCP ports (“SERVER” and “VENDOR”), and both of them need to be configured in the HLRN license gateway. By default, the VENDOR port is dynamic (i.e. chosen randomly) and may change on restart of the license server. To remain accessible from the HLRN, both SERVER and VENDOR ports need to be statically configured (VENDOR xyz port=1234) in the license server. See also http://www.hlynes.com/?p=278
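A license file with statically configured ports might look like the following sketch (host name, host ID, vendor daemon name and port numbers are placeholders):

SERVER license.example.org 0019ABCDEF01 1700
VENDOR ABAQUSLM port=1701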

STAR-CCM+

A Package for Computational Fluid Dynamics Simulations

General Information

Producer: Siemens PLM Software (formerly CD-adapco Group)

Note

This documentation describes the specifics of installation and usage of STAR-CCM+ at HLRN. Introductory courses for STAR-CCM+ as well as courses for special topics are offered by CD-adapco and their regional offices, e.g. in Germany. It is strongly recommended to take at least an introductory course (please contact Siemens PLM Software).

Modules

The following table lists installed STAR-CCM+ versions.

| Version | Module File | Remarks |
|---|---|---|
| 14.02.012-R8 | starccm/12.04.012-r8 | double precision version |
| 14.04.011-R8 | starccm/14.04.011-r8 | double precision version |
Note

The module name is starccm. Other versions may be installed. Inspect the output of:
module avail starccm

Functionality

STAR-CCM+ is a powerful finite-volume-based program package for modelling of fluid flow problems. (The name STAR stands for “Simulation of Turbulent flow in Arbitrary Regions”.) The STAR-CCM+ package can be applied to a wide range of problems such as

  • Aerospace Engineering
  • Turbomachinery
  • Chemical Process Engineering
  • Automotive Engineering
  • Building and Environment Engineering

Conditions for Usage and Licensing

All usage of STAR-CCM+ products at HLRN is strictly limited to teaching and academic research for non-industry funded projects only.

In order to run STAR-CCM+ on HLRN-IV, you have to specify the parameters -licpath and -podkey, as shown in the example script below. Users with their own licenses can specify the parameters to point to their own licenses.

Note

To use STAR-CCM+ you need to mail nhr-support@gwdg.de and ask to become a member of the UNIX group adapco. In the same email you may apply for a Power On Demand (POD) license key by stating the estimated amount of wallclock time.

Details of the HLRN Installation of STAR-CCM+

STAR-CCM+ is installed below /sw/eng/starccm/. We provide module files which make all environment settings for the use of a specific STAR-CCM+ version.

STAR-CCM+ products come with complete documentation. The User Guide is available in PDF format, see directory /sw/eng/starccm/<version>/STAR-CCM+<version>/doc.

Example Jobscripts

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH -p medium
#SBATCH --mail-type=ALL
#SBATCH --job-name=StarCCM
 
module load starccm/14.04.011-r8
 
## create the host list for starccm+
srun hostname -s | sort | uniq -c | awk '{ print $2":"$1 }' > starhosts.${SLURM_JOB_ID}
 
export CDLMD_LICENSE_FILE=<port@licenseserver>
export PODKEY=<type your podkey here>
export MYCASE=<type your sim file name>
 
## run starccm+
starccm+ -dp -np ${SLURM_NTASKS} -batch ${MYCASE} \
 -power -podkey ${PODKEY} -licpath ${CDLMD_LICENSE_FILE} \
 -machinefile starhosts.${SLURM_JOB_ID} -mpi intel
 
echo '#################### StarCCM+ finished ############'
rm starhosts.$SLURM_JOB_ID
Note

Despite the fact that -machinefile starhosts.$SLURM_JOB_ID is used, you have to specify the number of worker processes (-np).

Tutorial Cases for STAR-CCM+

Tutorial case files can be found in /sw/eng/starccm/<version>/STAR-CCM+<version>/doc/startutorialsdata and (with solutions) in /sw/eng/starccm/<version>/STAR-CCM+<version>/tutorials, respectively; verification data is in /sw/eng/starccm/<version>/STAR-CCM+<version>/VerificationData.

Exciting

Description

exciting is an ab initio code that implements density-functional theory (DFT), capable of reaching the precision of micro Hartree. As its name suggests, exciting has a strong focus on excited-state properties. Among its features are:

  • G0W0 approximation;
  • Solution to the Bethe-Salpeter equation (BSE), to compute optical properties;
  • Time-dependent DFT (TDDFT) in both frequency and time domains;
  • Density-functional perturbation theory for lattice vibrations.

exciting is an open-source code, released under the GPL license.

More information is found on the official website: https://exciting-code.org/

Modules

exciting is currently available only on Lise. The standard species files deployed with exciting are located in $EXCITING_SPECIES. If you wish to use a different set, please refer to the manual.

The most recent compiled version is neon, and it has been built with the intel-oneapi compiler (v. 2021.2) and linked to Intel MKL (including FFTW). N.B.: exciting fluorine is also available.

The exciting module depends on impi/2021.7.1.

Example Jobscripts

#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=exciting
  
module load impi/2021.7.1
# Load exciting neon
# If you want to use fluorine, replace with exciting/009-fluorine
module load exciting/010-neon
  
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
   
# Adjust the maximum stack size of OpenMP threads
export OMP_STACKSIZE=512m
  
# Do not use the CPU binding provided by slurm
export SLURM_CPU_BIND=none
   
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
   
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
  
mpirun exciting

FFTW3

A C-subroutine library for computing discrete Fourier transforms

Description

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions.

Read more on the fftw3 home page. For a manual, consult the online manual or download the PDF.

Versions

| Version | Build Date | Installation Path | modulefile | compiler |
|---|---|---|---|---|
| Fftw3/3.3.7 | unknown | /cm/shared/apps/fftw/openmpi | fftw3/openmpi/gcc/64/3.3.7 | gcc |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/gcc.7.5.0/3.3.8/ | | gcc/7.5.0 |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/gcc.8.3.0/3.3.8/ | | gcc/8.3.0 |
| Fftw3/3.3.8 | 16-feb-2020 | /sw/numerics/fftw3/ompi/gcc.9.2.0/3.3.8/ | fftw3/ompi/gcc/3.3.8 | gcc/9.2.0, openmpi/gcc.9/3.1.5 |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/impi/gcc.9.2.0l/3.3.8/ | fftw3/impi/gcc/3.3.8 | gcc/9.2.0, impi/2018.5 |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/impi/intel/3.3.8/ | fftw3/impi/intel/3.3.8 | intel/19.0.1, impi/2018.5 |

The single, long-double, omp and threads enabled versions are installed. The ompi and impi installations contain the serial libraries too. Both the shared and the static versions are available.

For fftw3 with Intel compilers, please also consider using the MKL! Read more

Modules and Usage at HLRN

The library is included in several software packages. A module file gives access to a few binaries: fftwf-wisdom, fftwl-wisdom, fftw-wisdom and fftw-wisdom-to-conf.

  • module show fftw3/version
  • module help fftw3/version

deliver details on paths and environmental variables.

  • module load fftw3/version

defines environmental variables:

  • PATH
  • LD_RUN_PATH
  • PKG_CONFIG_PATH
  • LIBRARY_PATH

The modules do not have any effect during runtime.

Precision

You may link to -lfftw3f (single) or -lfftw3l (long-double) instead of or in addition to -lfftw3 (double).

You can see all provided versions directly after you loaded a FFTW module with: ls ${LIBRARY_PATH%%:*}
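A minimal sketch of compiling and linking against the double and single precision libraries (the source file name is a placeholder; the include path is guessed relative to the library path set by the module):

# load a compiler and an fftw3 module (see the tables above for available versions)
module load gcc/9.2.0
module load fftw3/<version>
# LIBRARY_PATH lets the compiler find -lfftw3*, LD_RUN_PATH provides the runpath
gcc my_fft.c -I${LIBRARY_PATH%%:*}/../include -lfftw3 -lfftw3f -lm -Wl,-rpath=$LD_RUN_PATH -o my_fft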

Installation at HLRN

Fftw3 is built from source. The current version is built with several compilers. High-end optimisation is used. Read more

All libraries passed through the basic checks.

Subsections of FFTW3

The Fftw3-Installation at HLRN

Download:

fftw3 - downloads

Installation path:

/sw/numerics/fftw3/<mpi-version>/<compiler-version>/3.3.8/skl; untar and rename/move the source directory to build

configure:

configure --help reveals the most important switches (a combined example follows this list):

  • CC, FC, MPICC etc. to define the compiler
  • CFLAGS as a slot for compiler options; do not forget -Wl,-rpath=$LD_RUN_PATH to burn the path to the compiler libraries. intel/18.0.3 does not have the LD_RUN_PATH. The path to the fftw3 objects is burned in automatically by configure/make.
  • --enable-shared to also build shared libraries
  • --enable-single, --enable-long-double to build for different numerical accuracy
  • --enable-omp, --enable-threads, --enable-mpi
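A sketch of a configure call combining these switches (the prefix follows the installation path scheme above; compilers and flags are placeholders):

# run inside the renamed "build" directory (see above)
export LD_RUN_PATH=$LIBRARY_PATH
./configure --prefix=/sw/numerics/fftw3/ompi/gcc.9.2.0/3.3.8/skl \
    CC=gcc MPICC=mpicc F77=gfortran \
    CFLAGS="-O3 -fPIC -Wl,-rpath=$LD_RUN_PATH" \
    --enable-shared --enable-single --enable-omp --enable-threads --enable-mpi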

GPAW

Description

GPAW is a density functional theory Python code based on the projector-augmented wave method. Plane waves, real-space uniform grids, multi-grid methods and the finite-difference approximation, or atom-centered basis functions can be used for the description of wave functions. The code relies on the Atomic Simulation Environment (ASE).

GPAW documentation and other material can be found on the GPAW website.

The GPAW project is licensed under GNU GPLv3.

Prerequisites

GPAW needs Python3 and ASE for proper execution. At HLRN, corresponding environment modules (anaconda3 and ase, respectively) must be loaded first. For its MPI-parallel execution GPAW was linked against Intel-MPI 2019, so one of the impi/2019.* environment modules must also be loaded to provide the mpirun job starter.

Only members of the gpaw user group have access to GPAW installations provided by HLRN. To have their user ID included in this group, users can send a message to their consultant or to HLRN support.

Modules

The environment modules shown in the table below are available to include GPAW in the user’s shell environment. To see what is installed and what is the current default version of GPAW, an overview can be obtained by saying module avail gpaw.

| GPAW version | GPAW modulefile | GPAW requirements |
|---|---|---|
| 20.1.0 | gpaw/20.1.0 (Lise only) | anaconda3/2019.10, ase/3.19.1, impi/2019.* |

When a gpaw module has been loaded successfully, the command gpaw info can be used to show supported features of this GPAW installation.

Job Script Examples

  1. For Intel Cascade Lake compute nodes – simple case of a GPAW job with 192 MPI tasks distributed over 2 nodes running 96 tasks each (Berlin only)
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 2
#SBATCH --tasks-per-node 96
 
module load anaconda3/2019.10
module load ase/3.19.1
module load impi/2019.9
module load gpaw/20.1.0
 
export SLURM_CPU_BIND=none
 
mpirun gpaw python myscript.py

GROMACS

Description

GROMACS is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers and fluid dynamics.

Read more on the GROMACS home page.

Strengths

  • GROMACS provides extremely high performance compared to all other programs.
  • GROMACS can make simultaneous use of both CPU and GPU available in a system. There are options to statically and dynamically balance the load between the different resources.
  • GROMACS is user-friendly, with topologies and parameter files written in clear text format.
  • Both run input files and trajectories are independent of hardware endianness, and can thus be read by any version of GROMACS.
  • GROMACS comes with a large selection of flexible tools for trajectory analysis.
  • GROMACS can be run in parallel, using the standard MPI communication protocol.
  • GROMACS contains several state-of-the-art algorithms.
  • GROMACS is Free Software, available under the GNU Lesser General Public License (LGPL).

Weaknesses

  • GROMACS does not do much further analysis, in order to achieve very high simulation speed.
  • Sometimes it is challenging to get non-standard information about the simulated system.
  • Different versions sometimes have differences in default parameters/methods. Reproducing older version simulations with a newer version can be difficult.
  • Additional tools and utilities provided by GROMACS are sometimes not of the best quality.

GPU support

GROMACS automatically uses any available GPUs. To achieve the best performance GROMACS uses both GPUs and CPUs in a reasonable balance.

QuickStart

Environment modules

The following versions have been installed:

Modules for running on CPUs

| Version | Installation Path | modulefile | compiler | comment |
|---|---|---|---|---|
| 2018.4 | /sw/chem/gromacs/2018.4/skl/impi | gromacs/2018.4 | intelmpi | |
| 2018.4 | /sw/chem/gromacs/2018.4/skl/impi-plumed | gromacs/2018.4-plumed | intelmpi | with plumed |
| 2019.6 | /sw/chem/gromacs/2019.6/skl/impi | gromacs/2019.6 | intelmpi | |
| 2019.6 | /sw/chem/gromacs/2019.6/skl/impi-plumed | gromacs/2019.6-plumed | intelmpi | with plumed |
| 2021.2 | /sw/chem/gromacs/2021.2/skl/impi | gromacs/2021.2 | intelmpi | |
| 2021.2 | /sw/chem/gromacs/2021.2/skl/impi-plumed | gromacs/2021.2-plumed | intelmpi | with plumed |
| 2022.5 | /sw/chem/gromacs/2022.5/skl/impi | gromacs/2022.5 | intelmpi | |
| 2022.5 | /sw/chem/gromacs/2022.5/skl/impi-plumed | gromacs/2022.5-plumed | intelmpi | with plumed |

Modules for running on GPUs

| Version | Installation Path | modulefile | compiler | comment |
|---|---|---|---|---|
| 2023.0 | /sw/chem/gromacs/2023.0/a100/tmpi_gcc | gromacs/2023.0_tmpi | | |

Release notes can be found here.

These modules can be loaded by using a module load command. Note that the Intel MPI module file should be loaded first:

module load impi/2019.5 gromacs/2019.6

This provides access to the binary gmx_mpi which can be used to run simulations with sub-commands as gmx_mpi mdrun

In order to run simulations, an MPI launcher should be used:

mpirun gmx_mpi mdrun MDRUNARGUMENTS

In order to use the GPU-enabled version (available only on the bgn nodes), load the following module files. Note that the compiler and CUDA modules have to be loaded before the GROMACS module:

module load gcc/11.3.0 intel/2023.0.0 cuda/11.8 gromacs/2023.0_tmpi

Submission script examples

Simple CPU job script

A simple case of a GROMACS job using a total of 640 CPU cores for 12 hours. The requested number of cores in this example does not fill all available cores on the allocated nodes; the job will execute 92 ranks on 3 nodes plus 91 ranks on 4 nodes. You can use this example if you know the exact number of ranks you want to use.

#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -n 640
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load gromacs/2019.6
 
mpirun gmx_mpi mdrun MDRUNARGUMENTS

Whole node CPU job script

In case you want to use all cores on the allocated nodes, the batch system offers other options to request the number of nodes and the number of tasks per node. The example below results in 672 ranks.

#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 7
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load gromacs/2019.6
 
mpirun gmx_mpi mdrun MDRUNARGUMENTS

GPU job script

The following script uses four thread-MPI ranks, one of which is dedicated to the long-range PME calculation. With the -gputasks 0001 option, the first three ranks offload their short-range non-bonded calculations to the GPU with ID 0, while the fourth (PME) rank offloads its calculations to the GPU with ID 1.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=gpu-a100
#SBATCH --ntasks=72
 
export SLURM_CPU_BIND=none
 
module load gcc/11.3.0 intel/2023.0.0 cuda/11.8
module load gromacs/2023.0_tmpi
 
export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true
 
export OMP_NUM_THREADS=9
 
gmx mdrun -ntomp 9 -ntmpi 4 -nb gpu -pme gpu -npme 1 -gputasks 0001 OTHER MDRUNARGUMENTS

Whole node GPU job script

To set up a whole-node GPU job, use the -gputasks keyword.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=gpu-a100
#SBATCH --ntasks=72
 
export SLURM_CPU_BIND=none
 
module load gcc/11.3.0 intel/2023.0.0 cuda/11.8
module load gromacs/2023.0_tmpi
 
export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true
 
export OMP_NUM_THREADS=9
 
gmx mdrun -ntomp 9 -ntmpi 16 -gputasks 0000111122223333 MDRUNARGUMENTS

Note: The settings for thread-MPI ranks and OpenMP threads are chosen to achieve optimal performance. The number of ranks should be a multiple of the number of sockets, and the number of cores per node should be a multiple of the number of threads per rank.

Gromacs-Plumed

PLUMED is an open-source, community-developed library that provides a wide range of different methods, such as enhanced-sampling algorithms, free-energy methods and tools to analyze the vast amounts of data produced by molecular dynamics (MD) simulations. PLUMED works together with some of the most popular MD engines.

The gromacs/20XX.X-plumed modules are versions that have been patched with PLUMED's modifications; these versions are able to run metadynamics simulations.

Analyzing results

GROMACS Tools

GROMACS contains many tools for analysing your results. They can read trajectories (XTC, TNG or TRR format) as well as coordinate files (GRO, PDB, TPR) and write plots in the XVG format. A list of commands with short descriptions, organised by topic, can be found at the official website.

VMD

VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting. It is free of charge and includes source code.

Python

The Python packages MDAnalysis and MDTraj can read and write GROMACS trajectory and coordinate files, and both offer a variety of commonly used analysis functions. Both packages integrate well with Python's data-science packages like NumPy, SciPy and Pandas, and with plotting libraries such as Matplotlib.

Usage tips

System preparation

Your tpr file (portable binary run input file) contains your initial structure, molecular topology and all of the simulation parameters. Tpr files are portable and can be copied from one computer to another, but you should always use the same version of mdrun and grompp. Mdrun is able to use tpr files that have been created with an older version of grompp, but this can cause unexpected results in your simulation.

Running simulations

Simulations often take longer than the maximum walltime. Running mdrun with the -maxh option tells the program the requested walltime; GROMACS then finishes the simulation gracefully when reaching 99% of that walltime. At this point, mdrun writes a new checkpoint file and properly closes all output files. Using this method, the simulation can easily be restarted from the checkpoint file.

mpirun gmx_mpi mdrun MDRUNARGUMENTS -maxh 24

Restarting simulations

In order to restart a simulation from a checkpoint file, use the same mdrun command as for the original simulation and add -cpi filename.cpt, where filename.cpt is your most recent checkpoint file.

mpirun gmx_mpi mdrun MDRUNARGUMENTS -cpi filename.cpt

More detailed information can be found here.

Performance

GROMACS prints information about statistics and performance at the end of the md.log file, which usually also contains helpful tips to further improve the performance. The performance of the simulation is usually given in ns/day (the number of nanoseconds of MD trajectory simulated per day).

More information about the performance of simulations and how to improve it can be found here.

Special Performance Instructions for Emmy at GWDG

Turbo-boost has been mostly disabled on Emmy at GWDG (partitions medium40, large40, standard96, large96, and huge96) in order to save energy. However, this has a particularly strong performance impact on GROMACS, in the range of 20-40%. Therefore, we recommend submitting GROMACS jobs with turbo-boost enabled by passing the --constraint=turbo_on option to srun or sbatch.
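As a sketch, the whole-node CPU job header from above would then start like this:

#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 7
#SBATCH --tasks-per-node 96
#SBATCH --constraint=turbo_on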

References

  1. GROMACS User-Guide
  2. PLUMED Home

GSL

The GNU Scientific Library (GSL)- a numerical library for C and C++ programmers

Documentation

The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License.

Read more on the GSL home page.

Versions

| Version | Build Date | Installation Path | modulefile | compiler |
|---------|------------|-------------------|------------|----------|
| GSL/2.5 | 05-feb-2019 | /sw/numerics/gsl/2.5 | gsl/gcc.8/2.5 | gcc/8.2.0 |
| GSL/2.5 | 05-feb-2019 | /sw/numerics/gsl/2.5 | gsl/2.5_intel.18 | intel/18.0.5 |
| GSL/2.5 | 05-may-2019 | /sw/numerics/gsl/2.5/ | gsl/2.5_intel.19 | intel/19.0.3 |
| GSL/2.7 | 17-jun-2021 | /sw/numerics/gsl/*/2.7/ | gsl/gcc.8/2.7 | gcc/8.3.0 |

In addition, libgslcblas and the FORTRAN interface FGSL (1.2.0) are provided. Both the shared and the static versions are available.

For a manual consult the online manual or download the pdf. For the FORTRAN interface see FGSL wikibook.

Modules and usage of GSL at HLRN

The library is included in several software packages. A module file gives access to a few binaries, gsl-config, gsl-histogram, gsl-randist.

module load gsl/version

sets environmental variables:

  • LD_RUN_PATH
  • PATH
  • MANPATH
  • INFOPATH
  • PKG_CONFIG_PATH

module help gsl/version

delivers the path to the libraries. Sophisticated tools and programmers use gsl-config or pkg-config.
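For example, a minimal sketch of compiling and linking a C program against GSL (module and compiler names taken from the table above; my_gsl_prog.c is a placeholder):

module load gcc/8.2.0 gsl/gcc.8/2.5
gcc -c $(gsl-config --cflags) my_gsl_prog.c
gcc -o my_gsl_prog my_gsl_prog.o $(gsl-config --libs) -Wl,-rpath=$LD_RUN_PATH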

Installation at HLRN

GSL is built from source. The current version is built with several compilers, using high-end optimisation. All libraries passed the basic checks. With high-end optimisation and the Intel compilers the linear algebra solvers do not converge, therefore fp-model strict is used. read more


Subsections of GSL

Configure and make GSL

parentdir="$(dirname "$(pwd)")"
export PREFIX=$parentdir
echo "builing for "$PREFIX

module load intel/compiler/64/2019/19.0.1 
module load intel/tbb/64/2019/1.144  
module load intel/mkl/64/2019/1.144

export FC=ifort
export CC=icc
export CXX=icpc
export LD_RUN_PATH=$LD_LIBRARY_PATH

#export CFLAGS="-fPIC -O3 -Wl,-rpath=$LD_RUN_PATH"
export CFLAGS="-fPIC -O3 -fp-model strict -Wl,-rpath=$LD_RUN_PATH"
./configure --prefix=$PREFIX

echo "Press ENTER to compile"; read ttt
make -j4
echo "Press ENTER to check"; read ttt
make check
echo "Press ENTER to install"; read ttt
make install
echo "Press ENTER to clean "; read ttt
make clean

HDF5 Libraries and Binaries

HDF5 - hierarchical data format

Documentation

HDF5 is a data model, library, and file format for storing and managing data. It can represent very complex data objects and a wide variety of metadata. For documentation, visit the HDF Group support portal.

Installed versions

We cannot support all combinations of hdf5, compiler and MPI. If none of the installed versions works, please contact support for installation of missing versions.

Serial HDF-5

| Version | Compiler | Module | API |
|---------|----------|--------|-----|
| 1.10.5 | intel/18.0.6 | hdf5/intel/1.10.5 | v110 |
| 1.10.5 | gcc/7.5.0 | hdf5/gcc.7/1.10.5 | v110 |
| 1.10.5 | gcc/8.3.0 | hdf5/gcc.8/1.10.5 | v110 |
| 1.10.6 | gcc/8.3.0 | hdf5/gcc.8/1.10.6 | v110 |
| 1.10.7 | gcc/9.3.0 | hdf5/gcc.9/1.10.7 | v110 |
| 1.12.1 | intel/19.0.5 | hdf5/intel/1.12.1 | v112 |
| 1.12.2 | gcc/8.3.0 | hdf5/gcc.8/1.12.2 | v112 |
| 1.12.2 | gcc/9.3.0 | hdf5/gcc.9/1.12.2 | v112 |
| 1.12.2 | intel/2022.2 | hdf5/intel/1.12.2 | v112 |

Parallel HDF-5

| Version | Compiler, MPI | Module | API |
|---------|---------------|--------|-----|
| 1.10.5 | intel/18.0.6, impi/2018.5 | hdf5-parallel/impi/intel/1.10.5 | v110 |
| 1.12.0 | intel/19.1.2, impi/2019.8 | hdf5-parallel/impi/intel/1.12.0 | v112 |
| 1.10.6 | intel/18.0.6, openmpi/intel/3.1.6 | hdf5-parallel/ompi/intel/1.10.6 | v110 |
| 1.10.5 | gcc/8.3.0, openmpi/gcc.9/3.1.5 | hdf5-parallel/ompi/gcc.8/1.10.5 | v110 |
| 1.10.6 | gcc/9.2.0, openmpi/gcc.9/3.1.5 | hdf5-parallel/ompi/gcc.9/1.10.6 | v110 |
| 1.12.1 | gcc/8.3.0, openmpi/gcc.9/3.1.5 | hdf5-parallel/ompi/gcc.8/1.12.0 | v112 |

The libraries are threadsafe and can be used in omp-parallel codes. To see configure-details on the installed libraries, load a hdf5-module and issue cat $HDF5DIR/libhdf5.settings!

Modules and Prerequisites

  • module avail hdf5 shows all available hdf5 versions
  • module show <modulename> shows environment variables set when loading the module
  • module help <modulename> shows some details on compilers and MPI

Loading a module adds the path to some hdf5-binaries to $PATH. For convenience, some other environmental variables are extended or defined, which satisfies the needs of several software packages for linking the hdf5-libraries:

  • PATH, LD_RUN_PATH
  • C_INCLUDE_PATH, CPLUS_INCLUDE_PATH
  • HDF5DIR, HDF5INCLUDE, HDF5LIB
  • HDF5_HOME, HDF5_ROOT

The module files do not require any prerequisites. All dynamically linked libraries are found, since RPATH is defined in libraries and binaries. For an example, consider

  • readelf -a $HDF5DIR/libhdf5_hl.so | grep RPATH

Linking the hdf5-libraries in user programs

Loading an hdf5 module does not automatically link the libraries when compiling other software. hdf5 does not provide a standard way to detect the path to the libraries, such as pkg-config, so you have to configure Makefiles or install scripts by hand. The environment variables exported by the modules allow a short and flexible notation.

hdf5 modules do not have any effect during the runtime of your binaries. If the hdf5 libraries are linked dynamically, it is recommended to set RPATH in the binary; otherwise the hdf5 libraries (and others) will not be found at runtime. This can be done with the compiler option -Wl,-rpath=$LD_RUN_PATH, which passes the rpath to the linker.
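As a minimal sketch (module name taken from the table above; my_h5prog.c is a placeholder), a serial C program could be compiled and linked using the environment variables listed above:

module load gcc/9.3.0 hdf5/gcc.9/1.10.7
gcc -c -I$HDF5INCLUDE my_h5prog.c
gcc -o my_h5prog my_h5prog.o -L$HDF5LIB -lhdf5_hl -lhdf5 -Wl,-rpath=$LD_RUN_PATH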

hdf5 compiler wrapper

The compiler wrappers h5cc and h5fc (parallel: h5pcc and h5pfc) can be used to build programs that link the hdf5 libraries. These wrappers set the rpath for all hdf5 libraries that are linked dynamically.
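For example (source file names are placeholders):

h5cc -o my_h5prog my_h5prog.c                # serial C program
h5pfc -o my_h5prog_par my_h5prog_par.f90     # parallel Fortran program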

Installing the hdf5-libraries at HLRN

Read more

Subsections of HDF5 Libraries and Binaries

HDF5 Installation

This page describes how it was done, not how it should be done.

Remark

HDF5 is stored in the GPFS-based sw area, but usually operates on files in the Lustre-based scratch directories. The two file system types may have different models for file locking, which is fundamental for parallel usage of HDF5. Hence, it is strongly recommended to build HDF5 on the file system where it will be used. All versions of HDF5 at HLRN are built on a Lustre file system, but are stored in the sw area.

In the HDF5 forum the reasons for failures of the parallel tests are discussed. For Open MPI it is claimed that the MPI standard is not fully supported. It seems that the errors in the parallel tests are gone with Open MPI 5; however, only a prerelease of that Open MPI version is available.

Prerequisites

  • the szip library. In recent versions the szip library is replaced by the aec library, which emulates szip.
  • for parallel builds intel-MPI or openMPI.
  • all other ingredients (zlib) are part of the system.

Install libaec

  • enable autotools/configure in order to avoid cmake (read more)
  • install libaec into the same path where HDF5 will reside (script). The preinstalled libaec can also be used.

configure flags for version 1.10.

  • --with-pic
  • --enable-production
  • --enable-unsupported --enable-threadsafe
  • --enable-fortran --enable-fortran2003 --enable-cxx

configure flags for version 1.12.

  • --with-pic
  • --enable-production --enable-optimization=high
  • --enable-direct-vfd --enable-preadwrite
  • --enable-unsupported --enable-threadsafe
  • --enable-fortran --enable-fortran2003 --enable-cxx
  • --enable-file-locking --enable-recursive-rw-locks

Script for installing parallel HDF-5 1.12.2
Script for installing parallel HDF-5 1.12.2

Subsections of HDF5 Installation

Enable autotools

After unpacking the archive file, go to the new directory and issue the commands:

libtoolize --force
aclocal
autoheader
automake --force-missing --add-missing
autoconf

read more

HDF5 1.12.2 configuration

Features:

                     Parallel HDF5: yes
  Parallel Filtered Dataset Writes: yes
                Large Parallel I/O: yes
                High-level library: yes
Dimension scales w/ new references: no
                  Build HDF5 Tests: yes
                  Build HDF5 Tools: yes
                      Threadsafety: yes
               Default API mapping: v112
    With deprecated public symbols: yes
            I/O filters (external): deflate(zlib),szip(encoder)
                               MPE: 
                     Map (H5M) API: no
                        Direct VFD: yes
                        Mirror VFD: yes
                (Read-Only) S3 VFD: no
              (Read-Only) HDFS VFD: no
                           dmalloc: no
    Packages w/ extra debug output: none
                       API tracing: no
              Using memory checker: no
   Memory allocation sanity checks: no
            Function stack tracing: no
                  Use file locking: yes
         Strict file format checks: no
      Optimization instrumentation: no
      
Press ENTER to run make

HDF5 1.12.2 parallel (impi) installation

export VER=1.12.2
rm -r hdf5-$VER
zcat hdf5-$VER.tar.gz | tar -xvf -

echo "unpacked hdf5-"$VER
echo "Press ENTER to continue";read aaa

module load intel/2022.2
module load impi/2021.6

export COMPILER=intel.22
export CC=mpiicc
export CXX=mpiicpc
export F77=mpiifort
export FC=mpiifort

export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -xskylake-avx512 -mtune=skylake-avx512 -mcmodel=medium -fargument-noalias-global -fp-model precise -shared-inte
l"
export COMPOPT="-fPIC -O2"

export SLURM_CPU_BIND=none   
export I_MPI_HYDRA_TOPOLIB=ipl
export I_MPI_HYDRA_BRANCH_COUNT=-1
export I_MPI_EXTRA_FILESYSTEM=1
export I_MPI_EXTRA_FILESYSTEM_FORCE=lustre

export BUILDDIR=hdf5-$VER
cd $BUILDDIR

parentdir="$(dirname "$(pwd)")"
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/hdf5-parallel/impi.21/$COMPILER/$VER/skl
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa
export SZIP=$PREFIX
export LD_RUN_PATH=$PREFIX/lib:$LIBRARY_PATH

export CFLAGS="  $COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -assume nostd_value -align array64byte -Wl,-rpath=$LD_RUN_PATH"
#export LDLAGS="-O3 -fPIC"

#Set to make sure that file locking is really switched off
export HDF5_USE_FILE_LOCKING=FALSE

../hdf5-$VER/configure --prefix=$PREFIX --with-szlib=$SZIP --with-pic \
         --enable-build-mode=production \
         --enable-parallel \
         --enable-direct-vfd \
         --disable-file-locking
#         --enable-file-locking --enable-recursive-rw-locks
#         --enable-direct-vfd --enable-mirror-vfd \
#          --enable-optimization=high \
#         --enable-threadsafe --enable-unsupported \

echo "Press ENTER to run make";read aaa
make -j8 | tee comp.out
echo "Press ENTER to run make check";read aaa
make -i check | tee check.out 2>&1  
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

HDF5 1.12.2 parallel (ompi) installation

module load intel/2022.2
module load openmpi/intel/4.1.4

export VER=1.12.2

export COMPILER=intel.22
export CC=mpicc
export CXX=mpicxx
export F77=mpifort
export FC=mpifort
export SLURM_CPU_BIND=none   
#export I_MPI_HYDRA_TOPOLIB=ipl
#export I_MPI_HYDRA_BRANCH_COUNT=-1
#export I_MPI_FABRICS=ofi
#export I_MPI_SHM_CELL_BWD_SIZE=2048000
 
export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -xskylake-avx512 -mtune=skylake-avx512 -mcmodel=medium -fargument-noalias-global -fp-model precise -shared-intel"

export BUILDDIR=build_hdf5_ompi_$COMPILER
mkdir $BUILDDIR
cd $BUILDDIR

parentdir="$(dirname "$(pwd)")"
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/hdf5-parallel/ompi/$COMPILER/$VER/skl
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa
export SZIP=$PREFIX
export LD_RUN_PATH=$PREFIX/lib:$LIBRARY_PATH

export CFLAGS="  $COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -align -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -align array64byte -Wl,-rpath=$LD_RUN_PATH"
#export LDLAGS="-O3 -fPIC"

#Set to true to make sure that it is really switched off
export HDF5_USE_FILE_LOCKING="FALSE"

../hdf5-$VER/configure --prefix=$PREFIX --with-szlib=$SZIP --with-pic \
         --enable-build-mode=production \
         --enable-optimization=high \
         --enable-parallel \
         --enable-threadsafe --enable-unsupported \
         --enable-direct-vfd --enable-mirror-vfd \
         --enable-fortran --enable-cxx \
         --disable-file-locking

#         --enable-file-locking --enable-recursive-rw-locks \

echo "Press ENTER to run make";read aaa
make -j8 | tee comp.out
echo "Press ENTER to run make check";read aaa
make check | tee check.out 2>&1  
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

Libaec Installation

This script intends to install libaec together with HDF5. For a separate installation read more here.

Put this script in the directory, where the libaec-tar file is unpacked.

Adjust the path, (PREFIX), where to install the library.

Adjust compiler and optimisation flags.

module load intel/2022.2
module load openmpi/intel/4.1.4

export COMPILER=intel.22
export CC=mpicc
export CXX=mpicxx
export F77=mpifort
export FC=mpifort
export COMPOPT="-fPIC -O3 -qopt-zmm-usage=high -march=skylake-avx512 -mcmodel=medium -fargument-noalias-global -align -fp-model precise -shared-intel"

#export I_MPI_HYDRA_TOPOLIB=ipl
#export I_MPI_HYDRA_BRANCH_COUNT=-1

export BUILDDIR=build_aec_$COMPILER
mkdir $BUILDDIR
cd $BUILDDIR

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=/sw/dataformats/hdf5-parallel/ompi/$COMPILER/1.12.2/skl
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export LD_RUN_PATH=$LIBRARY_PATH
export CFLAGS="  $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="$COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export FCFLAGS=" $COMPOPT -Wl,-rpath=$LD_RUN_PATH"
export LDLAGS="-O3 -fPIC"

../libaec-v1.0.6/configure --prefix=$PREFIX 

echo "Press ENTER to run make";read aaa
make -j2
echo "Press ENTER to run make check";read aaa
make check > check.out 2>&1 
echo "Press ENTER to run make install";read aaa
make install
echo "Do not forget to run make clean!"

How to bring your own license

For some preinstalled software products HLRN offers a limited number of licenses only. In case you do not want to queue for a certain license you can bring your own. This requires the following steps:

  1. Your license server must be accessible from 130.73.234.140 (Berlin) or 134.76.1.14 (Göttingen). You may need to ask your local admin to allow external access.
  2. Send a mail to nhr-support@gwdg.de including the name (FQDN), IP and ports of your license server. Please configure the ports of your license server statically (see remark). Please let us know if you will use Berlin, Göttingen or both systems. In case your license server is inside a subnet, we also need to know the name/IP of its access point.
  3. We will set up IP forwarding rules, such that your job script can call your license server from our compute and post-processing nodes.

Remark

FLEXlm license servers usually use two TCP ports (“SERVER” and “VENDOR”), and both of them need to be configured in the HLRN license gateway. By default, the VENDOR port is dynamic (i.e. chosen randomly) and may change on restart of the license server. To remain accessible from the HLRN, both SERVER and VENDOR ports need to be statically configured (VENDOR xyz port=1234) in the license server. See also http://www.hlynes.com/2010/01/21/flexlm-license-servers-and-firewalls/

libcurl

curl - a tool for transferring data from or to a server

General Information

The curl library provides basic communication with other servers and clients and is usually part of the system. The curl shipped with the HLRN system does not meet the requirements of some software, therefore a separate curl library is provided.

Read more on the vendor pages.

| Version | Compiler | Build date | Installation Path | modulefile |
|---------|----------|------------|-------------------|------------|
| 8.2.0 | gcc-9 | 25-jul-2023 | /sw/libraries/curl/ | curl/gcc.9/8.2.0 |
| 8.2.0 | intel-22 | 28-jul-2023 | | curl/intel.22/8.2.0 |

For some more information consult the man-page.

Usage at HLRN

Load the modulefile

$ module load curl

This provides access to the script curl-config, which can be used to explore details of the library installation. The binary curl also becomes available. The environment variable LD_RUN_PATH is set to make linking with rpath easy.
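A minimal linking sketch (my_curl_prog.c is a placeholder):

module load curl
gcc -o my_curl_prog my_curl_prog.c $(curl-config --cflags) $(curl-config --libs) -Wl,-rpath=$LD_RUN_PATH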

Building curl

Installation includes

  • download from github
  • run make and make install - see run_configure in the installation path.

Libtiff

A software package containing a library for reading and writing the Tag Image File Format (TIFF), and a small collection of tools for simple manipulations of TIFF images

Description

This software provides support for the Tag Image File Format (TIFF), a widely used format for storing image data.

Read more on libtiff home page. For documentation visit the libtiff documentation page.

Modules

| Version | Installation Path | modulefile | compiler |
|---------|-------------------|------------|----------|
| 4.0.10 | /sw/libraries/libtiff/4.0.10/skl/gcc-8.2 | libtiff/4.0.10 | gcc.8.2-openmpi.3.1.2 |

libz

A Massively Spiffy Yet Delicately Unobtrusive Compression Library

General Information

The zlib shipped with the HLRN system does not meet the requirements of some software, therefore a separate zlib library is provided.

Read more on the vendor pages.

| Version | Compiler | Build date | Installation Path | modulefile |
|---------|----------|------------|-------------------|------------|
| 1.2.11 | gcc-8 | 22-jan-2020 | /sw/tools/zlib/ | zlib/gcc.8/1.2.11 |
| 1.2.13 | gcc-9 | 28-jul-2023 | /sw/tools/zlib/ | zlib/gcc.8/1.2.13 |
| 1.2.13 | intel-22 | 28-jul-2023 | /sw/tools/zlib/ | zlib/intel.22/1.2.13 |

For some more information consult the man-page.

Usage at HLRN

Load the modulefile

$ module load zlib

This provides access to the script pkg-config that can be used to explore details of the library installation. The environment variable LD_RUN_PATH is set to make linking with rpath easy.
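A minimal sketch, assuming the module extends PKG_CONFIG_PATH so that pkg-config finds this zlib (my_z_prog.c is a placeholder):

module load zlib
pkg-config --cflags --libs zlib
gcc -o my_z_prog my_z_prog.c $(pkg-config --cflags --libs zlib) -Wl,-rpath=$LD_RUN_PATH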

Building zlib

Installation includes

  • download from web page
  • load a compiler module file
  • run configure. Use --prefix=<install_path>.
  • run make, make test and make install.

Miscellaneous

  • libcurl — curl - a tool for transferring data from or to a server
  • libz — A Massively Spiffy Yet Delicately Unobtrusive Compression Library
  • nocache — nocache - minimize caching effects in lustre filesystems
  • texlive – LaTeX distribution, typesetting system
  • git – A fast, scalable, distributed revision control system

Subsections of Miscellaneous

libcurl

curl - a tool for transferring data from or to a server

General Information

The curl library provides basic communication with other servers and clients and is usually part of the system. The curl shipped with the HLRN system does not meet the requirements of some software, therefore a separate curl library is provided.

Read more on the vendor pages.

| Version | Compiler | Build date | Installation Path | modulefile |
|---------|----------|------------|-------------------|------------|
| 8.2.0 | gcc-9 | 25-jul-2023 | /sw/libraries/curl/ | curl/gcc.9/8.2.0 |
| 8.2.0 | intel-22 | 28-jul-2023 | | curl/intel.22/8.2.0 |

For some more information consult the man-page.

Usage at HLRN

Load the modulefile

$ module load curl

This provides access to the script curl-config, which can be used to explore details of the library installation. The binary curl also becomes available. The environment variable LD_RUN_PATH is set to make linking with rpath easy.

Building curl

Installation includes

  • download from github
  • run make and make install - see run_configure in the installation path.

libz

A Massively Spiffy Yet Delicately Unobtrusive Compression Library

General Information

The zlib shipped with the HLRN system does not meet the requirements of some software, therefore a separate zlib library is provided.

Read more on the vendor pages.

| Version | Compiler | Build date | Installation Path | modulefile |
|---------|----------|------------|-------------------|------------|
| 1.2.11 | gcc-8 | 22-jan-2020 | /sw/tools/zlib/ | zlib/gcc.8/1.2.11 |
| 1.2.13 | gcc-9 | 28-jul-2023 | /sw/tools/zlib/ | zlib/gcc.8/1.2.13 |
| 1.2.13 | intel-22 | 28-jul-2023 | /sw/tools/zlib/ | zlib/intel.22/1.2.13 |

For some more information consult the man-page.

Usage at HLRN

Load the modulefile

$ module load zlib

This provides access to the script pkg-config that can be used to explore details of the library installation. The environment variable LD_RUN_PATH is set to make linking with rpath easy.

Building zlib

Installation includes

  • download from web page
  • load a compiler module file
  • run configure. Use --prefix=<install_path>.
  • run make, make test and make install.

nocache

nocache - minimize caching effects in lustre filesystems

General Information

The nocache tool tries to minimize the effect an application has on the Linux file system cache. This is done by intercepting the open and close system calls and calling posix_fadvise with the POSIX_FADV_DONTNEED parameter. Because the library remembers which pages (i.e., 4K blocks of the file) were already in the file system cache when the file was opened, these will not be marked as "don't need", because other applications might still need them even though they are not actively used (think: hot standby).

Use case: backup processes that should not interfere with the present state of the cache.

Use case: staging of large amount of data in a lustre file system before a parallel job

Read more on github

| Version | Compiler | Build date | Installation Path | modulefile |
|---------|----------|------------|-------------------|------------|
| 1.1 | gcc | 19-aug-2019 | /sw/tools/nocache/1.1 | nocache/1.1 |

For some more information consult the man-page.

Usage at HLRN

nocache has been found to resolve the Lustre issue where temporarily invalid files are produced when staging huge amounts of data before a parallel job step.

Load the modulefile

$ module load nocache

This provides access to the script nocache and the binaries cachedel and cachestats. The corresponding man - pages become available.

Prepend nocache before file copy operations

$ nocache cp <source> <target> 

Building nocache

Installation includes

  • download from github
  • run make and make install - see run_make in the installation path.

License conditions

See https://github.com/Feh/nocache/blob/master/COPYING

MUMPS

MUltifrontal Massively Parallel sparse direct Solver.

Description

MUMPS is a numerical software package for solving sparse systems of linear equations with many features.

Read more on MUMPS home page. For manual and user-guide visit the MUMPS User-guide page.

Modules

| Version | Installation Path | modulefile | compiler | comment |
|---------|-------------------|------------|----------|---------|
| 5.2.1 | /sw/libraries/mumps/5.2.1/skl | mumps/5.2.1 | openmpi.3.1.2-gcc.8.2.0 | |
| 5.2.1 | /sw/libraries/mumps/5.2.1/skl | mumps/5.2.1 | openmpi.3.1.5-gcc.9.2.0 | |

NAMD

Description

NAMD is a parallel, object-oriented molecular dynamics code designed for high-performance simulations of large biomolecular systems using force fields. The code was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign.

NAMD current documentation and other material can be found on the NAMD website.

Prerequisites

NAMD is distributed free of charge for non-commercial purposes only. Users need to agree to the NAMD license. This includes proper citation of the code in publications.

Only members of the namd user group have access to NAMD executables provided by HLRN. To have their user ID included in this group, users can send a message to their consultant or to NHR support.

Modules

The environment modules shown in the table below are available to include NAMD executables in the directory search path. To see what is installed and what is the current default version of NAMD at HLRN, a corresponding overview can be obtained by saying module avail namd.

NAMD is a parallel application. It is recommended to use mpirun as the job starter for NAMD at HLRN. An MPI module providing the mpirun command needs to be loaded ahead of the NAMD module.

| NAMD version | NAMD modulefile | NAMD requirements |
|--------------|-----------------|-------------------|
| 2.13 | namd/2.13 | impi/* (any version) |

File I/O Considerations

During run time only few files are involved in NAMD’s I/O activities. As long as standard MD runs are carried out, this is unlikely to impose stress on the Lustre file system ($WORK) as long as one condition is met. Namely, file metadata operations (file stat, create, open, close, rename) should not occur at too short time intervals. First and foremost, this applies to the management of NAMD restart files. Instead of having a new set of restart files created several times per second, the NAMD input parameter restartfreq should be chosen such that they are written only every 5 minutes or in even longer intervals. For the case of NAMD replica-exchange runs the situation can be more severe. Here we already observed jobs where heavy metadata file I/O on the individual “colvars.state” files located in every replica’s subdirectory has overloaded our Lustre metadata servers resulting in a severe slowdown of the entire Lustre file system. Users are advised to set corresponding NAMD input parameters such that each replica performs metadata I/O on these files in intervals not shorter than really needed, or, where affordable, that these files are written only at the end of the run.
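As a rough, purely illustrative sketch (the values are hypothetical and must be adapted to your own system and to the ns/day your job actually achieves), the relevant part of a NAMD input file could look like this:

# illustrative values only; choose restartfreq so that restart files
# are written at most every few minutes of walltime
timestep        2.0
restartfreq     50000
outputEnergies  5000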

Job Script Examples

  1. For Intel Skylake compute nodes (Göttingen only) – simple case of a NAMD job using a total of 200 CPU cores distributed over 5 nodes running 40 tasks each
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p medium40
#SBATCH -N 5
#SBATCH --tasks-per-node 40
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load namd/2.13
 
mpirun namd2 inputfile > outputfile
  2. For Intel Cascade Lake compute nodes – simple case of a NAMD job using a total of 960 CPU cores distributed over 10 nodes running 96 tasks each
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 10
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load namd/2.13
 
mpirun namd2 inputfile > outputfile
  3. A set of input files for a small and short replica-exchange simulation is included with the NAMD installation. A description can be found in the NAMD User’s Guide. The following job script executes this replica-exchange simulation on 2 nodes using 8 replicas (24 tasks per replica)
#!/bin/bash
#SBATCH -t 0:20:00
#SBATCH -p standard96
#SBATCH -N 2
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load namd/2.13
 
cp -r /sw/chem/namd/2.13/skl/lib/replica .
cd replica/example/
mkdir output
(cd output; mkdir 0 1 2 3 4 5 6 7)
 
mpirun namd2 +replicas 8 job0.conf +stdout output/%d/job0.%d.log

NCO

The NetCdf Operators

Description

The netCDF Operators, or NCO, are a suite of file operators which facilitate manipulation and analysis of self-describing data stored in the (freely available) netCDF and HDF formats.

Vendor: NCO
Installation Path: /sw/dataformats/nco/

| Version | Build date |
|---------|------------|
| 4.7.7 | 2018 |
| 4.9.1 (default) | April 2019 |
| 5.1.7 | July 2023 |
  • Man pages are available for the binaries (see below).
  • Info pages can be viewed for the keyword nco. Issue from the command line info nco .
  • Manuals in several formats can be loaded from the NCO home page.

Binaries

The following binaries are delivered: ncap2 ncatted ncbo ncclimo ncdiff ncea ncecat nces ncflint ncks ncpdq ncra ncrcat ncremap ncrename ncwa

Usage of the NCO tools at HLRN

Module-files and environment variables

To activate the package, issue

module load nco

from the command line.
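Typical calls after loading the module (file names are placeholders):

ncks -M data.nc                          # print the global metadata of data.nc
ncra year_2001.nc year_2002.nc mean.nc   # average the records of the input files into mean.nc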

Installation hints

4.9.1

  • built with intel/19.0.3; due to a compiler error, full optimisation does not work
  • prerequisites: hdf5, netcdf, antlr, gsl, udunits2
  • passed most checks, except expected failures. The OMP-related checks did not work, possibly due to a missing OMP environment. Since nco is mostly I/O-limited, this should not matter. OMP capabilities are not tested yet.
  • install script

netCDF

Network Common Data Form

Documentation

NetCDF is a suite of libraries for system- and machine-independent generation, access and exchange of array-oriented scientific data. The NetCDF libraries contain interfaces for C, FORTRAN77, FORTRAN90 and C++. The libraries come with some binaries to access and reformat netcdf-formatted data.

Visit the Unidata netCDF web page for detailed documentation.

Versions

The NetCDF library is available for several compilers. It is linked dynamically to the threadsafe hdf5 and szip libraries built with the same compiler.

| Version | compiler | hdf5 | Serial / Parallel | Remarks |
|---------|----------|------|-------------------|---------|
| 4.7.3 | gcc/4.5.8 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | gcc/7 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | gcc/8.3.0 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | intel/18.0.6 | 1.10.5 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.7.3 | intel/18.0.6, impi/2018.5 | 1.10.5 (impi) | parallel | fortran-4.5.2, cxx-4.3.1 |
| 4.7.4 | gcc/8.3.0 | 1.12.0 | serial | fortran-4.5.3, cxx-4.3.1 |
| 4.7.4 | gcc/9.2.0, ompi/3.1.5 | 1.10.6 (ompi) | parallel | fortran-4.5.3, cxx-4.3.1 |
| 4.7.4 | intel/18.0.6, ompi/3.1.6 | 1.10.6 (ompi) | parallel | fortran-4.5.3, cxx-4.3.1 |
| 4.7.4 | intel/19.0.5 | 1.12.0 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.8.1 | intel/19.0.5 | 1.12.1 | serial | fortran-4.5.2, cxx-4.3.1 |
| 4.9.1 | intel/2022.2 | 1.12.2 | serial | fortran-4.6.0, cxx-4.3.1 |
| 4.9.1 | intel/2022.2, ompi/4.1.4 | 1.12.2 (ompi) | parallel | fortran-4.6.0, cxx-4.3.1 |
| 4.9.2 | gcc/9.3.0 | 1.12.2 | serial | fortran-4.6.1, cxx-4.3.1 |

Since the Intel compilers on Lise and Emmy differ, the installation of netCDF built with Intel compilers would have to be done twice. Hence, the installation is done on one complex only. Please contact support if this does not meet your requirements.

Modulefiles and environmental variables

Loading a NetCDF modulefile extends PATH with the path to the NetCDF binaries. This includes nc-config and nf-config, which can be used to obtain the paths to include files and libraries in compiler scripts and makefiles. Use

  • module show netcdf/<version>
  • module help netcdf/<version>

to investigate details on exported environment variables.

LD_RUN_PATH is extended when a netcdf module is loaded. It can be used to define the rpath in the binaries.

The netcdf-modules are not needed to run programs linked to a netcdf library!

Example to compile smp-programs using the netcdf library

Here we demonstrate the usage of environmental variables to find the netcdf - include files and to link the netcdf libraries.

Read more

Installing netcdf at HLRN

We discuss the installation of serial and parallel netcdf at HLRN

Read more

Subsections of netCDF

Install netCDF

Prerequisites

  • the installed HDF5 library. NetCDF inherits its parallelisation from the HDF5 library.

General structure

The netcdf package consists of a C, FORTRAN and C++ part. The FORTRAN and C++ libraries are wrappers to call the C based library.

Basic installation procedure

  • define the compiler environment
  • define compiler flags for optimisation
  • define the path to dependency libraries
  • run configure to set up the Makefile. The configure script needs several flags.
  • run make
  • run make checks
  • run make install
  • clean up.

The installation procedure is carried out within a Lustre file system. This is mandatory for HDF5, since the file system properties are checked during configure. For netCDF it may not be necessary, but not all details of configure are known.

To be continued …

Linking with the netcdf library

Problem

After linking a program with a netcdf library some users report error messages like this:

./nctest: error while loading shared libraries: libnetcdff.so.7: cannot open shared object file: No such file or directory

One may find this error message by searching the web, so one is in good company. Here we present methods to use the tools available at HLRN to produce running binaries. First of all, an important statement:

Neither the netcdf nor the hdf5 module have any effect on your binary during runtime. It is useless to load them before running a program.

The following example program can be used for testing only. Don’t ask for a sophisticated science background. The example uses FORTRAN-90 notation.

download the example

To compile the code, the netcdf data- and function types must be included by a use statement. The compiler needs to know, where the fortran-modules are located. Instead of digging into installation details, we use the result of the script nf-config, which is part of the netcdf-suite. To make it available, load a netcdf module:

module load gcc/8.3.0
module load netcdf/gcc.8/4.7.3

Note, compiler version and netcdf version must fit, otherwise the fortran netcdf module cannot be included. We can check this with

nf-config --fc

fortran .mod files from different major gfortran versions are incompatible.

Dynamic linking with the fortran netcdf library

Serial programs

Now you can compile

gfortran  -c -I`nf-config --includedir` test.f90

Instead of digging even more into the installation details, we use the result of nf-config in the link step:

gfortran -o nctest *.o `nf-config --flibs` -lm

To understand the details, just issue the commands nf-config --includedir and nf-config --flibs separately.

Now you may unload both the compiler and the netcdf module. The binary runs anyway, since the path to the netcdf and hdf5 libraries is stored in the binary. This information was extracted from the environment variable LD_RUN_PATH, which is set when loading a netcdf module. Note that the path to the hdf5 libraries linked to the netcdf library is inherited by our binary. The effect of LD_RUN_PATH is broken if the linker option rpath is used for other purposes. Say your example program links another library located in /mybla. Check out:

gfortran -o nctest -Wl,-rpath=/mybla *.o `nf-config --flibs` -lm  
ldd nctest

The netcdf libraries as well as the compiler-specific libraries are not found any more. Here we notice that the compiler module also uses LD_RUN_PATH to set the path to the compiler-specific libraries. So handle LD_RUN_PATH with care in order not to disturb this important feature! The correct way would be:

gfortran -o nctest -Wl,-rpath="/mybla:$LD_RUN_PATH" *.o `nf-config --flibs` -lm  

Hint: you may use readelf to see more details on the internal binary structure. Watch out for the library sections.
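For example, to look only at the library-related entries of the dynamic section:

readelf -d nctest | grep -E 'NEEDED|RUNPATH'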

MPI programs

Now we need to use the compiler wrappers to include mpi.h and link the MPI libraries automatically. Here, rpath is used internally, which breaks the automatic propagation of the paths where the netcdf library resides. This also concerns the location of compiler-specific libraries like libgcc_s.so.1 or libquadmath.so.0. Problems may arise from this, since the libquadmath.so.0 coming with gcc.9 contains more symbols than the system version. (This allows usage of the netcdf library built with gcc.8 together with gcc.9.) Hence, we have to use the rpath option to provide the information on the library path and propagate it to the linker. Load the module files:

module load gcc/9.2.0
module load netcdf/gcc.8/4.7.3
module load openmpi/gcc.9/3.1.5

and check LD_RUN_PATH:

echo $LD_RUN_PATH
/sw/comm/openmpi/3.1.5/skl/gcc/lib:/sw/dataformats/netcdf/gcc.8.3.0/4.7.3/skl/lib:/sw/compiler/gcc/9.2.0/skl/lib64/

Compile:

mpifort  -c -I`nf-config --includedir` test.f90
mpifort -o nctest *.o `nf-config --flibs` -Wl,-rpath=$LD_RUN_PATH -lm

To check the true content of the binary, unload the modules first

module unload openmpi/gcc.9/3.1.5
module unload gcc/9.2.0
module unload netcdf/gcc.8/4.7.3
ldd nctest

With readelf -a more details can be explored:

readelf -a nctest | less

….

Dynamic section at offset 0x2d58 contains 37 entries:

  Tag        Type                         Name/Value
0x0000000000000001 (NEEDED)             Shared library: [libnetcdff.so.7]
0x0000000000000001 (NEEDED)             Shared library: [libnetcdf.so.15]
0x0000000000000001 (NEEDED)             Shared library: [libgfortran.so.5]
0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED)             Shared library: [libmpi_usempif08.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libmpi_usempi_ignore_tkr.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libmpi_mpifh.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libmpi.so.40]
0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED)             Shared library: [libquadmath.so.0]
0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
0x000000000000001d (RUNPATH)            Library runpath: [/sw/comm/openmpi/3.1.5/skl/gcc/lib:/sw/dataformats/netcdf/gcc.8.3.0/4.7.3/skl/lib:/sw/compiler/gcc/9.2.0/skl/lib64/:/sw/tools/hwloc/1.1.13/skl/lib]
... 

Static linking with the fortran netcdf library

There is no simple way to find all required libraries, but information on the libraries can be gathered from the file libnetcdff.la made by libtool for dynamic linking.

NETCDF_LIB=`nc-config --libdir`
cat $NETCDF_LIB/libnetcdff.la | grep dependency_libs 

We configure the paths to the relevant libraries by hand:

NETCDF_LIB=`nc-config --libdir`
HDF5_LIB=/sw/dataformats/hdf5/1.8.20/skl/gcc.8.2.0.hlrn/lib
SZIP_LIB=/sw/dataformats/szip/2.1/skl/gcc.8.2.0.hlrn/lib

LIBS="$NETCDF_LIB/libnetcdff.a $NETCDF_LIB/libnetcdf.a $HDF5_LIB/libhdf5_hl.a  $HDF5_LIB/libhdf5.a $SZIP_LIB/libsz.a -lpthread -lcurl -lz -lm -ldl"

Now compile and link

gfortran -fPIC -c -I`nf-config --includedir` -Bstatic test.f90
gfortran -o nctest *.o $LIBS

and have fun.

Subsections of Linking with the netcdf library

Example netcdf program

program test_netcdf
! -----------------------------------------------------------------------
! The drunken divers testcase
! some lines are stolen from unidata - netcdf testprograms
! -----------------------------------------------------------------------

  use netcdf
  
  implicit none
  
  character*256     :: outputfile='divers_birth_day_drink.nc'
 
  integer :: stdout=6

!  include 'netcdf.inc'
  integer           :: ncid, k
  integer           :: latdim, londim, depthdim, timedim
  integer           :: vardims(4)
  integer           :: latid, lonid, depthid, timeid, ndepth=5
  integer           :: varid
  real              :: depth(5), drinks(1,1,5,1), degeast, degnorth
  real*8            :: rdays
  character (len = *), parameter :: varunit = "glasses"
  character (len = *), parameter :: varname = "number of drinks"
  character (len = *), parameter :: varshort = "drinks" 
  character (len = *), parameter :: units = "units"
  character (len = *), parameter :: long_name = "long_name"
  character (len = *), parameter :: lat_name = "latitude"
  character (len = *), parameter :: lon_name = "longitude"
  character (len = *), parameter :: lat_units = "degrees_north"
  character (len = *), parameter :: lon_units = "degrees_east"
  character (len = *), parameter :: depth_units = "m"
  character (len = *), parameter :: time_units = "days since 2000-01-01 00:00:00"
  character (len = *), parameter :: origin = "time_origin"
  character (len = *), parameter :: origin_val = "1-jan-2000 00:00:00" 
! -----------------------------------------------------------------------
!   define where and when the diver dives and 
!   in which depth he has how much birthday drinks
! -----------------------------------------------------------------------

    degnorth = 57.02
    degeast  = 20.3
    rdays    = 10.0
    do k=1, 5
      depth(k)  = float(k)*float(k)
      drinks(1,1,k,1) = depth(k)
    enddo
 
! -----------------------------------------------------------------------
!   create the file
! -----------------------------------------------------------------------
    
    call check( nf90_create(outputfile, nf90_clobber, ncid))
    write(stdout,*) 'file ',trim(outputfile),' has been created '
! -----------------------------------------------------------------------
!   define axis
! -----------------------------------------------------------------------

    call check( nf90_def_dim(ncid, 'longitude', 1, londim))
    call check( nf90_def_dim(ncid, 'latitude' , 1, latdim))
    call check( nf90_def_dim(ncid, 'depth' ,    ndepth, depthdim))
    call check( nf90_def_dim(ncid, 'time'     , nf90_unlimited, timedim))
    call check( nf90_def_var(ncid, lon_name, nf90_real, londim, lonid))
    call check( nf90_def_var(ncid, lat_name, nf90_real, latdim, latid))
    call check( nf90_def_var(ncid, 'depth',     nf90_real, depthdim, depthid))
    call check( nf90_def_var(ncid, 'time',      nf90_real, timedim, timeid))

    call check( nf90_put_att(ncid, latid, units, lat_units) )
    call check( nf90_put_att(ncid, lonid, units, lon_units) )
    call check( nf90_put_att(ncid, depthid, units, depth_units))
    call check( nf90_put_att(ncid, timeid,  units, time_units))
    call check( nf90_put_att(ncid, timeid,  origin, origin_val))
 
    vardims(1) = londim
    vardims(2) = latdim
    vardims(3) = depthdim
    vardims(4) = timedim
    
! -----------------------------------------------------------------------
!   define variables
! -----------------------------------------------------------------------
    call check( nf90_def_var(ncid, trim(varshort), nf90_real, vardims, varid))
    call check( nf90_put_att(ncid, varid, units ,trim(varunit)))
    call check( nf90_put_att(ncid, varid, long_name, trim(varname)))

    call check( nf90_enddef(ncid))
! -----------------------------------------------------------------------
!   now write something
! -----------------------------------------------------------------------

    call check( nf90_put_var(ncid, latid, degnorth))
    call check( nf90_put_var(ncid, lonid, degeast))
    call check( nf90_put_var(ncid, depthid, depth))
    
    call check( nf90_put_var(ncid, timeid, rdays))
    
    call check( nf90_put_var(ncid, varid, drinks))
!-----------------------------------------------------------------------
!   ready
!-----------------------------------------------------------------------
    call check( nf90_close(ncid))

contains
  subroutine check(status)
    integer, intent ( in) :: status
    
    if(status /= nf90_noerr) then 
      print *, trim(nf90_strerror(status))
      stop "stopped"
    end if
  end subroutine check  
end program test_netcdf

NFFT

Discrete Fourier transform (DFT) in one or more dimensions

General Information

NFFT is a C subroutine library for computing the discrete Fourier transform (DFT) at nonequispaced nodes in one or more dimensions.

Read more on the NFFT home page. For a manual, consult the online manual or download the pdf.

Versions

| Version | Build Date | Installation Path | modulefile | compiler |
|---------|------------|-------------------|------------|----------|
| 3.5.1 | 20-feb-2020 | /sw/numerics/nfft/gcc.8.3.0/3.5.1/ | | gcc/8.3.0 |
| 3.5.1 | 20-feb-2020 | /sw/numerics/nfft/gcc.9.2.0/3.5.1/ | nfft/gcc/3.5.1 | gcc/9.2.0 |
| 3.5.1 | 20-feb-2020 | /sw/numerics/nfft/intel.19/3.5.1/ | nfft/intel/3.5.1 | intel/19.0.1 |

All interface types are installed. Both the shared and the static versions are available.

Usage of Nfft at HLRN

$ module load nfft/version

It sets:

  • LD_RUN_PATH
  • PKG_CONFIG_PATH

It loads the related fftw3 module.

$ module show nfft/version

delivers the path to the libraries.

Installation at HLRN

NFFT is built from source. The current version is built with several compilers, using high-end optimisation. read more

Subsections of NFFT

Building NFFT

Intel

module load intel/19.0.5
module load fftw3/intel/3.3.8
export CC=icc
export CXX=icpc
export F77=ifort
export COMPARCH="-xCORE-AVX512 -qopt-zmm-usage=high"
parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR
export PREFIX=$BUILDDIR
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa
export CFLAGS=" -fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH" 
export CXXFLAGS="-fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH" 
export LIBS="-L/sw/numerics/fftw3/intel/3.3.8/skl/lib/" 
./configure --prefix=$PREFIX --enable-all --enable-openmp \ 
  --with-fftw3=/sw/numerics/fftw3/intel/3.3.8/skl \ 
  --with-fftw3-libdir=/sw/numerics/fftw3/intel/3.3.8/skl/lib \ 
  --with-fftw3-includedir=/sw/numerics/fftw3/intel/3.3.8/skl/include 

echo "Press ENTER to compile"; read ttt 
make -j4 
make check 
echo "Press ENTER to install"; read ttt 
make install 
echo "Do not forget to make clean"

GCC

module load gcc/9.2.0
module load fftw3/gcc.9/3.3.8
export CC=gcc
export CXX=g++
export F77=gfortran
export COMPARCH="-march=skylake-avx512"

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=$BUILDDIR
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export CFLAGS="  -fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="-fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH"
export LIBS="-L/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl/lib/"

./configure --prefix=$PREFIX --enable-all --enable-openmp \
--with-fftw3=/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl \
--with-fftw3-libdir=/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl/lib \
--with-fftw3-includedir=/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl/include

echo "Press ENTER to compile"; read ttt
make -j4
make check
echo "Press ENTER to install"; read ttt
make install

nocache

nocache - minimize caching effects in lustre filesystems

General Information

The nocache tool tries to minimize the effect an application has on the Linux file system cache. This is done by intercepting the open and close system calls and calling posix_fadvise with the POSIX_FADV_DONTNEED parameter. Because the library remembers which pages (i.e., 4K blocks of the file) were already in the file system cache when the file was opened, these will not be marked as "don't need", because other applications might still need them even though they are not actively used (think: hot standby).

Use case: backup processes that should not interfere with the present state of the cache.

Use case: staging of large amount of data in a lustre file system before a parallel job

Read more on github

| Version | Compiler | Build date | Installation Path | modulefile |
|---------|----------|------------|-------------------|------------|
| 1.1 | gcc | 19-aug-2019 | /sw/tools/nocache/1.1 | nocache/1.1 |

For some more information consult the man-page.

Usage at HLRN

nocache has been found to resolve the Lustre issue where temporarily invalid files are produced when staging huge amounts of data before a parallel job step.

Load the modulefile

$ module load nocache

This provides access to the script nocache and the binaries cachedel and cachestats. The corresponding man - pages become available.

Prepend nocache before file copy operations

$ nocache cp <source> <target> 

Building nocache

Installation includes

  • download from github
  • run make and make install - see run_make in the installation path.

License conditions

See https://github.com/Feh/nocache/blob/master/COPYING

Numerics

  • BLAS — BLAS (Basic Linear Algebra Subprograms)
  • FFTW3 — A C-subroutine library for computing discrete Fourier transforms
  • GSL — The GNU Scientific Library (GSL)- a numerical library for C and C++ programmers
  • MUMPS — MUltifrontal Massively Parallel sparse direct Solver.
  • NFFT — Discrete Fourier transform (DFT) in one or more dimensions
  • ScaLAPACK — Scalable LAPACK
  • Scotch — Software package and libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning.
  • METIS – A set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices.
  • ParMETIS – An MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices.
  • PETSc – Portable, Extensible Toolkit for Scientific Computation: widely used parallel numerical software library for partial differential equations and sparse matrix computations.

Subsections of Numerics

BLAS

BLAS (Basic Linear Algebra Subprograms)

Description

The BLAS (Basic Linear Algebra Subprograms) are routines that provide standard building blocks for performing basic vector and matrix operations.

For more information visit BLAS home page.

BLAS is currently available in different modules in HLRN:

  • Intel MKL (via intel/mkl module)
  • OpenBLAS

Both provide highly optimized BLAS routines. Additionally, there is a (slow) reference LAPACK in /usr/lib64.
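A minimal linking sketch, assuming the openblas module sets LIBRARY_PATH and LD_RUN_PATH in the same way as the other library modules described on these pages (my_blas_prog.f90 is a placeholder):

module load gcc/9.2.0 openblas/gcc.9/0.3.7
gfortran -o my_blas_prog my_blas_prog.f90 -lopenblas -Wl,-rpath=$LD_RUN_PATH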

Modules

| Version | Installation Path | modulefile | compiler |
|---------|-------------------|------------|----------|
| 0.3.7 | /sw/numerics/openblas/0.3.7/skl | openblas/gcc.7/0.3.7 | gcc/7.5.0 |
| 0.3.7 | /sw/numerics/openblas/0.3.7/skl | openblas/gcc.8/0.3.7 | gcc/8.3.0 |
| 0.3.7 | /sw/numerics/openblas/0.3.7/skl | openblas/gcc.9/0.3.7 | gcc/9.2.0 |

FFTW3

A C-subroutine library for computing discrete Fourier transforms

Description

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions.

Read more on the fftw3 home page. For a manual, consult the online manual or download the pdf.

Versions

| Version | Build Date | Installation Path | modulefile | compiler |
|---------|------------|-------------------|------------|----------|
| Fftw3/3.3.7 | unknown | /cm/shared/apps/fftw/openmpi | fftw3/openmpi/gcc/64/3.3.7 | gcc |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/gcc.7.5.0/3.3.8/ | | gcc/7.5.0 |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/gcc.8.3.0/3.3.8/ | | gcc/8.3.0 |
| Fftw3/3.3.8 | 16-feb-2020 | /sw/numerics/fftw3/ompi/gcc.9.2.0/3.3.8/ | fftw3/ompi/gcc/3.3.8 | gcc/9.2.0, openmpi/gcc.9/3.1.5 |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/impi/gcc.9.2.0l/3.3.8/ | fftw3/impi/gcc/3.3.8 | gcc/9.2.0, impi/2018.5 |
| Fftw3/3.3.8 | 01-feb-2019 | /sw/numerics/fftw3/impi/intel/3.3.8/ | fftw3/impi/intel/3.3.8 | intel/19.0.1, impi/2018.5 |

The single-precision, long-double, OpenMP- and threads-enabled versions are installed. The ompi and impi installations also contain the serial libraries. Both the shared and the static versions are available.

For fftw3 with Intel compilers, please also consider using the MKL. Read more

Modules and Usage at HLRN

The library is included in several software packages. A module file gives access to a few binaries: fftwf-wisdom, fftwl-wisdom, fftw-wisdom, and fftw-wisdom-to-conf.

  • module show fftw3/version
  • module help fftw3/version

deliver details on paths and environmental variables.

  • module load fftw3/version

defines environmental variables:

  • PATH
  • LD_RUN_PATH
  • PKG_CONFIG_PATH
  • LIBRARY_PATH

The modules do not have any effect during runtime.

Precision

You may link to -lfftw3f (single) or -lfftw3l (long-double) instead of or in addition to -lfftw3 (double).

You can see all provided versions directly after you loaded a FFTW module with: ls ${LIBRARY_PATH%%:*}
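As an illustration, compiling and linking against the double- and single-precision libraries could look like this (module names are taken from the table above; the source file name is just an example):

$ module load gcc/9.2.0 openmpi/gcc.9/3.1.5 fftw3/ompi/gcc/3.3.8
$ gcc -O2 -o fft_test fft_test.c -lfftw3 -lfftw3f -lm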

Installation at HLRN

Fftw3 is built from source. The current version is built with several compilers. High-end optimisation is used. Read more

All libraries passed through the basic checks.

Subsections of FFTW3

The Fftw3-Installation at HLRN

Download:

fftw3 - downloads

Installation path:

/sw/numerics/fftw3/<mpi-version>/<compiler-version>/3.3.8/skl; untar the source and rename/move the directory to build

configure:

configure --help reveals the most important switches:

  • CC, FC, MPICC etc. to define the compiler
  • CFLAGS as a slot for compiler options; do not forget -Wl,-rpath=$LD_RUN_PATH to burn in the path to the compiler libraries. intel/18.0.3 does not set LD_RUN_PATH. The path to the fftw3 objects is burned in automatically by configure/make.
  • --enable-shared to also build shared libraries
  • --enable-single, --enable-long-double to build for different numerical accuracies
  • --enable-omp, --enable-threads, --enable-mpi

GSL

The GNU Scientific Library (GSL)- a numerical library for C and C++ programmers

Documentation

The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License.

Read more on the GSL home page.

Versions

| Version | Build Date | Installation Path | modulefile | compiler |
|---------|------------|-------------------|------------|----------|
| GSL/2.5 | 05-feb-2019 | /sw/numerics/gsl/2.5 | gsl/gcc.8/2.5 | gcc/8.2.0 |
| GSL/2.5 | 05-feb-2019 | /sw/numerics/gsl/2.5 | gsl/2.5_intel.18 | intel/18.0.5 |
| GSL/2.5 | 05-may-2019 | /sw/numerics/gsl/2.5/ | gsl/2.5_intel.19 | intel/19.0.3 |
| GSL/2.7 | 17-jun-2021 | /sw/numerics/gsl/*/2.7/ | gsl/gcc.8/2.7 | gcc/8.3.0 |

In addition, libgslcblas and the Fortran interface FGSL (1.2.0) are provided. Both the shared and the static versions are available.

For a manual consult the online manual or download the pdf. For the FORTRAN interface see FGSL wikibook.

Modules and usage of GSL at HLRN

The library is included in several software packages. A module file gives access to a few binaries: gsl-config, gsl-histogram, gsl-randist.

module load gsl/version

sets environmental variables:

  • LD_RUN_PATH
  • PATH
  • MANPATH
  • INFOPATH
  • PKG_CONFIG_PATH

module help gsl/version

delivers the path to the libraries. Sophisticated tools and programmers use gsl-config or pkg-config.
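For example, the required compile and link flags can be queried with gsl-config; a minimal sketch (module names are taken from the table above, the source file name is just an example):

$ module load gcc/8.2.0 gsl/gcc.8/2.5
$ gcc -O2 -o gsl_test gsl_test.c $(gsl-config --cflags) $(gsl-config --libs)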

Installation at HLRN

GSL is built from source. The current version is built with several compilers. High-end optimisation is used. All libraries passed the basic checks. With high-end optimisation and Intel compilers the linear algebra solvers do not converge, therefore fp-model strict is used. Read more


Subsections of GSL

Configure and make GSL

parentdir="$(dirname "$(pwd)")"
export PREFIX=$parentdir
echo "builing for "$PREFIX

module load intel/compiler/64/2019/19.0.1 
module load intel/tbb/64/2019/1.144  
module load intel/mkl/64/2019/1.144

export FC=ifort
export CC=icc
export CXX=icpc
export LD_RUN_PATH=$LD_LIBRARY_PATH

#export CFLAGS="-fPIC -O3 -Wl,-rpath=$LD_RUN_PATH"
export CFLAGS="-fPIC -O3 -fp-model strict -Wl,-rpath=$LD_RUN_PATH"
./configure --prefix=$PREFIX

echo "Press ENTER to compile"; read ttt
make -j4
echo "Press ENTER to check"; read ttt
make check
echo "Press ENTER to install"; read ttt
make install
echo "Press ENTER to clean "; read ttt
make clean

MUMPS

MUltifrontal Massively Parallel sparse direct Solver.

Description

MUMPS is a numerical software package for solving sparse systems of linear equations with many features.

Read more on MUMPS home page. For manual and user-guide visit the MUMPS User-guide page.

Modules

| Version | Installation Path | modulefile | compiler | comment |
|---------|-------------------|------------|----------|---------|
| 5.2.1 | /sw/libraries/mumps/5.2.1/skl | mumps/5.2.1 | openmpi.3.1.2-gcc.8.2.0 | |
| 5.2.1 | /sw/libraries/mumps/5.2.1/skl | mumps/5.2.1 | openmpi.3.1.5-gcc.9.2.0 | |

NFFT

Discrete Fourier transform (DFT) in one or more dimensions

General Information

NFFT is a C subroutine library for computing the nonequispaced discrete Fourier transform (NDFT) in one or more dimensions.

Read more on the NFFT home page. For a manual, consult the online manual or download the PDF.

Versions

| Version | Build Date | Installation Path | modulefile | compiler |
|---------|------------|-------------------|------------|----------|
| 3.5.1 | 20-feb-2020 | /sw/numerics/nfft/gcc.8.3.0/3.5.1/ | | gcc/8.3.0 |
| 3.5.1 | 20-feb-2020 | /sw/numerics/nfft/gcc.9.2.0/3.5.1/ | nfft/gcc/3.5.1 | gcc/9.2.0 |
| 3.5.1 | 20-feb-2020 | /sw/numerics/nfft/intel.19/3.5.1/ | nfft/intel/3.5.1 | intel/19.0.1 |

All interface types are installed. Both the shared and the static versions are available.

Usage of Nfft at HLRN

$ module load nfft/version

It sets:

  • LD_RUN_PATH
  • PKG_CONFIG_PATH

It loads the related fftw3 module.

$ module show nfft/version

delivers the path to the libraries.
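Since the module sets PKG_CONFIG_PATH, the compile and link flags can be queried with pkg-config; a minimal sketch (the pkg-config name nfft3 and the source file name are assumptions, check with pkg-config --list-all):

$ module load gcc/9.2.0 nfft/gcc/3.5.1
$ gcc -O2 -o nfft_test nfft_test.c $(pkg-config --cflags --libs nfft3)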

Installation at HLRN

Nfft is built from source. The current version is built with several compilers. High-end optimisation is used. Read more

Subsections of NFFT

Building NFFT

Intel

module load intel/19.0.5
module load fftw3/intel/3.3.8
export CC=icc
export CXX=icpc
export F77=ifort
export COMPARCH="-xCORE-AVX512 -qopt-zmm-usage=high"
parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR
export PREFIX=$BUILDDIR
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa
export CFLAGS=" -fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH" 
export CXXFLAGS="-fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH" 
export LIBS="-L/sw/numerics/fftw3/intel/3.3.8/skl/lib/" 
./configure --prefix=$PREFIX --enable-all --enable-openmp \ 
  --with-fftw3=/sw/numerics/fftw3/intel/3.3.8/skl \ 
  --with-fftw3-libdir=/sw/numerics/fftw3/intel/3.3.8/skl/lib \ 
  --with-fftw3-includedir=/sw/numerics/fftw3/intel/3.3.8/skl/include 

echo "Press ENTER to compile"; read ttt 
make -j4 
make check 
echo "Press ENTER to install"; read ttt 
make install 
echo "Do not forget to make clean"

GCC

module load gcc/9.2.0
module load fftw3/gcc.9/3.3.8
export CC=gcc
export CXX=g++
export F77=gfortran
export COMPARCH="-march=skylake-avx512"

parentdir="$(dirname "$(pwd)")"
export BUILDDIR=$parentdir
echo "building in "$BUILDDIR

export PREFIX=$BUILDDIR
echo "building for "$PREFIX
echo "Press ENTER to configure";read aaa

export CFLAGS="  -fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH"
export CXXFLAGS="-fPIC -O3 $COMPARCH -Wl,-rpath=$LD_RUN_PATH"
export LIBS="-L/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl/lib/"

./configure --prefix=$PREFIX --enable-all --enable-openmp \
--with-fftw3=/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl \
--with-fftw3-libdir=/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl/lib \
--with-fftw3-includedir=/sw/numerics/fftw3/gcc.9.2.0/3.3.8/skl/include

echo "Press ENTER to compile"; read ttt
make -j4
make check
echo "Press ENTER to install"; read ttt
make install

ScaLAPACK

Scalable LAPACK

Description

ScaLAPACK is a subset of LAPACK routines redesigned for distributed memory MIMD parallel computer.

For more information visit ScaLAPACK home page.

ScaLAPACK is currently available in different modules at HLRN:

  • Intel MKL (via intel/mkl module)
  • ScaLAPACK

Both provide highly optimized ScaLAPACK routines. Additionally, there is a (slow) reference LAPACK in /usr/lib64.
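As an illustration, a link line against the MKL ScaLAPACK with Intel MPI could look like the sketch below; module names and the source file name are examples only, and the exact MKL library list should be checked with Intel's MKL link line advisor:

$ module load intel/19.0.1 impi/2018.5
$ mpiifort -o pdgesv_test pdgesv_test.f90 \
    -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 \
    -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm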

Modules

| Version | Installation Path | modulefile | compiler |
|---------|-------------------|------------|----------|
| 2.1.0 | /sw/numerics/scalapack/2.1.0/skl | scalapack/gcc.9/2.1.0 | gcc-9.2.0 |

Scotch

Software package and libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning.

Description

Scotch is a software package and set of libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning. Read more on the Scotch home page.

Modules

| Version | Installation Path | modulefile | compiler | comment |
|---------|-------------------|------------|----------|---------|
| 6.0.7 | /sw/libraries/scotch/6.0.7/skl | scotch/6.0.7_esmumps | openmpi-3.1.2-gcc.8.2.0 | with MUMPS interfaces |
| 6.0.7 | /sw/libraries/scotch/6.0.7/skl | scotch/6.0.7_esmumps | openmpi.3.1.5-gcc.9.2.0 | with MUMPS interfaces |

Octopus

Description

Octopus is an ab initio program that describes electrons quantum-mechanically within density-functional theory (DFT) and in its time-dependent form (TDDFT) for problems with a time evolution. Nuclei are described classically as point particles. Electron-nucleus interaction is described within the pseudopotential approximation.

Octopus is free software, released under the GPL license.

More information about the program and its usage can be found on https://www.octopus-code.org/

Modules

Octopus is currently available only on Lise. The standard pseudopotentials deployed with Octopus are located in $OCTOPUS_ROOT/share/octopus/pseudopotentials/PSF/. If you wish to use a different set, please refer to the manual.

The most recent compiled version is 12.1; it has been built with the intel-oneapi compiler (v. 2021.2) and linked to Intel MKL (including FFTW).

The octopus module depends on intel/2021.2 and impi/2021.7.1.

Example Jobscripts

Assuming that your input file inp is located in the directory from which you submit the job script, and that the output is written to out, an example job script is given below.

#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=octopus
 
module load intel/2021.2 impi/2021.7.1 octopus/12.1
 
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
  
# Adjust the maximum stack size of OpenMP threads
export OMP_STACKSIZE=512m
 
# Do not use the CPU binding provided by slurm
export SLURM_CPU_BIND=none
  
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
  
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
 
mpirun octopus

Please check carefully, for your use cases, the best parallelization strategies in terms of, e.g., the number of MPI processes and OpenMP threads. Note that the variables ParStates, ParDomains and ParKPoints defined in the input file also impact the parallelization performance; a minimal sketch of setting them is shown below.
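The snippet below sketches how such parallelization variables could be appended to the input file; the values are placeholders, please consult the Octopus manual for the choices that fit your system:

# append (example) parallelization settings to the Octopus input file "inp"
cat >> inp << 'EOF'
ParStates = auto
ParDomains = auto
ParKPoints = auto
EOF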

pigz

A parallel implementation of gzip for modern multi-processor, multi-core machines

Description

pigz is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.

Read more on pigz home page. For User Manual visit the pigz documentation.

Modules

| Version | Installation Path | modulefile | compiler | comment |
|---------|-------------------|------------|----------|---------|
| 2.4 | /sw/tools/pigz/2.4/skl | pigz/2.4 | gcc.8.2.0 | |
| 2.4 | /sw/tools/pigz/2.4/skl | pigz/2.4 | gcc.9.2.0 | |
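A minimal usage sketch (file name and thread count are just examples):

$ module load pigz/2.4
$ pigz -p 8 results.tar      # compress with 8 threads, produces results.tar.gz
$ unpigz results.tar.gz      # decompress again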

PROJ

Cartographic Projections Library

General Information

Vendor: USGS
Installation Path: /sw/dataformats/proj/<version>

| Version | build date | compiler | remark |
|---------|------------|----------|--------|
| 6.2.1 | 04/2019 | intel-18 | |
| 7.1.0 | 08/2020 | gcc-8 | |
| 9.2.1 | 04/2022 | gcc-9 | default |

A library is delivered for map projections of gridded data, used for example in CDO.

Additional binaries are available: cs2cs, geod, invgeod, invproj, nad2bin, proj

When a proj-module is loaded man-pages for cs2cs, geod, proj, pj_init are available.

Versions, modulefiles and environment variables

Type

module avail proj

for a list of available versions.

The module sets the path to the binaries.
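As a quick check, a single coordinate can be transformed on the command line; a minimal sketch (the coordinates and the chosen projections are just examples):

$ module load proj
$ echo "13.40 52.52" | cs2cs +proj=longlat +datum=WGS84 +to +proj=utm +zone=33 +datum=WGS84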

Quantum ESPRESSO

Description

Quantum ESPRESSO (QE) is an integrated suite of codes for electronic structure calculations and materials modeling at the nanoscale, based on DFT, plane waves, and pseudopotentials. QE is an open initiative, in collaboration with many groups world-wide, coordinated by the Quantum ESPRESSO Foundation.

Documentation and other material can be found on the QE website.

Prerequisites

QE is free software, released under the GNU General Public License (v2). Scientific work done using the QE code should contain citations of corresponding QE references.

Only members of the q_espr user group have access to QE executables provided by HLRN. To have their user ID included in this group, users can send a message to their consultant or to NHR support.

Modules

The environment modules shown in the table below are available to include executables of the QE distribution in the directory search path. To see what is installed and what is the current default version of QE at HLRN, a corresponding overview can be obtained by saying module avail qe.

QE is a hybrid MPI/OpenMP parallel application. It is recommended to use mpirun as the job starter for QE at HLRN. An MPI module providing the mpirun command needs to be loaded ahead of the QE module.

| QE version | QE modulefile | QE requirements |
|------------|---------------|-----------------|
| 6.4.1 | qe/6.4.1 | impi/* (any version) |

Job Script Examples

  1. For Intel Cascade Lake compute nodes – plain MPI case (no OpenMP threading) of a QE job using a total of 1152 CPU cores distributed over 12 nodes, 96 tasks each. Here 3 pools (nk=3) are created for k-point parallelization (384 tasks per k-point), 3D-FFT is performed using 8 task groups (48 tasks each, nt=8).
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 12
#SBATCH --tasks-per-node 96
  
module load impi/2018.5
module load qe/6.4.1
 
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=1
 
mpirun pw.x -nk 3 -nt 8 -i inputfile > outputfile

R

R - statistical computing and graphics

Description

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

Read more on the R project home page

| Version | Build Date | Installation Path | modulefile | compiler |
|---------|------------|-------------------|------------|----------|
| R 3.5.1 (gcc) | 06-oct-2018 | /sw/viz/R/3.5.1 | R/3.5.1 | gcc/8.2.0.hlrn |
| R 3.6.2 (gcc) | 05-feb-2020 | /sw/viz/R/3.6.2 | R/3.6.2 | gcc/7.5.0 |
| R 4.0.2 (gcc) | 18-aug-2020 | /sw/viz/R/4.0.2 | R/4.0.2 | gcc/8.3.0 |
| rstudio 0.98.1102 | 01-Aug-2014 | /sw/viz/R/rstudio_1.1.453 | | |

For a manual consult the R home page.

Prerequisites

For the installation of R-packages by users with the help of rstudio or Rscript, the appropriate compiler module must be loaded in addition to the R-module.

R at HLRN

Modules

Before starting R, load a modulefile

module load R/version

This provides access to the script R that sets up an environment and starts the R binary. The corresponding man and info pages become available.

Info pages: R-admin, R-data, R-exts, R-intro, R-lang, R-FAQ, R-ints

As a programming environment, rstudio version 1.1.453 is installed and available when a module file for R is loaded. rstudio starts the version of R specified by the module file.

Running R on the frontends

This is possible, but resources and runtime are limited. Be friendly to other users and work on the shared compute nodes!

Running R on the compute nodes

Allocate capacity in the batch system, and log onto the related node:

$ salloc -N 1 -p large96:shared
$ squeue --job <jobID>

The output of salloc shows your job ID. With squeue you see the node you are going to use. Login with X11-forwarding:

$ ssh -X <nodename>

Load a module file and work interactively as usual. When ready, free the resources:

$ scancel <jobID>

You may also use srun:

$ srun -v -p large96:shared --pty --interactive bash

Do not forget to free the resources when ready.

R packages

List of installed R packages

The following packages are installed by default when a new version of R is built. Please contact support to extend this list.

  • Package list for 3.5.1

Users may request package installation via support or install in their HOME - directory.

Building R-packages - users approach

Users may install their own packages in their HOME directory from the rstudio GUI or using Rscript. R packages must be built with the same compiler as R itself was built with (see the table above). This happens when Rscript is used and the appropriate compiler module is loaded. A minimal sketch is shown below.
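A minimal sketch of a user installation into the HOME directory (package name, repository and module versions are examples; choose the compiler matching your R version from the table above):

$ module load gcc/8.3.0 R/4.0.2
$ Rscript -e 'install.packages("ggplot2", repos="https://cran.uni-muenster.de/")'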

Building R-packages - administrators approach

R administrators may use rstudio or Rscript for installation. For installing packages in /sw/viz/R it is suggested to use Rscript like

$ Rscript -e 'install.packages("'$package'",repos="'$REPOSITORY'",INSTALL_opts="--html")'

Using INSTALL_opts="--html" keeps the documentation of installed packages up to date!

This rapidly becomes work intensive when installing a huge bundle of packages or even the package set for a new R release. For convenience, we maintain a list of default packages and scripts to install them all. These are located in the installation directory:

  • install_packages,
  • install_cran
  • install_github
  • install_bioc
  • remove_package,
  • sync_wiki

These scripts also collect the workarounds needed to install stubborn packages whose developers do not support all Rscript options.

Read more

Documentation

Subsections of R

Installing R and R-packages

Installing a new R release and a bundle of default user packages is streamlined with some scripts. This text is for administrators of the R installation. Other users may find some hints here, if their installation to $HOME fails.

Installing R

  • Make a new directory. It is strongly suggested to follow the form /sw/viz/R/R-4.0.2/skl. Copy the installation scripts into this directory. Do not forget that other members of the sw-group may need to install here too.

    $ cd /sw/viz/R/R-4.0.2/skl
    $ cp ../../scripts/* .
    $ chmod g+rwX *
    
  • Edit the script get_R and update the version number. Run get_R. This downloads the requested version of R, unpacks the tar file and renames the source directory to build. Note: if you download the sources some other way, please rename the R directory to build if you want to use the script install_R.

    $ ./get_R
    
  • Check or edit the script install_R. You may change there:

    • the default compiler to be used for building R and R packages. This compiler needs to be compatible with the compiler used for external packages like netcdf, openmpi, magick etc. If you change the default compiler, please change the other scripts too - see below.
    • the options for running configure
    • compiler options, especially the degree of optimisation and rpath. These options will be part of the default settings used by Rscript when building packages.
  • Run install_R. The script will stop several times requesting ENTER to continue. This helps to see whether the configuration, compilation and all checks finished with reasonable output. Note: the script changes permissions at the end to allow other members of the sw-group to install there too.

    $ ./install_R
    
  • Produce a module file. It should be sufficient to copy that of the previous version and change the version number in the file.

Single R-package installation

Before a package is installed, the correct compiler module must be loaded; otherwise the system compiler is used. Since C++ is used a lot, this may result in an inconsistent package library. It is not necessary to load an R module file when the scripts described below are used.

Package installation may be done with the help of Rscript or directly from R or rstudio. We recommend using our scripts instead, since

  • they load the default compiler compatible with R
  • contain fixes and workarounds needed to install some packages
  • set the permissions for the installed files, which is often forgotten
  • help with bookkeeping of package lists.

The scripts produce a one-line script that is executed immediately by Rscript. It mainly employs the R function install.packages (not to be confused with our script of the same name!) or the githubinstall package.

We support three different repositories:

  • cran ( https://cran.uni-muenster.de/ ): the main repository for releases of R packages

    $ ./install_cran <packagename>
    
  • github: the detailed search for the package is done with githubinstall used by R.

    $ ./install_github <packagename>
    
  • BiocManager ( https://www.bioconductor.org/install ): a specific repository providing tools for the analysis and comprehension of high-throughput genomic data.

    $ ./install_bioc <packagename> <version>
    

    The version is optional.

If the installation is successful, the scripts ask whether the package should be added to the list of packages that will be installed by default with a new R version. In this case, one of the package lists in /sw/viz/R/package_list is updated and a copy of the previous list is stored in /sw/viz/R/package_list_save. These lists can be edited by hand.

To remove a package, you may use

$ ./remove_package <packagename>

Automatic removal from the list of default packages is not implemented yet.

Default R-packages

To install the bundle of predefined R-packages, use the script install_packages. It

  • loads the appropriate compiler module
  • employs install_cran, install_github and install_bioc to install the packages
  • sets the correct permissions for the installed files and libraries
  • documents success or failure of the installation
$ ./install_packages

The lists are located in /sw/viz/R/package_list. They can be maintained with any editor. They are the result of user requests in the past. They are over-complete, since many packages are installed in any case as prerequisite for other packages. Success or failure are reported in packages_success.log and packages_failed.log.

Using install_packages takes a while: the script needs several hours to install the full list of packages, and some prerequisites are installed several times.

Known issues

The installation of some packages fails and is reported in packages_failed.log. Repeating the installation of the individual packages has helped in all cases; the reason is unknown. Search in /sw/viz/R/package_list for the appropriate install source. Using github instead of cran may result in a long-lasting version update of all prerequisites; however, no package failure has been reported for this route.

Rmpi is linked with openmpi. The intel compilers and impi are not tested yet.

Some packages do not set rpath properly, do not accept libraries at non-default places, or do not consider the flags configure.vars and configure.args used by the R function install.packages. In this case, there are limited means to propagate information on include paths or rpath to the package installation process. One way to do this is to put the appropriate compiler flags in ~/.R/Makevars. The scripts do not overwrite this file, but save a copy first that is restored at the end of the package installation.

RELION

REgularised LIkelihood OptimisatioN is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).

Description

Read more on RELION home page.

For manual and user-guide visit the RELION tutorial page.

Modules

| Version | Installation Path | modulefile | compiler | comment |
|---------|-------------------|------------|----------|---------|
| 3.0 | /sw/chem/relion/3.0/skl/gcc/openmpi | relion/3.0-gcc | gcc.8.2-openmpi.3.1.2 | GUI enabled |
| 3.0-beta | /sw/chem/relion/3.0beta/skl | relion/3.0-beta | intel-impi-2018 | |

To use RELION in HLRN

#Load the modulefile
module load relion/3.0-gcc
#Launching GUI
relion_maingui

ScaLAPACK

Scalable LAPACK

Description

ScaLAPACK is a subset of LAPACK routines redesigned for distributed memory MIMD parallel computer.

For more information visit ScaLAPACK home page.

ScaLAPACK is currently available in different modules at HLRN:

  • Intel MKL (via intel/mkl module)
  • ScaLAPACK

Both provide highly optimized ScaLAPACK routines. Additionally, there is a (slow) reference LAPACK in /usr/lib64.

Modules

| Version | Installation Path | modulefile | compiler |
|---------|-------------------|------------|----------|
| 2.1.0 | /sw/numerics/scalapack/2.1.0/skl | scalapack/gcc.9/2.1.0 | gcc-9.2.0 |

Scotch

Software package and libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning.

Description

Scotch is a software package and set of libraries for sequential and parallel graph partitioning, static mapping, sparse matrix block ordering, and sequential mesh and hypergraph partitioning. Read more on the Scotch home page.

Modules

| Version | Installation Path | modulefile | compiler | comment |
|---------|-------------------|------------|----------|---------|
| 6.0.7 | /sw/libraries/scotch/6.0.7/skl | scotch/6.0.7_esmumps | openmpi-3.1.2-gcc.8.2.0 | with MUMPS interfaces |
| 6.0.7 | /sw/libraries/scotch/6.0.7/skl | scotch/6.0.7_esmumps | openmpi.3.1.5-gcc.9.2.0 | with MUMPS interfaces |

STAR-CCM+

A Package for Computational Fluid Dynamics Simulations

General Information

Producer: Siemens PLM Software (formerly CD-adapco Group)

Note

This documentation describes the specifics of installation and usage of STAR-CCM+ at HLRN. Introductory courses for STAR-CCM+ as well as courses for special topics are offered by CD-adapco and their regional offices, e.g. in Germany. It is strongly recommended to take at least an introductory course (please contact Siemens PLM Software).

Modules

The following table lists the installed STAR-CCM+ versions.

| Version | Module File | Remarks |
|---------|-------------|---------|
| 14.02.012-R8 | starccm/12.04.012-r8 | double precision version |
| 14.04.011-R8 | starccm/14.04.011-r8 | double precision version |
Note

The module name is starccm. Other versions may be installed. Inspect the output of:
module avail starccm

Functionality

STAR-CCM+ is a powerful finite-volume-based program package for modelling of fluid flow problems. (The name STAR stands for “Simulation of Turbulent flow in Arbitrary Regions”.) The STAR-CCM+ package can be applied to a wide range of problems such as

  • Aerospace Engineering
  • Turbomachinery
  • Chemical Process Engineering
  • Automotive Engineering
  • Building and Environment Engineering

Conditions for Usage and Licensing

All usage of STAR-CCM+ products at HLRN is strictly limited to teaching and academic research for non-industry funded projects only.

In order to run STAR-CCM+ on HLRN-IV, you have to specify the parameters -licpath and -podkey, as shown in the example script below. Users with their own licenses can specify the parameters to point to their own licenses.

Note

To use STAR-CCM+ you need to mail nhr-support@gwdg.de and ask to become a member of the UNIX group adapco. In the same email you may apply for a Power On Demand (POD) license key by stating the estimated amount of wallclock time.

Details of the HLRN Installation of STAR-CCM+

STAR-CCM+ is installed below /sw/eng/starccm/. We provide module files which make all environment settings for the use of a specific STAR-CCM+ version.

STAR-CCM+ products come with complete documentation. The User Guide is available in PDF format, see directory /sw/eng/starccm/<version>/STAR-CCM+<version>/doc.

Example Jobscripts

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH -p medium
#SBATCH --mail-type=ALL
#SBATCH --job-name=StarCCM
 
module load starccm/14.04.011-r8
 
## create the host list for starccm+
srun hostname -s | sort | uniq -c | awk '{ print $2":"$1 }' > starhosts.${SLURM_JOB_ID}
 
export CDLMD_LICENSE_FILE=<port@licenseserver>
export PODKEY=<type your podkey here>
export MYCASE=<type your sim file name>
 
## run starccm+
starccm+ -dp -np ${SLURM_NTASKS} -batch ${MYCASE} \
 -power -podkey ${PODKEY} -licpath ${CDLMD_LICENSE_FILE} \
 -machinefile starhosts.${SLURM_JOB_ID} -mpi intel
 
echo '#################### StarCCM+ finished ############'
rm starhosts.$SLURM_JOB_ID
Note

Despite the fact that -machinefile starhosts.$SLURM_JOB_ID is used, you have to specify the number of worker processes (-np).

Tutorial Cases for STAR-CCM+

Tutorial case files can be found in /sw/eng/starccm/<version>/STAR-CCM+<version>/doc/startutorialsdata and (with solutions) in /sw/eng/starccm/<version>/STAR-CCM+<version>/tutorials; verification data is in /sw/eng/starccm/<version>/STAR-CCM+<version>/VerificationData.

Szip

Szip, fast and lossless compression of scientific data

Documentation

Szip is a freeware, portable, general-purpose lossless compression program. It offers high speed and good compression, but also has high memory demands.

The Szip library is now replaced by the aec library.

read more

Using Szip compression in HDF5: Szip is a stand-alone library that is configured as an optional filter in HDF5. Depending on which Szip library is used (encoder enabled or decode-only), an HDF5 application can create, write, and read datasets compressed with Szip compression, or can only read datasets compressed with Szip.

Applications use Szip by setting Szip as an optional filter when a dataset is created. If the Szip encoder is enabled with the HDF5 library, data is automatically compressed and decompressed with Szip during I/O. If only the decoder is present, the HDF5 library cannot create and write Szip-compressed datasets, but it automatically decompresses Szip-compressed data when data is read.
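To see whether a dataset in an HDF5 file was compressed with Szip, the filter information can be inspected with h5dump; a minimal sketch (the module and file names are examples, any HDF5 installation providing h5dump will do):

$ module load hdf5
$ h5dump -p -H data.h5 | grep -i -A2 szip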

read more

Download the code from HDF Group.

Versions

Version 2.2.1 is installed for all relevant compilers. Find the library in /sw/dataformats/szip/2.1.1/skl. Note the license restriction.

License

Szip may be used for scientific purposes in conjunction with HDF data handling. read more.

Modules

There is no module file yet.

Building

The szip libraries are built with autotools. High optimisation is enabled and all tests pass. Please see the file run_configure in the build directory.

UDUNITS2

Unidata UDUNITS2 Package, Conversion and manipulation of units

Description

Conversion of unit specifications between formatted and binary forms, arithmetic manipulation of units, and conversion of values between compatible scales of measurement. The Udunits2 package supports units of physical quantities (e.g., meters, seconds). Specifically, it supports conversion between string and binary representations of units, arithmetic manipulation of units, and conversion of numeric values between compatible units. Udunits2 is used by several other packages at HLRN.

Vendor: Unidata
Installation Path: /sw/dataformats/udunits/

| Version | compiler |
|---------|----------|
| 2.2.26 | gcc-7 |
| 2.2.26 | gcc-8 |
| 2.2.26 | intel |
| 2.2.28 | gcc-9 |
| 2.2.28 | intel.22 |
  • The udunits home page.
  • If an udunits module is loaded an info page is available for the keywords udunits2, udunits2lib and udunits2prog.

Modules

To activate the package, issue module load udunits from the command line or put this line into your login shell. For more versions see module avail udunits

Examples

To activate udunits type

module load udunits/2.1.24_intel

Direct calls of the udunits binary will be of minor importance. After loading the module, one may try

udunits
You have: J
You want: cal   
    <cal> = <J>*0.238846
    <cal> = <J>/4.1868

VASP

Description

The Vienna Ab initio Simulation Package (VASP) is a first-principles code for electronic structure calculations and molecular dynamics simulations in materials science and engineering. It is based on plane wave basis sets combined with the projector-augmented wave method or pseudopotentials. VASP is maintained by the Computational Materials Physics Group at the University of Vienna.

More information is available on the VASP website and from the VASP wiki.

Usage Conditions

Access to VASP executables is restricted to users satisfying the following criteria. The user must

  • be member of a research group owning a VASP license,
  • be registered in Vienna as a VASP user of this research group,
  • employ VASP only for work on projects of this research group.

Only members of the groups vasp5_2 or vasp6 have access to VASP executables. To have their user ID included in these groups, users can ask their consultant or submit a support request. It is recommended that users make sure that they already got registered in Vienna beforehand as this will be verified. Users whose research group did not upgrade its VASP license to version 6.x cannot become member of the vasp6 group.

Modules

VASP is an MPI-parallel application. We recommend to use mpirun as the job starter for VASP. The environment module providing the mpirun command associated with a particular VASP installation needs to be loaded ahead of the environment module for VASP.

| VASP Version | User Group | VASP Modulefile | MPI Requirement | CPU / GPU | Lise / Emmy |
|--------------|------------|-----------------|-----------------|-----------|-------------|
| 5.4.4 with patch 16052018 | vasp5_2 | vasp/5.4.4.p1 | impi/2019.5 | ✅ / ❌ | ✅ / ✅ |
| 6.4.1 | vasp6 | vasp/6.4.1 | impi/2021.7.1 | ✅ / ❌ | ✅ / ❌ |
| 6.4.1 | vasp6 | vasp/6.4.1 | nvhpc-hpcx/23.1 | ❌ / ✅ | ✅ / ❌ |
| 6.4.2 | vasp6 | vasp/6.4.2 | impi/2021.7.1 | ✅ / ❌ | ✅ / ❌ |

N.B.: VASP version 6.x has been compiled with support for OpenMP, HDF5, and Wannier90. The CPU versions additionally support Libxc, and version 6.4.2 also includes the DFTD4 van-der-Waals functional.

Executables

Our installations of VASP comprise the regular executables (vasp_std, vasp_gam, vasp_ncl) and, optionally, community driven modifications to VASP as shown in the table below. They are available in the directory added to the PATH environment variable by one of the vasp environment modules.

| Executable | Description |
|------------|-------------|
| vasp_std | multiple k-points (formerly vasp_cd) |
| vasp_gam | Gamma-point only (formerly vasp_gamma_cd) |
| vasp_ncl | non-collinear calculations, spin-orbit coupling (formerly vasp) |
| vaspsol_[std\|gam\|ncl] | as above, built with the VASPsol solvation extension |
| vasptst_[std\|gam\|ncl] | as above, built with the VTST (transition state tools) extension |
| vasptstsol_[std\|gam\|ncl] | as above, built with both VTST and VASPsol |

N.B.: The VTST script collection is not available from the vasp environment modules. Instead, it is provided by the vtstscripts environment module(s).

Example Jobscripts

#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes 2
#SBATCH --tasks-per-node 40
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load vasp/5.4.4.p1
 
mpirun vasp_std
#!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes 2
#SBATCH --tasks-per-node 96
 
export SLURM_CPU_BIND=none
 
module load impi/2019.5
module load vasp/5.4.4.p1
 
mpirun vasp_std

The following job script exemplifies how to run vasp 6.4.1 making use of OpenMP threads. Here, we have 2 OpenMP threads and 48 MPI tasks per node (the product of these 2 numbers should ideally be equal to the number of CPU cores per node).

In many cases, running VASP with parallelization over MPI alone already yields good performance. However, certain application cases can benefit from hybrid parallelization over MPI and OpenMP. A detailed discussion is found here. If you opt for hybrid parallelization, please pay attention to process pinning, as shown in the example below.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=48
#SBATCH --cpus-per-task=2
#SBATCH --partition=standard96
 
export SLURM_CPU_BIND=none
 
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
 
# Adjust the maximum stack size of OpenMP threads
export OMP_STACKSIZE=512m
 
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
 
module load impi/2021.7.1
module load vasp/6.4.1  
 
mpirun vasp_std

In the following example, we show a job script that will run on the Nvidia A100 GPU nodes (Berlin). Per default, VASP will use one GPU per MPI task. If you plan to use 4 GPUs per node, you need to set 4 MPI tasks per node. Then, set the number of OpenMP threads to 18 to speed up your calculation. This, however, also requires proper process pinning.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --partition=gpu-a100
 
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
 
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
 
# Avoid hcoll as MPI collective algorithm
export OMPI_MCA_coll="^hcoll"
 
# You may need to adjust this limit, depending on the case
export OMP_STACKSIZE=512m
 
module load nvhpc-hpcx/23.1
module load vasp/6.4.1 
 
# Carefully adjust ppr:2, if you don't use 4 MPI processes per node
mpirun --bind-to core --map-by ppr:2:socket:PE=${SLURM_CPUS_PER_TASK} vasp_std

Visualization tools

  • GraDS — An interactive desktop tool for easy access, manipulation, and visualization of earth science data
  • NCL
  • NcView — Ncview - a visual browser for netCDF formated files.
  • pyfesom2 — Python library and tools for handling of FESOM2 ocean model output

Subsections of Visualization tools

GraDS

An interactive desktop tool for easy access, manipulation, and visualization of earth science data

General information

The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data.

Documentation

A detailed description is found at the GrADS home page.

Versions

2.0.2.0b, 2.2.1

Modulefiles and environmental variables

NCL

NCAR Command Language and NCAR Graphics

General information

NCL is an interpreted language designed specifically for scientific data analysis and visualization

| Version | Build Date | Installation Path | modulefile | compiler | license |
|---------|------------|-------------------|------------|----------|---------|
| 6.5.0 | 16. Jul. 2018 | /sw/viz/ncl/6.5.0 | yes | precompiled binary | license information |

Documentation

  • Detailed description is found at the NCL home page,
  • use the option --help to see an overview of relevant options

Usage at HLRN

Modulefiles and environmental variables

  • load a module to activate the path to the binaries
  • loading the module files implies setting NCARG_ROOT

Program start

  • after loading the module, start by issuing ncl
  • the following binaries are provided:

cgm2ncgm fcaps idt ncargf90 ncarlogo2ps ncl_filedump nhlf90 pswhite rassplit tlocal ConvertMapData findg MakeNcl ncargfile ncarvversion ncl_grib2nc nnalg pwritxnt rasstat WRAPIT ctlib fontc med ncargpath ncgm2cgm ncl.xq.fix pre2ncgm ras2ccir601 rasview wrapit77 ctrans gcaps ncargcc ncargrun ncgmstat ng4ex pre2ncgm.prog rascat scrip_check_input WriteLineFile ESMF_RegridWeightGen graphc ncargex ncargversion ncl nhlcc psblack rasgetpal tdpackdemo WriteNameFile ezmapdemo ictrans ncargf77 ncargworld ncl_convert2nc nhlf77 psplit rasls tgks0a WritePlotcharData

HLRN specific installation

Installation from a precompiled binary. The binary is downloaded from the download page.

Linked netcdf library version: netcdf 4.6.1, OPeNDAP enabled

NcView

Ncview - a visual browser for netCDF formated files.

Description

Ncview is a visual browser for netCDF format files.

Latest and default version: 2.1.7
Vendor: Scripps Institution of Oceanography
Author: David W. Pierce
Build Date: 03-Oct-2018
Installation Path: /sw/viz/ncview/

Documentation

  • Issuing ncview without options shows the list of command line options
  • there is a man page
  • The Ncview HOME-page

Usage of the Ncview at HLRN

To activate the package, issue

module load ncview

Start ncview with

ncview

Installation

  • see the file run_configure in the build-directory for configuration and installation steps.
  • man pages and app-defaults are not installed by make install; this is done by hand. The app-defaults defined in Ncview are not found.

pyfesom2

Python library and tools for handling of FESOM2 ocean model output

Description

Pyfesom2 provides a handy Python library and a collection of tools for basic handling of FESOM2 ocean model output. Please see https://pyfesom2.readthedocs.io for documentation and https://github.com/FESOM/pyfesom2 for the source code.

Prerequisites

Pyfesom2 is free software, no prerequisites are needed.

Modules

The Python installation providing pyfesom2 is made available with “module load pyfesom2”.

Wannier90

Description

Version 2.1.0 of Wannier90 is available on Lise and Emmy. For the documentation, please see: https://www.wannier.org/

Prerequisites

Intel MPI: 2019 or newer.

Modules

The module wannier90/2.1.0 makes the following executables available for use: wannier90.x and postw90.x. Also, the library libwannier.a (inside $WANNIER90_ROOT) is available, and can be linked against a desired code.

As stated in the documentation, wannier90.x calculates the maximally-localised Wannier functions and is a serial executable. postw90.x can take the Wannier functions computed by wannier90.x and calculate several properties. postw90.x can be executed in parallel through MPI.
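A minimal usage sketch (the seedname silicon and the MPI module version are just examples):

$ module load impi/2021.7.1 wannier90/2.1.0
$ wannier90.x silicon        # serial computation of the Wannier functions
$ mpirun postw90.x silicon   # parallel post-processing with MPI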

Subsections of Applications

Ansys Suite

The full Ansys Academic Multiphysics Campus Solution is available, e.g. Mechanical, CFX, Fluent, LS-Dyna, Electronic, SCADE (but not Lumerical, GRANTA).

To see if (new) software products are included in Ansys please check Ansys Academic Product Reference Guide.

Below you find explanations to obtain and check out product licenses and regarding support and training.

For additional information and minimal examples regarding specific products see:

  • CFX — Computational fluid dynamics solver focused on turbo-machinery (vertex-centered FVM)
  • Fluent — General computational fluid dynamics solver (cell-centered FVM). GPUs are supported.
  • LS-DYNA — Crash Simulation Software - finite element and multiphysics program to analyse the nonlinear response of structures
  • Mechanical — Ansys Mechanical Package (coupled physics simulation) including Autodyn, Fatigue Module, Asas, Aqwa, etc.

Introduction and courses

Note

This documentation only covers the specifics of the usage of ANSYS on our system. General introductory courses as well as courses for special topics are offered by ANSYS Inc. and their regional offices. We recommend taking an introductory course first. Good (free) starting points for self-study are https://students.cadfem.net/de/ansys-gratis-lernen.html and https://courses.ansys.com.

Documentation and materials

Note

If you are a member of the Ansys user group (see details how to become a member below under Usage and Licensing) you can access on blogin:

  • the official PDF documentation: /sw/eng/ansys_inc/v231/doc_manuals/v231
  • tutorials: /sw/eng/ansys_inc/v231/doc_tutorials

Usage and licensing

Warning

Academic use only
The use of Ansys is restricted to members of the Ansys user group.

You can apply to become a group member at nhr-support@gwdg.de. You must fulfill the Ansys license conditions. In short: our academic license is restricted to research, student instruction, student projects and student demonstrations. It cannot be used in projects that are financed by industrial partners.
To check if you are a group member you can type: groups

Info

Slurm flag
Always add #SBATCH -L ansys to your job script.

The flag “#SBATCH -L ansys” ensures that the scheduler only starts jobs when licenses are available. You can check the availability yourself: scontrol show lic

  • aa_r is an “ANSYS Academic Research License” with 4 inclusive tasks. Research jobs with more than 4 tasks cost additional “aa_r_hpc” licenses.

  • aa_t_a is an “ANSYS Academic Teaching License” with a maximum of 4 tasks. These may be used only for student projects, student instruction and student demonstrations. Eligible users are allowed to activate these, by adding the flag

    -lpf $ANSYSLIC_DIR/prodord/license.preferences_for_students_and_teaching.xml
    

    to the Ansys executable. The path $ANSYSLIC_DIR is provided after loading any Ansys module.

Installed versions

To see available versions type: module avail ansys

Subsections of Ansys Suite

CFX

Computational fluid dynamics solver focused on turbo-machinery (vertex-centered FVM)

General Information

To obtain and checkout a product license please read Ansys Suite first.

Documentation and Tutorials

Besides the official documentation and tutorials (see Ansys Suite), another alternative source is: https://cfd.ninja/tutorials

Example Jobscripts

Example input files e.g. StaticMixer.def can be found at $CFXROOT/examples (after loading the Ansys module).

#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH -L ansys
#SBATCH -p medium96s
#SBATCH --mail-type=ALL
#SBATCH --job-name=StaticMixer

module load ansys/2023.2

## create list of hosts in calculation
srun hostname -s > hostlist.$SLURM_JOB_ID

## format the host list for cfx
cfxhostlist=`tr '\n' ',' < hostlist.$SLURM_JOB_ID`

echo $cfxhostlist

# start the solver
cfx5solve -def StaticMixer.def -start-method "Intel MPI Distributed Parallel" \
-double -affinity "explicit" -par-dist "$cfxhostlist"

echo '#################### CFX finished ############'
sleep 2
rm hostlist.$SLURM_JOB_ID
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16   
#SBATCH -L ansys
#SBATCH --partition=standard96:test
#SBATCH --job-name=testjob

hostlist_per_cfx5solve=$SLURM_JOB_NODELIST"*8"
echo "hostlist_per_cfx5solve "$hostlist_per_cfx5solve

module load ansys/2023.2
# cfx5solve -help

# start the solver
cfx5solve -def StaticMixerA.def -start-method "Intel MPI Distributed Parallel" \
        -double -par-dist "$hostlist_per_cfx5solve" -name $SLURM_JOB_NAME.$SLURM_JOB_ID.a &

echo "first cfx5solve is running in background"

cfx5solve -def StaticMixerB.def -start-method "Intel MPI Distributed Parallel" \
        -double -affinity "explicit" -par-dist "$hostlist_per_cfx5solve" -name $SLURM_JOB_NAME.$SLURM_JOB_ID.b

# wait for all children processes (background jobs) to finish
wait

echo '#################### CFX finished ############'

Fluent

General computational fluid dynamics solver (cell-centered FVM). GPUs are supported.

General Information

To obtain and checkout a product license please read Ansys Suite first.

Documentation and Tutorials

Besides the official documentation and tutorials (see Ansys Suite), another alternative source is: https://cfd.ninja/tutorials

Example Jobscripts

The underlying test cases are:

#!/bin/bash
#SBATCH -t 00:50:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=20
#SBATCH -p medium96s 
#SBATCH -L ansys
#SBATCH --mail-type=ALL
#SBATCH --output="cavity.log.%j"
#SBATCH --job-name=cavity_on_cpu

module load openmpi ansys/2023.2

srun hostname -s | sort | uniq -c | awk '{printf $2":"$1","}' > hostfile

echo "Running on nodes: ${SLURM_JOB_NODELIST}"

fluent 2d -g -t${SLURM_NTASKS_PER_NODE} -ssh  -mpi=openmpi -pib -cnf=hostfile << EOFluentInput >cavity.out.$SLURM_JOB_ID
      ; this is an Ansys journal file aka text user interface (TUI) file
      file/read-case initial_run.cas.h5
      parallel/partition/method/cartesian-axes 2
      file/auto-save/append-file-name time-step 6
      file/auto-save/case-frequency if-case-is-modified
      file/auto-save/data-frequency 10
      file/auto-save/retain-most-recent-files yes
      solve/initialize/initialize-flow
      solve/iterate 100
      exit
      yes
EOFluentInput

echo '#################### Fluent finished ############'
#!/bin/bash
#SBATCH -t 00:59:00
#SBATCH --nodes=1
#SBATCH --partition=grete
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:1             # number of GPUs per node - ignored if exclusive partition with 4 GPUs
#SBATCH --gpu-bind=single:1      # bind each process to its own GPU (single:<tasks_per_gpu>)
#SBATCH -L ansys
#SBATCH --output="slurm-log.%j"

module load gcc ansys/2023.2
hostlist=$(srun hostname -s | sort | uniq -c | awk '{printf $2":"$1","}')
echo "Running on nodes: $hostlist"

cat <<EOF >tui_input.jou
file/read-cas nozzle_gpu_supported.cas.h5
solve/initialize/hyb-initialization
solve/iterate 100 
file/write-case-data outputfile1
file/export cgns outputfile2 full-domain yes pressure temperature x-velocity y-velocity mach-number
quit
exit
EOF

fluent 3ddp -g -cnf=$hostlist -t${SLURM_NTASKS} -gpu -nm -i tui_input.jou \
       -mpi=openmpi -pib -mpiopt="--report-bindings --rank-by core" >/dev/null 2>&1
echo '#################### Fluent finished ############'
#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:4
#SBATCH -L ansys
#SBATCH -p grete
#SBATCH --output="slurm.log.%j"
#SBATCH --job-name=cavity_on_gpu

module load ansys/2023.2

hostlist=$(srun hostname -s | sort | uniq -c | awk '{printf $2":"$1","}')
echo "Running on nodes: $hostlist"

cat <<EOF >fluent.jou
; this is an Ansys journal file aka text user interface (TUI) file
parallel/gpgpu/show
file/read-case initial_run.cas.h5
solve/set/flux-type yes
solve/iterate 100
file/write-case-data outputfile
exit
EOF

fluent 2d -g -t${SLURM_NTASKS} -gpgpu=4 -mpi=openmpi -pib -cnf=$hostlist -i fluent.jou  >/dev/null 2>&1
echo '#################### Fluent finished ############'

Your job can be offloaded if parallel/gpgpu/show marks the selected devices with a “(*)”. Your job was offloaded successfully if the actual call of your solver prints “AMG on GPGPU”.

Note

Ansys only supports certain GPU vendors/models: https://www.ansys.com/it-solutions/platform-support/previous-releases Look there for the PDF called “Graphics Cards Tested” for your version (most Nvidia, some AMD).

Note

The number of CPU-cores (e.g. ntasks-per-node=Integer*GPUnr) per node must be an integer multiple of the GPUs (e.g. gpgpu=GPUnr) per node.

Fluent GUI: to setup your case at your local machine

Unfortunately, the case setup is most convenient with the Fluent GUI only. Therefore, we recommend doing all necessary GUI interactions on your local machine beforehand. As soon as the case setup is complete (geometry, materials, boundaries, solver method, etc.), save it as a *.cas file. After copying the *.cas file to the working directory of the supercomputer, this prepared case (incl. the geometry) just needs to be read [file/read-case], initialized [solve/initialize/initialize-flow], and finally executed [solve/iterate]. Above, you will find examples of *.jou (TUI) files in the job scripts.

If you cannot set up your case input files (*.cas) by other means, you may start a Fluent GUI on our compute nodes as a last resort. But be warned: to keep the OS images on the compute nodes fast and small, only a minimal set of graphics drivers/libraries is available; X-window interactions involve high latency.

srun -N 1 -p standard96:test -L ansys --x11 --pty bash
 
# wait for node allocation, then run the following on the compute node
 
export XDG_RUNTIME_DIR=$TMPDIR/$(basename $XDG_RUNTIME_DIR); mkdir -p $XDG_RUNTIME_DIR
module load ansys/2023r1
fluent &

LS-DYNA

Crash Simulation Software - finite element and multiphysics program to analyse the nonlinear response of structures

Support and Examples

Note

LS-DYNA specific support is provided by www.dynasupport.com
For official examples visit: www.dynaexamples.com

General Information

To obtain and checkout a product license please read Ansys Suite first.

| executable / flag | meaning |
|-------------------|---------|
| lsdyna | (without flags) single precision, shared memory parallel |
| -dp | double precision |
| -mpp -dis -np 96 | distributed memory parallel on a single 96-core node |
| -mpp -dis -machines 1st_node_name:96:2nd_node_name:96 | distributed memory parallel on multiple 96-core nodes |

To explore other settings, we recommend the general Ansys launcher GUI (start an interactive job as described in Quickstart, call “launcher” on compute node after loading the Ansys module). Running the (test) job from this GUI will print out your specialized terminal command. Then you can copy the resulting command/flags to your specific Slurm job script file. To stop an interactive LS-DYNA job press CTRL+C and then “stop”+ENTER.

Example Jobscript

The underlying input file PipeWhip.k can be found at www.dynaexamples.com/introduction/Introduction/example-26

#!/bin/bash
#SBATCH -t 01:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=48
#SBATCH -L ansys
#SBATCH -p standard96s:test
#SBATCH --job-name=Whip

# format the machine list (node1:ppn,node2:ppn,...)
machines=""
for i in $(scontrol show hostnames=$SLURM_JOB_NODELIST); do
        machines=$machines,$i:$SLURM_NTASKS_PER_NODE
done
machines=${machines:1}    # strip the leading comma

# alternative way to build an equivalent machine list from srun output
srun hostname -s | sort | uniq -c | awk '{printf $2":"$1","}' > hostfile
machines=`cat hostfile`
echo "Running on nodes: $machines"

module load ansys/2023.2

lsdyna pr=dyna I=ex_26_thin_shell_elform_16.k -dp -dis -machines=$machines

echo "############### ANSYS LS-DYNA finished ################"

Mechanical

Ansys Mechanical Package (coupled physics simulation) including Autodyn, Fatigue Module, Asas, Aqwa, etc.

General Information

To obtain and checkout a product license please read Ansys Suite first.

Documentation and Tutorials

Note

To access the official documentation and tutorials, read Ansys Suite. More examples can be found and downloaded for example at: https://courses.ansys.com/index.php/courses/getting-started-with-ansys-mechanical/lessons/how-to-navigate-the-ansys-mechanical-ui-lesson-2

Example Jobscripts

#!/bin/bash
#SBATCH -t 01:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=40
#SBATCH -L ansys
#SBATCH -p medium96s:test
#SBATCH --mail-type=ALL
#SBATCH --job-name=casting

module load ansys/2023.2

ansys232 -np $SLURM_NTASKS -b -dis -i ds.dat -o ds.out.$SLURM_JOB_ID

echo "############### ANSYS Mechanical finished ################"

Ferret (Apptainer)

Modules and program start

An installation of PyFerret is provided in /sw/container/jupyterhub/ferret.sif

The container can be used on Jupyter HPC for usage inside of notebooks:

%load_ext ferretmagic
%%ferret

cancel data/all
use levitus_climatology
show data
%%ferret -q

shade temp[k=1]; go land

It is also possible to connect to the HPC cluster with X11 forwarding enabled (ssh -X) and start ferret directly:

Note

Note that the $WORK directory currently differs between partitions, for which you can find details here and in the cpu partitions table. In this example the large96s:shared partition is used.

srun --partition large96s:shared -c 1 --x11 --pty bash  -c "module load apptainer squashfuse; apptainer run --bind $WORK,$TMPDIR /sw/container/jupyterhub/ferret.sif" 

If you want to execute ferret with particular command line arguments you can also first start a shell inside the container:

srun --partition large96s:shared -c 1 --x11 --pty bash  -c "module load apptainer squashfuse; apptainer exec --bind $WORK,$TMPDIR /sw/container/jupyterhub/ferret.sif /bin/bash" 

Documentation

Ferret is an interactive computer visualization and analysis environment for oceanographers and meteorologists analyzing large and complex gridded data sets. A detailed description is found at the Ferret home page, where an online-manual for the latest version can be found.

Specific features related to pyferret, i.e., new graphic capabilities and the integration of ferret in python are described on the pyferret home page.

Special features of the installation

pyferret

Usage of the pyferret package with your own Python environment will be enabled after the installation of a stable Python version. The wrapper startup script ferret starts pyferret.

Special go-Files, additional data sets

All additional files are in the directory /sw/viz/ferret/ferret_iow.

Toolscripts (use go/help <script> for help)
| Purpose | Script(s) |
|---|---|
| Median filter | median_l, median_test.jnl |
| Baltic Sea landmask | modified fland |
| Baltic Sea coastlines | land_balt_100, land_balt_200 |
| West-African coastlines | land_south |
| comfortable aggregation | aggregate |

The median filter is written by E. D. Cokelet. There is a test-script median_test.jnl.

For models of the Baltic Sea, regional topographic data sets are compiled and formatted like the etopo data, but with 1 n.m. or 2 n.m. resolution. For this reason, go fland 2 or go fland 1 also works for the Baltic Sea area.

Specific palettes

The palettes adcplus.spk, adcpmin.spk and adcp.spk are suitable for visualising current fields oscillating around 0, for example with centered fill or shade levels.

Data set aggregation

Ferret makes it possible to aggregate data distributed over many files into one logical data set.

For convenience, the go-script aggregate can be used.

Foam-extend

foam-extend is a community-backed fork of the CFD software OpenFOAM and is developed by Wikki.

The following versions of foam-extend are available on Emmy via the unified GWDG Modules.

| foam-extend version | foam-extend module file | Requirements |
|---|---|---|
| v4.1 | foam-extend/4.1 | gcc/11.5.0 openmpi/4.1.7 |
| v4.1-debug | foam-extend/4.1-debug | gcc/11.5.0 openmpi/4.1.7 |
| v5.0 | foam-extend/5.0 | gcc/11.5.0 openmpi/4.1.7 |
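As with OpenFOAM, the modules listed in the Requirements column may need to be loaded first (a hedged sketch; the versions are taken from the table above):

module load gcc/11.5.0 openmpi/4.1.7
module load foam-extend/5.0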

Example Jobscripts

#!/bin/bash
#SBATCH -J TEST
#SBATCH -t 00:05:00
#SBATCH -N 2
#SBATCH --tasks-per-node 40
#SBATCH -p medium96s
 
#SBATCH -o OUTPUT
#SBATCH -e ERRORS
 
 
module load foam-extend/5.0
 
mpirun -np $SLURM_NTASKS rhoPimpleFoam -parallel -case ./ > run.log

Gaussian

License agreement

In order to use Gaussian you have to agree to the following conditions.

1. I am not a member of a research group developing software competitive to Gaussian.

2. I will not copy the Gaussian software, or make it available to anyone else.

3. I will properly acknowledge Gaussian Inc. in publications.

Please contact Support with a copy of the above statement and we will add you to the gaussian POSIX group which is required to be able to use the Gaussian installation.

Limitations

Only the latest version, Gaussian 16, is currently supported in our software stack.

“Linda parallelism”, i.e. cluster/network-parallel execution of Gaussian, is not supported on any of our systems. Only “shared-memory multiprocessor parallel execution” is supported, therefore no Gaussian job can use more than a single compute node.

Description

Gaussian 16 is the latest in the Gaussian series of programs. It provides state-of-the-art capabilities for electronic structure modeling.

QuickStart

Environment modules

The following versions have been installed: Modules for running on CPUs

| Version | Installation Path | modulefile |
|---|---|---|
| Gaussian 16 Rev. C.02 | /sw/chem/gaussian/16-C.02 | gaussian/16-C.02 |

Modules for running on GPUs


Please contact support if you need access to a GPU version.

GPU Job Performance

GPUs are effective for DFT calculations, both for the ground and excited states of larger molecules. However, they are not effective for smaller jobs or for post-SCF calculations such as MP2 or CCSD.

Job submissions

Besides your Gaussian input file you have to prepare a job script to define the compute resources for the job; both input file and job script have to be in the same directory.

Default runtime files (.rwf, .inp, .d2e, .int, .skr files) will be saved only temporarily in $LOCAL_TMPDIR on the compute node to which the job was scheduled. The files will be removed by the scheduler when a job is done.

If you wish to restart your calculations after a job has finished (successfully or not), please define a checkpoint file in your G16 input file (%Chk=path/name.chk).

CPU jobs

Because only the “shared-memory multiprocessor” parallel version is supported, your jobs can use only one node and at most 96 cores per node.

CPU job script example

#!/bin/bash
#SBATCH --time=12:00:00                # expected run time (hh:mm:ss)
#SBATCH --partition=standard96         # Compute Nodes with installed local SSD storage
#SBATCH --mem=16G                      # memory, roughly 2 times %mem defined in the input name.com file
#SBATCH --cpus-per-task=16             # No. of CPUs, same amount as defined by %nprocs in the filename.com input file
 
module load gaussian/16-C.02
 
g16 filename.com                       # g16 command, input: filename.com

GPU jobs

Because only the “shared-memory multiprocessor” parallel version is supported, your jobs can use only one node and up to 4 GPUs per node.

#!/bin/bash
#SBATCH --time=12:00:00                # expected run time (hh:mm:ss)
#SBATCH --partition=gpu-a100           # partition with A100 GPUs
#SBATCH --nodes=1                      # number of compute nodes
#SBATCH --mem=32G                      # memory, roughly 2 times %mem defined in the input name.com file
#SBATCH --ntasks=32                    # number of CPUs (including the control CPUs), same as defined by %CPU in the filename.com input file
#SBATCH --gres=gpu:4                   # number of GPUs, same as defined by %GPUCPU in the filename.com input file
 
module load cuda/12.6
module load gaussian/16-C.02
 
g16 filename.com                       # g16 command, input: filename.com

Specifying GPUs & Control CPUs for a Gaussian Job

The %GPUCPU Link 0 command specifies which GPUs and associated controlling CPUs to use for a calculation. This command takes one parameter:

%GPUCPU=gpu-list=control-cpus

For example, for 2 GPUs, a job which uses 2 control CPU cores would use the following Link 0 commands:

%CPU=0-1 #Control CPUs are included in this list.
%GPUCPU=0,1=0,1

Using 4 GPUs and 4 control CPU cores:

%CPU=0-3 #Control CPUs are included in this list.
%GPUCPU=0,1,2,3=0,1,2,3

Using 4 GPUs and a total of 32 CPU cores including 4 control CPU cores :

%CPU=0-31 #Control CPUs are included in this list.
%GPUCPU=0,1,2,3=0,1,2,3

Interactive jobs

Example for CPU calculations:

~ $ salloc -t 00:10:00 -p standard96:ssd -N1 --tasks-per-node 24
~ $ g16 filename.com

Example for GPU calculations:

~ $ salloc -t 00:10:00 -p gpu-a100 -N1 --ntasks=32
~ $ g16 filename.com

Restart calculations from checkpoint files

opt=restart

Molecular geometry optimization jobs can be restarted from a checkpoint file. All existing information (basis sets, wavefunction, and the molecular structures from the geometry optimization) can be read from the checkpoint file.

%chk=filename.chk
%mem=16GB
%nprocs=16
# method chkbasis guess=read geom=allcheck opt=restart


The same restarting can be done for vibrational frequency computations.

%chk=filename.chk
%mem=16GB
%nprocs=16
# restart

Input file examples

Example for CPU calculations:

%nprocshared=8
%mem=2GB
# opt freq hf/3-21g

water

0 1
 O                 -1.41509432    0.66037740    0.00000000
 H                 -0.45509432    0.66037740    0.00000000
 H                 -1.73554890    1.56531323    0.00000000

Example for GPU calculations:

%mem=60GB
%CPU=0-1
%gpucpu=0,1=0,1
# opt apfd/6-31g(d) 

Title Card Required

0 1
 H                 -1.62005626   -0.35225540   -2.17827284
 O                 -0.69040026   -0.26020540   -1.95721784
 C                  0.05363374   -1.42984340   -1.56739684
 H                  0.03664274   -2.15867240   -2.37771784
 H                 -0.39674326   -1.86734240   -0.67641084
 C                  1.49463474   -1.05242940   -1.26495084
 H                  1.95172974   -1.82638340   -0.64837984
 O                  2.25960874   -1.07605340   -2.50827584
 C                  2.52702374    0.24992160   -2.94034084
 H                  3.60463674    0.40494860   -2.99342284
 N                  2.08334474    0.41467260   -4.39529984
 C                  0.78450474    0.65742260   -4.75722384
 H                  0.01937274    0.74075160   -3.99957584
 C                  0.44489674    0.84873160   -6.03927884
 C                 -0.97416626    1.11515060   -6.44582584
 H                 -1.32766726    2.02397360   -5.95881584
 H                 -1.02551326    1.23965460   -7.52747384
 H                 -1.60197226    0.27569760   -6.14703084
 C                  1.43961274    0.80465860   -7.07956584
 O                  1.22246674    0.96788160   -8.27919384
 N                  2.71853974    0.55507960   -6.62518984
 H                  3.50692174    0.51228860   -7.37666884
 C                  3.09832674    0.35430460   -5.31375884
 O                  4.25975474    0.13920960   -5.00843984
 C                  1.71110074    0.35738160   -0.71134984
 H                  0.83321974    0.66365860   -0.14247484
 C                  1.81887774    1.20664860   -1.97943084
 H                  1.03067374    1.95953060   -1.98119884
 H                  2.79097274    1.69901160   -2.00606484
 O                  2.89563774    0.56735060    0.04309416
 H                  3.00903374    1.45926360    0.37959516

Software Testing

The outputs of the full run of the gaussian testsuite can be found at $g16root/test-results/.

Likwid

LIKWID - “Like I Knew What I’m Doing”

LIKWID is developed by Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) for Performance Optimization, Modeling, and Architecture Analysis. It offers Command-line and Software Library interfaces. It supports architectures such as x86 and ARM, as well as NVIDIA and AMD GPUs.

There is extensive documentation in LIKWID’s Wiki

Quick Start

The LIKWID toolset is available as a module, so before using LIKWID you need to load your preferred LIKWID version module to set up the environment correctly.

(base) gwdu101:25 17:17:18 ~ > module show likwid
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  /opt/sw/modules/21.12/cascadelake/Core/likwid/5.2.0.lua:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...

(base) gwdu101:25 17:17:24 ~ > 
(base) gwdu101:25 17:18:33 ~ > module load likwid
(base) gwdu101:25 17:18:43 ~ >

The following tasks can be performed by LIKWID:

Node architecture information

$ likwid-topology
$ likwid-powermeter

Examples of the node architecture of SCC’s amp016 node: terminal output of likwid-topology -h and likwid-topology.

Affinity control and data placement

$ likwid-pin
$ likwid-mpirun

Query and alter system settings

$ likwid-features
$ likwid-setFrequencies

Application performance profiling (perf-counter)

  • Using the available hardware counters to measure events that characterise the interaction between software and hardware
  • Uses a light-weight marker API for code instrumentation
$ likwid-perfctr

Micro-benchmarking - Application and framework to enable:

  • Quantify sustainable performance limits
  • Separate influences considering isolated instruction code snippets
  • Reverse-engineer processor features
  • Discover hardware bugs
$ likwid-bench
$ likwid-memsweeper

likwid-topology

  • Thread topology: How processor IDs map on physical compute resources
  • Cache topology: How processors share the cache hierarchy
  • Cache properties: Detailed information about all cache levels
  • NUMA topology: NUMA domains and memory sizes
  • GPU topology: GPU information

likwid-pin

  • Explicitly supports pthread and the OpenMP implementations of Intel and GNU gcc
  • Only works with dynamically linked applications that create threads via the pthread_create API call, using a static placement of threads.

likwid-perfctr

  • a lightweight command-line application to configure and read out hardware performance data
  • Can be used as a wrapper (no modification in the application) or by adding “Marker_API” functions inside the code
  • There are preconfigured performance groups with useful event sets and derived metrics
  • Since likwid-perfctr measures all events on the specified CPUs, processes and threads need to be pinned to dedicated resources.
  • This can be done by pinning the application manually or using the built-in functionality

Performance Groups

  • An outstanding feature of LIKWID
  • Organizes and combines micro-architecture events and counters with e.g. run-time and clock speed
  • Provides a set of derived metrics for efficient analysis
  • They are read on the fly without compilation by command-line selection
  • Are found in the path ${INSTALL_PREFIX}/share/likwid

Examples of using likwid-perfctr on SCC’s amp016 node

  • Use option -a to see the available performance groups: likwid-perfctr -a prints a table with the columns Group name and Description (one entry is L2: “L2 cache bandwidth in MBytes/s”, for example).
  • Use likwid-perfctr -g CLOCK to measure the clock speed, e.g. likwid-perfctr -C 2 -g CLOCK ./pchpc_tut_likwid.
  • Use likwid-perfctr -g FLOPS_DP to measure the arithmetic intensity in double precision, e.g. likwid-perfctr -C 2 -g FLOPS_DP ./pchpc_tut_likwid.
  • Use likwid-perfctr -g MEM to measure the bandwidth of main memory, e.g. likwid-perfctr -C 2 -g MEM ./pchpc_tut_likwid.

Marker API

  • Enables measurements of user-defined code regions.
  • The Marker API offers 6 functions (for C/C++) to measure named regions
  • Activated by adding “-DLIKWID_PERFMON” to the compiler call
LIKWID_MARKER_INIT               //global initialization
LIKWID_MARKER_THREADINIT         //individual thread initialization
LIKWID_MARKER_START("compute")   //Start a code region named "compute"
LIKWID_MARKER_STOP("compute")    //Stop the code region named "compute"
LIKWID_MARKER_SWITCH             //Switch performance group or event set in a round-robin fashion
LIKWID_MARKER_CLOSE              //global finalization
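A hedged sketch of compiling and running an instrumented code (file and region names are hypothetical; if the likwid module does not export the include and library paths, add the corresponding -I and -L flags):

module load likwid
gcc -O2 -fopenmp -DLIKWID_PERFMON myapp.c -o myapp -llikwid   # build with the Marker API enabled
likwid-perfctr -C 0-3 -g FLOPS_DP -m ./myapp                  # -m evaluates the instrumented regions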

Mathematica

Mathematica is a universal interactive computer algebra application system with an advanced graphical user interface.

License

On the HPC resources at GWDG there are four network licenses available for Mathematica.

One can start using Mathematica via the HPC-Desktop. Once in the desktop environment, open a terminal, start a job, load the Mathematica module with module load mathematica, and launch it with the binary math or mathematica. Then follow the instructions.

Execution via interactive SLURM job

First prepare the necessary environment with:

module load mathematica

Mathematica will then be scheduled by the batch system onto a free resource in the interactive queue. To do so, the following command to the batch system is necessary:

srun --x11 -c 24 -N 1 -p scc-cpu --pty bash

Note

The partition used depends on the account you have. Please check here to see the partitions that can be used

After a short time period you will get a Shell prompt, and can call Mathematica in the command line version with the command: math or in the X11 window version with the command mathematica.

The current version is printed as the first line of the output.

Note

Note that this example command does not set a time limit, so your job will have the default limit of 1 hour. After that time is up, your session will be killed by the batch system. Make sure to familiarize yourself with the srun command and its parameters and set a higher time limit by specifying the switch -t, see here
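For example, a session with a four-hour time limit could be requested like this (partition and core count as above; adjust them to your account):

srun --x11 -c 24 -N 1 -p scc-cpu -t 04:00:00 --pty bash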

Last modified: 2025-09-17 06:51:47

MATLAB

MATLAB is a universal interactive numerical application system with an advanced graphical user interface. MathWorks offers many online beginner tutorials (onramps) to get you started with MATLAB here. This also includes onramps for different, more dedicated topics:

  • MATLAB Onramp
  • Simulink Onramp
  • Machine Learning Onramp
  • Deep Learning Onramp
  • Image Processing Onramp

License

On the HPC resources at GWDG there are 5 network licenses available for Matlab. Also we have the following extensions: Simulink, Optimization Toolbox, Parallel Computing Toolbox and Statistics and Machine Learning Toolbox.

For users from the MPG we offer the flat-rate license from MPG, which covers all toolboxes including MATLAB Parallel Server (formerly Distributed Computing). However, you need to apply for access to the license by writing to hpc-support@gwdg.de. With this license you also have access to additional trainings.

Command

First prepare the necessary environment with:

module load matlab/R2020b

The use of Matlab must be scheduled by the batch system in the interactive queue onto an available node. Therefore the following command to the batch system is necessary:

srun --x11 -c 20 -N 1 -p int --pty bash

After a short time period you will get a shell prompt and you can call Matlab with the command:

matlab

The current version can be found on the main matlab screen under ‘Help - About Matlab’.

Note

This example command does not set a time limit, so your job will have the default limit of 1 hour. After that time is up, your session will be killed by the batch system. Make sure to familiarize yourself with the srun command and its parameters and set a higher time limit by specifying the switch -t, see here.

Warning

Due to installation issues, X11 forwarding via Slurm does not work correctly. For the time being, please use the following approach instead, which requires two SSH sessions simultaneously:

  1. Open a first SSH session on the login node:
    • Submit an interactive job from this first SSH session on the login node: srun --x11 -c 20 -N 1 -p int --pty bash
    • Check the Hostname: hostname
  2. Open a second SSH Session on the login node:
    • SSH into the computenode: ssh -Y {node}
    • Start matlab: matlab

Parallelization

The cluster currently provides only the Parallel Computing Toolbox for Matlab (without the Matlab Distributed Computing Server), which means that parallelization is limited to a single node. You can only use multiple processors of a single compute node.

Parallel Computing Toolbox provides following commands and structures for parallel programs:

  • parfor - parallel for loop
  • gpuArray - to work with GPU
  • parfeval
  • spmd
  • tall arrays
Info

MPG users can also use the MATLAB Parallel Server to parallelize your application across multiple nodes. Find all information necessary to get started here.

OpenFOAM

An object-oriented Computational Fluid Dynamics (CFD) toolkit

Description

OpenFOAM core is an extensible framework written in C++, i.e. it provides an abstraction for a programmer to build their own code for an underlying mathematical model.

Prerequisites

OpenFOAM is free, open-source software released under the GNU GPL license.

Modules

The following versions of OpenFOAM are available on Emmy via the unified GWDG Modules.

| OpenFOAM version | OpenFOAM module file | Requirements |
|---|---|---|
| v6 | openfoam-org/6 | gcc/11.5.0 openmpi/4.1.7 |
| v7 | openfoam-org/7 | gcc/11.5.0 openmpi/4.1.7 |
| v8 | openfoam-org/8 | gcc/11.5.0 openmpi/4.1.7 |
| v9 | openfoam-org/9 | gcc/11.5.0 openmpi/4.1.7 |
| v9 | openfoam-org/9 | gcc/14.2.0 openmpi/4.1.7 |
| v10 | openfoam-org/10 | gcc/11.5.0 openmpi/4.1.7 |
| v11 | openfoam-org/11 | gcc/14.2.0 openmpi/4.1.7 |
| v2306 | openfoam/v2306 | gcc/11.5.0 openmpi/4.1.7 |
| v2312 | openfoam/v2312 | gcc/11.5.0 openmpi/4.1.7 |

Modules mentioned in the requirement column need to be loaded, starting from gcc, in order to successfully load the OpenFOAM package.
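For example, to use OpenFOAM 11 (module versions taken from the table above):

module load gcc/14.2.0 openmpi/4.1.7
module load openfoam-org/11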

Example Jobscripts

The next examples are derived from https://develop.openfoam.com/committees/hpc/-/wikis/HPC-motorbike. The first jobscript runs the cavity tutorial on a single node; the second runs the motorbike case on two full nodes with collated file I/O. All required input/case files can be downloaded here: motorbike_with_parallel_slurm_script.tar.gz.

#!/bin/bash
#SBATCH --time 1:00:00
#SBATCH --nodes 1
#SBATCH --tasks-per-node 96
#SBATCH -p standard96:test
#SBATCH --job-name=test_job
#SBATCH --output=ol-%x.%j.out
#SBATCH --error=ol-%x.%j.err
  
export I_MPI_FALLBACK=0
export I_MPI_DEBUG=6
export I_MPI_FABRICS=shm:ofi
export I_MPI_OFI_PROVIDER=psm2
export I_MPI_PMI_LIBRARY=libpmi.so
  
module load openfoam-org/6
 
# initialize OpenFOAM environment
#---------------------
source $WM_PROJECT_DIR/etc/bashrc
source ${WM_PROJECT_DIR:?}/bin/tools/RunFunctions # provides fcts like runApplication
  
# set working directory
#---------------------
WORKDIR="$(pwd)"
 
# get and open example
#---------------------
cp -r $WM_PROJECT_DIR/tutorials/incompressible/icoFoam/cavity $WORKDIR/
cd cavity
  
# run script with several cases
#------------------------------
./Allrun
 
# run single case
#--------------------------
#cd cavity
#runApplication blockMesh
#icoFoam > icoFoam.log 2>&1

The second jobscript, for the motorbike case on two nodes:

#!/bin/bash
#SBATCH --time 1:00:00
#SBATCH --nodes 2
#SBATCH --tasks-per-node 96
#SBATCH --partition standard96
#SBATCH --job-name foam_test_job
#SBATCH --output ol-%x.%j.out
#SBATCH --error ol-%x.%j.err
 
module load openfoam/v2306-source
 
. $WM_PROJECT_DIR/etc/bashrc                # initialize OpenFOAM environment
. $WM_PROJECT_DIR/bin/tools/RunFunctions    # source OpenFOAM helper functions (wrappers)
 
tasks_per_node=${SLURM_TASKS_PER_NODE%\(*}
ntasks=$(($tasks_per_node*$SLURM_JOB_NUM_NODES))
foamDictionary -entry "numberOfSubdomains" -set "$ntasks" system/decomposeParDict # number of geometry fractions after decompositon will be number of tasks provided by slurm
 
date "+%T"
runApplication blockMesh                    # create coarse master mesh (here one block)
date "+%T"
 
runApplication decomposePar                 # decompose coarse master mesh over processors
mv log.decomposePar log.decomposePar_v0
date "+%T"
 
runParallel snappyHexMesh -overwrite        # parallel: refine mesh for each processor (slow if large np) matching surface geometry (of the motorbike)
date "+%T"
 
runApplication reconstructParMesh -constant # reconstruct fine master mesh 1/2 (super slow if large np)
runApplication reconstructPar -constant     # reconstruct fine master mesh 2/2
date "+%T"
 
rm -fr processor*                           # delete decomposed coarse master mesh
cp -r 0.org 0                               # provide start field
date "+%T"
 
runApplication decomposePar                 # parallel: decompose fine master mesh and start field over processors
date "+%T"
 
runParallel potentialFoam                   # parallel: run potentialFoam
date "+%T"
 
runParallel simpleFoam                     # parallel: run simpleFoam
date "+%T"

Some important advice when running OpenFOAM on a supercomputer

Typically, OpenFOAM causes a lot of metadata operations. This default behavior not only jams your own job but may also slow down the shared parallel file system (Lustre) for all other users. In addition, your job is interrupted if the inode limit (number of files) of the quota system is exceeded.

If you can not use our local SSDs at $LOCAL_TMPDIR with #SBATCH --constraint ssd, please refer to our general advice to Optimize IO Performance.

To adapt/optimize your OpenFOAM job specifically for I/O operations on $WORK (=Lustre) we strongly recommend the following steps:

  • Always use collated file I/O, so that not every processor writes its own files. This feature was released in 2017 for all OpenFOAM versions. [ESI www.openfoam.com/releases/openfoam-v1712/parallel.php] [Foundation www.openfoam.org/news/parallel-io]

    OptimisationSwitches
    {
        fileHandler collated; // all processors share a file
    }
  • Always, set

    runTimeModifiable false;
    

    to reduce I/O activity. Only set “true” (default), if it is strictly necessary to re-read dictionaries (controlDict, …) each time step.

  • Possibly, do not save every time step: [www.openfoam.com/documentation/guides/latest/doc/guide-case-system-controldict.html] [www.cfd.direct/openfoam/user-guide/v6-controldict]

    writeControl    timeStep;
    writeInterval   100;
    
  • Possibly, save only the latest n time steps (overwrite older ones), such as:

    purgeWrite  1000;
    
  • Typically, only a subset of variables is needed frequently (post-processing). The full set of variables can be saved less frequently (e.g., restart purposes). This can be achieved with [https://wiki.bwhpc.de/e/OpenFoam]:

    writeControl    clockTime;
    writeInterval   21600; // write ALL variables every 21600 seconds = 6 h
    
    functions
    {
        writeFields
        {
            type writeObjects;
            libs ("libutilityFunctionObjects.so");
    
            objects
            (
            T U // specified variables
            );
    
            outputControl timeStep;
            writeInterval 100; // write specified variables every 100 steps
        }
    }
    
  • In case your job accidentally generated thousands of small files, please pack them (at least the small-size metadata files) into a single file afterwards:

    tar -cvzf singlefile.tar.gz /folder/subfolder/location/
    

ParaView

An interactive data analysis and visualisation tool with 3D rendering capability

Warning

We provide ParaView in two different flavours. Directly on the cluster we only support pvserver (see the manual below). A GUI version can be used with the HPC Desktop; the respective module has the suffix “-gui”.

Description

ParaView is an open-source, multi-platform data analysis and visualization application with interactive 3D and programmatic batch processing capabilities.

Read more on ParaView home page. For a manual visit the ParaView Guide page.

Modules

| ParaView version | ParaView module file | Requirements | Island |
|---|---|---|---|
| 5.11.2 | paraview/5.11.2 | gcc/11.5.0 openmpi/4.1.7 | Emmy |
| 5.13.2 | paraview/5.13.2 | gcc/14.2.0 openmpi/4.1.7 | Emmy |
| 5.13.2 | paraview/5.13.2-gui | gcc/14.2.0 openmpi/4.1.7 | Emmy |
| 5.13.2 | paraview/5.13.2 | gcc/13.2.0 openmpi/5.0.7 | Grete |
| 5.13.2 | paraview/5.13.2-gui | gcc/13.2.0 openmpi/5.0.7 | Grete |

Example Use

Tutorial: ParaView with gwdg-lmod

The appropriate login nodes for this phase are glogin-p2.hpc.gwdg.de.

  1. On the cluster: Start interactive job:

    srun --partition=standard96 --nodes=1 --ntasks-per-node=96 --pty bash
  2. On the cluster: Load prerequisite modules

    module load gcc/14.2.0
    module load openmpi/4.1.7
    module load paraview/5.13.2
  3. On the cluster: Start ParaView-Server on your compute node gcn####

    mpirun -n $SLURM_TASKS_PER_NODE pvserver

    Wait a few seconds till your ParaView-Server provides a connection, typically with the port number 11111.

  4. On your local computer: Start an ssh tunnel on your preferred terminal to access the port of the compute node via the respective login node of the NHR-NORD@Göttingen

    ssh -N -L 11111:gcn####:11111 <user>@glogin-p2.hpc.gwdg.de
    

    If this doesn’t work, you might need to try with a host jump option:

    ssh -N -L 11111:localhost:11111 -J glogin-p2.hpc.gwdg.de -l <user> gcn####
    

    Leave this terminal window open to keep the tunnel running. Before setting up an ssh tunnel, check if your standard ssh login works. If you are a windows user without a proper terminal we recommend MobaXterm.

  5. On your local computer: Start your ParaView client GUI and access your ParaView-Server at

    localhost:11111
    


The appropriate login nodes for this phase are glogin-p3.hpc.gwdg.de.

  1. On the cluster: Start interactive job:

    srun --partition=standard96s --nodes=1 --ntasks-per-node=96 --pty bash
  2. On the cluster: Load prerequisite modules

    module load gcc/14.2.0
    module load openmpi/4.1.7
    module load paraview/5.13.2
  3. On the cluster: Start ParaView-Server on your compute node c####

    mpirun -n $SLURM_TASKS_PER_NODE pvserver

    Wait a few seconds till your ParaView-Server provides a connection, typically with the port number 11111.

  4. On your local computer: Start an ssh tunnel on your preferred terminal to access the port of the compute node via the respective login node of the NHR-NORD@Göttingen

    ssh -N -L 11111:c####:11111 <user>@glogin-p3.hpc.gwdg.de
    

    If this doesn’t work, you might need to try with a host jump option:

    ssh -N -L 11111:localhost:11111 -J glogin-p3.hpc.gwdg.de -l <user> c####
    

    Leave this terminal window open to keep the tunnel running. Before setting up an ssh tunnel, check if your standard ssh login works. If you are a windows user without a proper terminal we recommend MobaXterm.

  5. On your local computer: Start your ParaView client GUI and access your ParaView-Server at

    localhost:11111
    


The appropriate login nodes for this phase are glogin-gpu.hpc.gwdg.de.

  1. On the cluster: Start interactive job:

    srun --partition=grete --nodes=1 --ntasks-per-node=96 --pty bash
  2. On the cluster: Load prerequisite modules

    module load gcc/13.2.0
    module load openmpi/5.0.7
    module load paraview/5.13.2
  3. On the cluster: Start ParaView-Server on your compute node c####

    mpirun -n $SLURM_TASKS_PER_NODE pvserver

    Wait a few seconds till your ParaView-Server provides a connection, typically with the port number 11111.

  4. On your local computer: Start an ssh tunnel on your preferred terminal to access the port of the compute node via the respective login node of the NHR-NORD@Göttingen

    ssh -N -L 11111:c####:11111 <user>@glogin-gpu.hpc.gwdg.de
    

    If this doesn’t work, you might need to try with a host jump option:

    ssh -N -L 11111:localhost:11111 -J glogin-gpu.hpc.gwdg.de -l <user> c####
    

    Leave this terminal window open to keep the tunnel running. Before setting up an ssh tunnel, check if your standard ssh login works. If you are a windows user without a proper terminal we recommend MobaXterm.

  5. On your local computer: Start your ParaView client GUI and access your ParaView-Server at

    localhost:11111
    


Please note: the version of your local ParaView client has to be the same as the remote ParaView-Server. You can download various ParaView versions here. You can run module list to see the version of the loaded paraview module.

preCICE

Description

preCICE is an open-source software library for coupling different physics and simulation codes, particularly in the context of multiphysics simulations. It allows users to combine different simulation codes, such as finite element, finite volume, or lattice Boltzmann methods, to solve complex problems that involve multiple physical phenomena at different scales. Examples of software available on Emmy that can be coupled are OpenFOAM and foam-extend.

Preparation

preCICE is provided on Emmy as a module and can be loaded via:

module load gcc/11.5.0 openmpi
module load precice 

The coupling with different simulation codes is achieved by using dedicated adapters. Currently it is required to build the adapters on the cluster since they are not provided as loadable modules. Links to the instructions of the respective adapter can be found on the website of the preCICE project.

More

Last modified: 2025-09-17 06:51:47

Tensorflow

TensorFlow is an open-source machine learning framework mainly developed by Google. It can be used for various machine learning tasks, e.g. deep learning. TensorFlow provides a high-level API in Python and other languages, and it can run on CPUs as well as GPUs.

Installing TensorFlow

It is recommended to use miniforge to create a virtual python environment and install the desired version of tensorflow within that environment.

module load miniforge3
conda create -n myenv python=3.12
source activate myenv
python3 -m pip install 'tensorflow[and-cuda]'

If you do not want to use GPUs simply use python3 -m pip install tensorflow-cpu

Testing the installation

To run TensorFlow on GPUs, load the correct modules and submit a job to the gpu partition.

#!/bin/bash
#SBATCH -p scc-gpu
#SBATCH -t 1
#SBATCH --gpus-per-node 1
 
module load miniforge3

source activate myenv
 
python tftest.py

The test script tftest.py contains:

import tensorflow as tf
print("TensorFlow version:", tf.__version__)

for x in tf.config.list_physical_devices('GPU'):
    print(tf.config.experimental.get_device_details(x))

And then submit the job using Slurm:

sbatch jobscript.sh

The output file (slurm-{jobid}.out ) should contain:

TensorFlow version: 2.19.0
{'compute_capability': (7, 0), 'device_name': 'Tesla V100S-PCIE-32GB'}

and also information about the GPUs selected.

Testing CPU only installation

If you want to test a CPU only installation, you can just run the tftest.py on a login node.

Using Tensorflow

You can now use TensorFlow in your python scripts. Please read gpu_selection for more information about GPU usage.

TURBOMOLE

TURBOMOLE is a computational chemistry program that implements various quantum chemistry methods (ab initio methods). It was initially developed at the University of Karlsruhe.

Description

TURBOMOLE features all standard methods as well as DFT code for molecules and solids, excited states and spectra using DFT or Coupled Cluster methods. Some of the programs can be used with MPI parallelisation.

Read more about it on the developer’s homepage.

An overview of the documentation can be found here.

The vendor also provides a list of utilities.

Prerequisites

Only members of the tmol user group can use the TURBOMOLE software. To have their user ID included in this group, users can send a message to their consultant or to NHR support.

Modules

Check the module listed under either Emmy Core modules or under the Grete Core modules.

Usage

Load the necessary modules. TURBOMOLE has two execution modes. By default it uses the SMP version (single node), but it can also run with MPI across multiple nodes of the cluster. To run the MPI version, the variable PARA_ARCH needs to be set to MPI. If it is empty, does not exist, or is set to SMP, the SMP version is used.

Example for the MPI version

export PARA_ARCH=MPI
module load turbomole/7.8.1

TmoleX GUI

TmoleX is a GUI for TURBOMOLE that allows users to build a workflow. It also aids in the building of the initial structure and visualization of results.

To run the TmoleX GUI, you must connect using X11 forwarding (ssh -Y …).

module load turbomole/tmolex
TmoleX22

Alternatively, you can use our HPC Desktops via JupyterHub.

Job Script Examples

Note that some calculations run only in a certain execution mode; please consult the manual. Here all execution modes are listed.

  1. Serial version: the calculations run serially and only on one node.
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 1
#SBATCH --mem-per-cpu=1.5G
 
module load turbomole
 
jobex -ri -c 300 > result.out
  2. SMP version: it can only run on one node. Use one node and all of its CPUs:
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 1
#SBATCH --cpus-per-task=96
 
export PARA_ARCH=SMP
module load turbomole
 
export PARNODES=$SLURM_CPUS_PER_TASK
 
jobex -ri -c 300 > result.out
  3. MPI version: the MPI binaries have a _mpi suffix. To use the same binary names as the SMP version, the PATH is extended with TURBODIR/mpirun_scripts/, which symlinks the binaries to the _mpi binaries. Here we run on 8 nodes, using all 96 cores per node:
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 8
#SBATCH --tasks-per-node=96
 
export SLURM_CPU_BIND=none  
export PARA_ARCH=MPI
module load turbomole
  
export PATH=$TURBODIR/mpirun_scripts/`sysname`/IMPI/bin:$TURBODIR/bin/`sysname`:$PATH
export PARNODES=${SLURM_NTASKS}
 
jobex -ri -c 300 > result.out
  4. OpenMP version: here we need to set the OMP_NUM_THREADS variable. Again, it uses 8 nodes with 96 cores each. We use the standard binaries with OpenMP; do not use the _mpi binaries. If OMP_NUM_THREADS is set, the OpenMP version is used.
#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 8
#SBATCH --tasks-per-node=96
 
export SLURM_CPU_BIND=none  
export PARA_ARCH=MPI
module load turbomole
 
export PARNODES=${SLURM_NTASKS}
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
 
jobex -ri -c 300 > result.out

Compilers and Interpreters

Many different compilers, interpreters, and environments are provided, which are organized by programming language below. Additional compilers, interpreters, and environments for some of these languages and other languages can be installed via Spack.

Note

Loading and unloading modules for C, C++, and Fortran compilers can change which other modules are visible in all Lmod-based software stacks (e.g. GWDG Modules (gwdg-lmod)). Specifically, modules for packages built by or associated with a compiler aren’t visible until the compiler module is loaded. For example, the intel-oneapi-mpi module is not visible in the GWDG Modules (gwdg-lmod) software stack until after the intel-oneapi-compilers module is loaded.

Note

Our module system refrains from using LD_LIBRARY_PATH (using it is risky) in favor of LD_RUN_PATH. This means the Intel compilers require using the flags -Wl,-rpath -Wl,$LD_RUN_PATH or the dynamic linker will not find the libraries. Without the flags, the compiler will throw errors such as this one:

error while loading shared libraries: libxxx.so.yyy: cannot open shared object file: No such file or directory

C

Info

Build tools might require that you set the environmental variable CC to the compiler program (e.g. export CC=icx) if you aren’t using GCC (gcc).

C++

Info

Build tools might require that you set the environmental variable CXX to the compiler program (e.g. export CXX=icx) if you aren’t using GCC (g++).

Fortran

Info

Build tools might require that you set the environmental variables F77 and FC to the compiler program (e.g. export F77=ifx FC=ifx) if you aren’t using GCC (gfortran).
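For example, to point a build system at the Intel OneAPI compilers (a hedged sketch; the project and build tool are hypothetical, the compiler names are from the Intel Compilers table below):

export CC=icx CXX=icpx F77=ifx FC=ifx
cmake -S . -B build && cmake --build build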

Go

Julia

MATLAB

Python

  • Python, reference implementation, package and project managers

R

  • R (r), reference implementation

Rust

Subsections of Compilers and Interpreters

AMD Optimizing Compilers (AOCC)

The AMD Optimizing Compilers (AOCC) are a customized build of LLVM meant for optimizing code for AMD Zen series processors. The compiler program names are the same as for LLVM since they are just a customized build. The compilers are intentionally quite compatible with the GCC compilers: they support many of the same extensions beyond the official language standards, have similar compiler options, and it is often possible to mix C++ code compiled by both.

Warning

While the compilers can compile code that might run on the very latest Intel CPUs, the code will likely perform poorly. We strongly recommend that you use a different set of compilers for nodes with Intel CPUs such as GCC or LLVM (Clang).

In all software stacks, the module name is aocc. To load a specific version, run

module load aocc/VERSION

To load the default version, run

module load aocc

Languages

The supported languages and the names of their compiler programs are in the table below.

| Language | Compiler Program |
|---|---|
| C | clang |
| C++ | clang++ |
| Fortran | flang |

OpenMP

Clang supports the OpenMP (Open Multi-Processing) extension for C, C++, and Fortran. Enable it by adding the -fopenmp option to the compiler. Additionally, the -fopenmp-simd option enables using the OpenMP simd directive.

Targeting CPU Architecture

By default, the Clang compilers will compile code targeting the Zen 1 (znver1). As all AMD processors in the cluster are at least Zen 2, this results in sub-optimal code. The compilers use the following options to control the target architecture:

| Compiler Option | Default Value | Meaning |
|---|---|---|
| -march=ARCH | znver1 | Generate instructions targeting ARCH |
| -mtune=ARCH | znver1 | Tune performance for ARCH but don’t generate instructions not supported by -march=ARCH |
| -mcpu=ARCH | | Alias for -march=ARCH -mtune=ARCH |

The ARCH values for the different CPU architectures (Spack naming) we provide are

| Architecture/Target (Spack naming) | ARCH value for the compilers |
|---|---|
| zen2 | znver2 |
| zen3 | znver3 |
| skylake_avx512 | IMPOSSIBLE |
| cascadelake | IMPOSSIBLE |
| sapphirerapids | Not recommended, but if you must, use znver1 and see warning below |
Warning

Even the oldest AMD Zen processors (znver1) have instructions not supported by sapphirerapids, although code compiled for znver1 might still run on these recent Intel processors. But the code, if it runs, will perform poorly. We strongly recommend that you use a different set of compilers for nodes with Intel CPUs, such as GCC or LLVM (Clang).
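A hedged example of compiling an OpenMP code with AOCC for the Zen 3 nodes (the source file name is hypothetical):

module load aocc
clang -O3 -fopenmp -march=znver3 stream.c -o stream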

GCC

The GNU Compiler Collection (GCC) is a popular general-purpose collection of compilers for several different languages on many architectures. In all software stacks, the module name is gcc and there is often many versions. To load a specific version, run

module load gcc/VERSION

To load the default version, run

module load gcc

Languages

The supported languages and the names of their compiler programs are in the table below.

| Language | Compiler Program |
|---|---|
| C | gcc |
| C++ | g++ |
| Fortran | gfortran |

OpenMP

GCC supports the OpenMP (Open Multi-Processing) extension for C, C++, and Fortran. Enable it by adding the -fopenmp option to the compiler. Additionally, the -fopenmp-simd option enables using the OpenMP simd directive.

Targeting CPU Architecture

By default, the GCC compilers will compile code targeting the generic version of the CPU the compilers are run on. On an x86-64 node, this means being compatible with the original 64-bit AMD and Intel processors from 2003 and thus no AVX/AVX2/AVX-512 without SSE fallback. The compilers use the following options to control the target architecture:

| Compiler Option | Default Value | Meaning |
|---|---|---|
| -march=ARCH | generic | Generate instructions targeting ARCH |
| -mtune=ARCH | generic | Tune performance for ARCH but don’t generate instructions not supported by -march=ARCH |
| -mcpu=ARCH | | Alias for -march=ARCH -mtune=ARCH |

The ARCH values for the different CPU architectures (Spack naming) we provide are

| Architecture/Target (Spack naming) | ARCH value for GCC |
|---|---|
| Most generic version of the node the compiler is running on | generic |
| The CPU of the node the compiler is running on | native |
| skylake_avx512 | skylake-avx512 |
| cascadelake | cascadelake (use skylake-avx512 for GCC 8) |
| sapphirerapids | sapphirerapids (use icelake-client for GCC 10 and older) |
| zen2 | znver2 (use znver1 for GCC 8) |
| zen3 | znver3 (use znver1 for GCC 8 and znver2 for GCC 9) |
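A hedged example of compiling an OpenMP code with GCC for the Cascade Lake nodes (the source file name is hypothetical):

module load gcc
gcc -O3 -fopenmp -march=cascadelake example.c -o example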

Go

Go (go) is the reference and most popular Go compiler. The compiler program’s name is simply go. In all software stacks, the module name is go. To load a specific version, run

module load go/VERSION

To load the default version, run

module load go

Targeting CPU Architecture

By default, the go will compile code targeting some generic version of the CPU the compiler is run on. The compiler can be forced to compile for a different architecture by setting the GOARCH environmental variable like

export GOARCH=ARCH

At the moment, all CPU architectures in the clusters use the same value of GOARCH, which is amd64. This would change in the future if any other architectures (e.g. ARM or RISC-V) are added to the clusters.

Intel Compilers

The Intel Compilers are a set of compilers popular in HPC and optimized for Intel processors. There are two families of the Intel Compilers, the classic compilers and the OneAPI compilers. The module name is not the same in all software stacks; the module names can be found in the table below.

| Version | GWDG Modules (gwdg-lmod) | NHR Modules (nhr-lmod) | SCC Modules (scc-lmod) | HLRN Modules (hlrn-tmod) |
|---|---|---|---|---|
| OneAPI | intel-oneapi-compilers | intel-oneapi-compilers | intel-oneapi-compilers | intel/2022 and newer |
| classic | intel-oneapi-compilers-classic | | intel | intel/19.x.y and older |
Info

Sometimes the internal LLVM components of the Intel OneAPI compilers, like llvm-ar or llvm-profgen, are required, e.g. for Interprocedural Optimization (IPO). These can be loaded via the following additional command in the GWDG Modules (gwdg-lmod):

module load intel-oneapi-compilers-llvm
Info

The older software stacks NHR Modules (nhr-lmod) and SCC Modules (scc-lmod) also have the latest version of the classic compilers in their intel-oneapi-compilers module (these were removed in the 2024/2025 versions of OneAPI).

To load a specific version, run

module load NAME/VERSION

where NAME comes from the table above. To load the default version, run

module load NAME

Languages

The supported languages and the names of their compiler programs are in the table below.

| Language | OneAPI Program | Classic Program |
|---|---|---|
| C | icx | icc |
| C++ | icpx | icpc |
| Fortran | ifx | ifort |

OpenMP

The Intel Compilers support the OpenMP (Open Multi-Processing) extension for C, C++, and Fortran. Enable it by adding the -qopenmp option to the compiler. Additionally, the -qopenmp-simd option enables using the OpenMP simd directive.

Targeting CPU Architecture

By default, the Intel Compilers will compile code targeting the generic version of the CPU the compilers are run on. On an x86-64 node, this means being compatible with the very earliest 64-bit Intel processors from 2003 and thus no AVX/AVX2/AVX-512 without SSE fallback. The compilers use the following options to control the target architecture:

| Compiler Option | Default Value | Meaning |
|---|---|---|
| -march=ARCH | generic | Generate instructions targeting ARCH |
| -mtune=ARCH | generic | Tune performance for ARCH but don’t generate instructions not supported by -march=ARCH |

The ARCH values for the different CPU architectures (Spack naming) we provide are

| Architecture/Target (Spack naming) | ARCH value for Intel Compilers |
|---|---|
| Generic x86-64 Intel CPU | off |
| skylake_avx512 | skylake-avx512 (use core-avx2 for classic before 19) |
| cascadelake | cascadelake (use skylake-avx512 for classic 18 and core-avx2 for classic before 18) |
| sapphirerapids | sapphirerapids (use icelake-client for classic 18 and core-avx2 for classic before 18) |
| zen2 | Not recommended, but if you must, use core-avx2 and see warning below |
| zen3 | Not recommended, but if you must, use core-avx2 and see warning below |
Warning

While the compilers can compile code that will run on AMD CPUs, the code performs poorly. We strongly recommend that you use a different set of compilers for nodes with AMD CPUs such as GCC or LLVM (Clang).
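A hedged example of compiling an OpenMP code with the OneAPI compilers for the Cascade Lake nodes (the source file name is hypothetical; the rpath flags are needed as explained in the note at the top of this chapter):

module load intel-oneapi-compilers
icx -O3 -qopenmp -march=cascadelake -Wl,-rpath -Wl,"$LD_RUN_PATH" example.c -o example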

Julia

Julia (julia) is the reference and so far only interpreter/compiler/environment. The interpreter program’s name is simply julia. In all software stacks, the module name is julia. To load a specific version, run

module load julia/VERSION

To load the default version, run

module load julia

LLVM (Clang, etc.)

LLVM is a collection of compiler and toolchain tools and libraries, which includes its own compiler Clang (clang and clang++) and is the foundation on which many other compilers are built. This page has information only on the compilers that are officially part of LLVM, not 3rd-party compilers built on top of it. The Clang C and C++ compilers are intentionally quite compatible with the GCC compilers: they support many of the same extensions beyond the official language standards, have similar compiler options, and it is often possible to mix C++ code compiled by both. In all software stacks, the module name is llvm. To load a specific version, run

module load llvm/VERSION

To load the default version, run

module load llvm

Languages

The supported languages and the names of their compiler programs are in the table below.

| Language | Compiler Program |
|---|---|
| C | clang |
| C++ | clang++ |

OpenMP

Clang supports the OpenMP (Open Multi-Processing) extension for C and C++. Enable it by adding the -fopenmp option to the compiler. Additionally, the -fopenmp-simd option enables using the OpenMP simd directive.

Targeting CPU Architecture

By default, the Clang compilers will compile code targeting the generic version of the CPU the compilers are run on. On an x86-64 node, this means being compatible with the original 64-bit AMD and Intel processors from 2003 and thus no AVX/AVX2/AVX-512 without SSE fallback. The compilers use the following options to control the target architecture:

| Compiler Option | Default Value | Meaning |
|---|---|---|
| -march=ARCH | generic | Generate instructions targeting ARCH |
| -mtune=ARCH | generic | Tune performance for ARCH but don’t generate instructions not supported by -march=ARCH |
| -mcpu=ARCH | | Alias for -march=ARCH -mtune=ARCH |

The ARCH values for the different CPU architectures (Spack naming) we provide are

| Architecture/Target (Spack naming) | ARCH value for the compilers |
|---|---|
| Most generic version of the node the compiler is running on | generic |
| The CPU of the node the compiler is running on | native (not available before LLVM 16) |
| skylake_avx512 | skylake-avx512 |
| cascadelake | cascadelake |
| sapphirerapids | sapphirerapids (use icelake-client for LLVM 11 and older) |
| zen2 | znver2 |
| zen3 | znver3 (use znver2 for LLVM 11 and older) |

MATLAB PARALLEL SERVER ON THE SCC

This document provides the steps to configure MATLAB to submit jobs to a cluster, retrieve results, and debug errors.

Warning

The procedure in here only applies to MPG users. University of Göttingen users have NO access to the MATLAB Parallel Server.

INSTALLATION and CONFIGURATION – MATLAB client on the desktop

The SCC MATLAB support package can be found here. Download the appropriate archive file and start MATLAB. The archive file should be untarred/unzipped in the location returned by calling

>> userpath

Configure MATLAB to run parallel jobs on your cluster by calling configCluster. configCluster only needs to be called once per version of MATLAB.

>> % Create new HPC profile
>> configCluster
Username on SCC (e.g. jdoe): USER-ID

	Must set WallTime before submitting jobs to SCC.  E.g.

	>> c = parcluster;
	>> % 5 hour, 30 minute walltime
	>> c.AdditionalProperties.WallTime = '05:30:00';
	>> c.saveProfile

	NOTICE: Connecting to SCC requires an SSH keypair.
	You will be prompted for your private key upon
	submitting your first job.

Complete. Default cluster profile set to "SCC R2022b"
>>

Jobs will now default to the cluster rather than submit to the local machine.

Note

If you would like to submit to the local machine instead then run the following command:

>> % Get a handle to the local resources
>> c = parcluster('local');

CONFIGURING JOBS

Prior to submitting the job, we can specify various parameters to pass to our jobs, such as queue, e-mail, walltime, etc. The following is a partial list of parameters. See “AdditionalProperties” for the complete list. None of these are required.

>> % Get a handle to the cluster
>> c = parcluster('SCC R2022b');


	Must set WallTime before submitting jobs to SCC.  E.g.

	>> c = parcluster;
	>> % 5 hour, 30 minute walltime
	>> c.AdditionalProperties.WallTime = '05:30:00';
	>> c.saveProfile

	NOTICE: Connecting to SCC requires an SSH keypair.
	You will be prompted for your private key upon
	submitting your first job.

Complete cluster profile setup.
Input path to $HOME on SCC (e.g. /user/first.last/u#####): /user/damian/pietrus/u12352
Configuration complete. 

>> % Specify the wall time (e.g., 5 hours)
>> c.AdditionalProperties.WallTime = '05:00:00';
>> c.AdditionalProperties.Partition = 'medium';
>> c.saveProfile

Save changes after modifying “AdditionalProperties” for the above changes to persist between MATLAB sessions.

>> c.saveProfile

To see the values of the current configuration options, display “AdditionalProperties”.

>> % To view current properties
>> c.AdditionalProperties

Unset a value when no longer needed.

>> % Turn off email notifications 
>> c.AdditionalProperties.EmailAddress = '';
>> c.saveProfile

The created profile can be found on the MATLAB Client:

Created SCC profile in MATLAB GUI

INTERACTIVE JOBS - MATLAB client on the cluster

To run an interactive pool job on the cluster, continue to use “parpool” as you’ve done before.

>> % Get a handle to the cluster
>> c = parcluster;

>> % Open a pool of 64 workers on the cluster
>> pool = c.parpool(64);

Rather than running on the local machine, the pool can now run across multiple nodes on the cluster.

>> % Run a parfor over 1000 iterations
>> parfor idx = 1:1000
      a(idx) = …
   end

Once we’re done with the pool, delete it.

>> % Delete the pool
>> pool.delete

INDEPENDENT BATCH JOB

Use the batch command to submit asynchronous jobs to the cluster. The batch command will return a job object which is used to access the output of the submitted job. See the MATLAB documentation for more help on batch.

>> % Get a handle to the cluster
>> c = parcluster("SCC R2022b");

>> % Submit job to query where MATLAB is running on the cluster
>> job = c.batch(@pwd, 1, {}, 'CurrentFolder','.');

>> % Query job for state
>> job.State

>> % If state is finished, fetch the results
>> job.fetchOutputs{:}

>> % Delete the job after results are no longer needed
>> job.delete
Note

In case this is the FIRST time you submit a job to the cluster the following window will appear to select the SSH key file for the connection:

Select SSH key file

To retrieve a list of currently running or completed jobs, call “parcluster” to retrieve the cluster object. The cluster object stores an array of jobs that were run, are running, or are queued to run. This allows us to fetch the results of completed jobs. Retrieve and view the list of jobs as shown below.

>> c = parcluster;
>> jobs = c.Jobs;

Once we’ve identified the job we want, we can retrieve the results as we’ve done previously. “fetchOutputs” is used to retrieve function output arguments; if calling batch with a script, use load instead. Data that has been written to files on the cluster needs to be retrieved directly from the file system (e.g. via ftp). To view the results of a previously completed job:

>> % Get a handle to the job with ID 2
>> job2 = c.Jobs(2);
Note

You can view a list of your jobs, as well as their IDs, using the above c.Jobs command.

>> % Fetch results for job with ID 2
>> job2.fetchOutputs{:}

PARALLEL BATCH JOB

Users can also submit parallel workflows with the batch command. Let’s use the following example for a parallel job, which is saved as “parallel_example.m”.

function [t, A] = parallel_example(iter)
 
if nargin==0
    iter = 8;
end
 
disp('Start sim')
 
t0 = tic;
parfor idx = 1:iter
    A(idx) = idx;
    pause(2)
    idx
end
t = toc(t0);
 
disp('Sim completed')
 
save RESULTS A
 
end

This time when we use the batch command, to run a parallel job, we’ll also specify a MATLAB Pool.

>> % Get a handle to the cluster
>> c = parcluster;

>> % Submit a batch pool job using 4 workers for 16 simulations
>> job = c.batch(@parallel_example, 1, {16}, 'Pool',4, ...
       'CurrentFolder','.');

>> % View current job status
>> job.State

>> % Fetch the results after a finished state is retrieved
>> job.fetchOutputs{:}
ans = 
	8.8872

The job ran in 8.89 seconds using four workers. Note that these jobs will always request N+1 CPU cores, since one worker is required to manage the batch job and pool of workers. For example, a job that needs eight workers will consume nine CPU cores.

We’ll run the same simulation but increase the Pool size. This time, to retrieve the results later, we’ll keep track of the job ID.

Note

For some applications, there will be a diminishing return when allocating too many workers, as the overhead may exceed computation time.

>> % Get a handle to the cluster
>> c = parcluster;

>> % Submit a batch pool job using 8 workers for 16 simulations
>> job = c.batch(@parallel_example, 1, {16}, 'Pool', 8, ...
       'CurrentFolder','.');

>> % Get the job ID
>> id = job.ID
id =
	4
>> % Clear job from workspace (as though we quit MATLAB)
>> clear job

Once we have a handle to the cluster, we’ll call the “findJob” method to search for the job with the specified job ID.

>> % Get a handle to the cluster
>> c = parcluster;


>> % Find the old job
>> job = c.findJob('ID', 4);

>> % Retrieve the state of the job
>> job.State
ans = 
finished
>> % Fetch the results
>> job.fetchOutputs{:};
ans = 
4.7270

The job now runs in 4.73 seconds using eight workers. Run the code with different numbers of workers to determine the ideal number to use. Alternatively, to retrieve job results via a graphical user interface, use the Job Monitor (Parallel > Monitor Jobs).

Parallel Button in MATLAB

DEBUGGING

If a serial job produces an error, call the “getDebugLog” method to view the error log file. When submitting independent jobs, with multiple tasks, specify the task number.

>> c.getDebugLog(job.Tasks(3))

For Pool jobs, only specify the job object.

>> c.getDebugLog(job)

When troubleshooting a job, the cluster admin may request the scheduler ID of the job. This can be derived by calling schedID

>> schedID(job)
ans = 
25539

HELPER FUNCTIONS

| Function | Description | Desktop-only |
|---|---|---|
| clusterFeatures | List of scheduler features/constraints | |
| clusterGpuCards | List of cluster GPU cards | |
| clusterQueueNames | List of scheduler queue names | |
| disableArchiving | Modify file archiving to resolve file mirroring issue | true |
| fixConnection | Reestablish cluster connection | true |
| willRun | Explain why a job is not running | |

TO LEARN MORE

To learn more about the MATLAB Parallel Computing Toolbox, check out these resources:

Nvidia HPC Compilers

The Nvidia HPC Compilers are the successors to the PGI compilers and have good CUDA support (see the official compiler documentation). In all software stacks, the module name is nvhpc; the HLRN Modules (hlrn-tmod) software stack additionally provides nvhpc-hpcx, which includes the Nvidia HPC SDK OpenMPI that supports jobs across more than one node. To load a specific version, run

module load nvhpc/VERSION

To load the default version, run

module load nvhpc
Info

The nvhpc module (and similarly the nvhpc-hpcx module) either has CUDA built in or loads the respective cuda module, so you don’t need to load the cuda module separately. If it did not load a cuda module, you can load one yourself and target a different CUDA version using the -gpu=cudaX.Y option to target CUDA X.Y.

Languages

The supported languages and the names of their compiler programs (and PGI compiler aliases) are in the table below.

| Language | Compiler Program | PGI Compiler Alias |
|---|---|---|
| C | nvc | pgcc |
| C++ | nvc++ | pgc++ |
| Fortran | nvfortran | pgfortran |

OpenMP

The Nvidia HPC Compilers support the OpenMP (Open Multi-Processing) extension for C, C++, and Fortran. Enable it by passing the -mp or -mp=KIND option to the compiler, where KIND is multicore (the default if no KIND is given) for using CPU cores or gpu for GPU offloading on compatible GPUs (V100 and newer) with CPU fallback.
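For example, a minimal compile invocation could look like the following sketch (the source and output file names are placeholders):

module load nvhpc
# OpenMP on CPU cores (default)
nvc -mp=multicore -o my_openmp_code.bin my_openmp_code.c
# OpenMP with GPU offloading (V100 and newer) and CPU fallback
nvc -mp=gpu -o my_openmp_code.bin my_openmp_code.c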

OpenACC

The Nvidia HPC Compilers support the OpenACC (Open ACCelerators) extension for C, C++, and Fortran. Enable it by passing the -acc or -acc=KIND option to the compiler, where KIND is gpu for GPU offloading (the default if no KIND is given) or multicore for using CPU cores. There are additional KIND values as well as other options that can be used, which should be separated by commas. See the Nvidia HPC Compilers OpenACC page for more information.
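As a sketch (file names are placeholders), enabling OpenACC could look like this:

module load nvhpc
# OpenACC with GPU offloading (default)
nvc -acc=gpu -o my_openacc_code.bin my_openacc_code.c
# OpenACC on CPU cores
nvc -acc=multicore -o my_openacc_code.bin my_openacc_code.c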

Targeting Architecture

GPU

By default, the Nvidia HPC Compilers compile the GPU parts of the code for the compute capability of the GPUs attached to the node the compilers are run on, or for all compute capabilities if none are present (as on most frontend nodes). The former may mean that when the program is run on a compute node, it won’t support that node’s GPUs (it requires features the GPUs don’t provide) or will perform suboptimally (it was compiled for a much lower capability). The latter takes more time to compile and makes the program bigger. The compilers use the -gpu=OPTION1,OPTION2,... option to control the target GPU architecture, where different options are separated by commas. The most important option is ccXY, where XY is the compute capability. It can be specified more than once to support more than one compute capability. The compute capabilities of the different GPUs that are provided are listed on the Spack page (cuda_arch is the compute capability).
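For example, to build a GPU code for both compute capability 7.0 (V100) and 8.0 (A100), a compile line could look like this sketch (file names are placeholders):

module load nvhpc
nvc -acc=gpu -gpu=cc70,cc80 -o my_gpu_code.bin my_gpu_code.c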

CPU

By default, the Nvidia HPC Compilers will compile code targeting the generic version of the CPU the compilers are run on. On an x86-64 node, this means being compatible with the original 64-bit AMD and Intel processors from 2003 and thus no AVX/AVX2/AVX-512 without SSE fallback. The compilers use the -tp ARCH option to control the target architecture. The ARCH values for the different CPU architectures (Spack naming) we provide are:

| Architecture/Target (Spack naming) | ARCH value |
|---|---|
| Most generic version of the node the compiler is running on | px |
| The CPU of the node the compiler is running on | native or host |
| skylake_avx512 | skylake |
| cascadelake | skylake |
| sapphirerapids | skylake |
| zen2 | zen2 |
| zen3 | zen3 |
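For example, to target the Zen 3 nodes, a compile line could look like this sketch (file names are placeholders):

module load nvhpc
nvc -tp zen3 -O2 -o my_code.bin my_code.c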

Python

Python is a powerful programming and scripting language. Its standard library already covers a wide range of tools and facilities. It can be extended by many third-party software packages created for different purposes.

This page is organized into 4 sections, where the simplest use cases are documented at the top and the most complex ones at the bottom:

  1. Python Versions and Module Files
  2. Python Virtual Environments
  3. The uv Python Package and Project Manager
  4. Installing conda Packages with miniforge3

If you are new to Python in general, we recommend starting at the top and not skipping any sections.

Python Versions and Module Files

The software installed in the current revision 25.04 uses the python/3.11.9 module. If you want to use Python packages provided by these modules or extend them, you should use this module to provide your Python interpreter.

Standard Python (No Additional Packages)

To load the default Python interpreter and start executing Python scripts, please run the following commands:

module load gcc
module load python

It is then as simple as python ./script.py to execute your first Python script on the system.

Alternatively you can also make your Python script executable (chmod +x script.py) and add the following she-bang line as the first line of your script:

#!/usr/bin/env python

Then you can execute it directly as ./script.py after loading the Python interpreter. If you are inside a virtual environment (documented below), this shortcut will also work.

Python Virtual Environments

The Python ecosystem has a large problem with complex dependencies and version incompatibilities. To ensure that updating the dependencies of one software package does not break a different one, you should isolate them in so-called virtual environments. The simplest method is to use the built-in venv feature of Python.

Note

Many parts of Python packages are compiled, so the usual caution has to be taken when you install them: only install/compile your Python packages on the same hardware that you are planning to run them on. The machine-kind script of your current software revision (e.g. /sw/rev_profile/25.04/machine-kind) gives you a good indicator of the hardware type. Ideally its output should match on the node where you install software and where you run it.

Note

We do not recommend updating environments. It is safer to recreate them from scratch as soon as you have to use new or different packages.

Many packages offer a file with the list of required packages and versions, which you can use to set up a new environment. The same goes for packages you always need. Simply create a requirements file and use it to set up fresh environments every time you need to change something. Remember to delete the old environment.
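A minimal sketch of this workflow, assuming your package list is kept in a file called requirements.txt and the environment is called foo (both names are placeholders):

# record the packages of an existing, activated environment
pip freeze > requirements.txt
# later: create a fresh environment and install from the requirements file
module load gcc
module load python/3.11.9
python -m venv foo
source ./foo/bin/activate
pip install -r requirements.txt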

Where to Store Your Environments

In the past we have recommended that NHR users store their virtual environments on the $WORK filesystem for lack of better options. This is not a good idea, since the underlying parallel filesystem is not designed to store many small files and does not perform well with the large amount of metadata this creates. Using a large number of inodes on this filesystem is discouraged.

Please check the following table to find a suitable location:

| Storage Location | Notes |
|---|---|
| Home Directory $HOME | Space in the home directory is precious. Do not store any large (> 1GB) environments here! |
| Project Directory $PROJECT | Good choice for Python environments. This directory also allows you to share virtual environments with other members of your project! |
| WORK/SCRATCH | Generally a bad idea for Python environments because of their large number of inodes. But you can use apptainer to mitigate that and store your environment in a single .sif file. |

Creating a Virtual Environment

You can create a new virtual environment called foo in the current directory:

module load gcc
module load python/3.11.9
python -m venv foo

Now you can activate it by executing source ./foo/bin/activate. This will also add a (foo) prefix to your shell prompt.

You can install some extra packages from the Python Package Index (PyPI):

(foo) $ pip install lmfit matplotlib seaborn

Using a Virtual Environment

To run some Python application that uses these modules, you can just activate the environment and then execute the script with the python command:

source ./foo/bin/activate
python myscript.py

When you are done manipulating an environment, you can run deactivate to leave it.

The uv Python Package and Project Manager

Please read and understand the above section Python Virtual Environments first! It contains valuable hints about the recommended usage of virtual environments on our system.

If you don’t want to interact with the software installed in the module system but just install your own Python packages and also pick your own version of the Python interpreter, you can use uv.

Creating a Python Project with uv

Here is an example that turns the current directory into a new Python project:

module load uv
uv python list # Lists all available Python versions. In our case the list included the version 3.12.11.
uv python install 3.12.11
uv python pin 3.12.11
uv init
uv add lmfit matplotlib seaborn

To run a script in this project, just switch to the project directory and run:

module load uv
uv run script.py

Please check out the official documentation for more information.

Using uv as a Replacement for venv/pip

You can also directly use uv as a replacement for venv and pip:

module load uv
uv venv
uv pip install lmfit matplotlib seaborn
uv run script.py

Please check out the official documentation for more information about the pip interface.

Running Python Tools with uvx

Many Python applications are used as tools. Running them can be greatly simplified with the uvx command.

In this example we reformat a Python script using black:

uvx black script.py

Please check out the official documentation for more information about using tools.

Installing conda Packages with miniforge3

Please read and understand the above section Python Virtual Environments first! It contains valuable hints about the recommended usage of virtual environments on our system.

Note that the miniforge3 module has replaced miniconda and anaconda on our system, since it only uses the free conda-forge channel by default. Installing software with miniconda or anaconda (from the default channel) would require purchasing an expensive commercial license (this applies to academic research and commercial usage; classroom usage remains free).

Loading miniforge3 and Preparing an Environment

In order to use Python and Conda environments, load the module miniforge3 and create an environment.

Note

Please note that you might encounter a SafetyError because the HPC package manager spack will slightly increase the filesize of dynamic libraries installed into the miniforge3 installation directory when it modifies the SONAME and RPATH entries in the dynamic sections. This can be safely ignored.

This example loads the module, creates a new environment called myenv in the current directory, and activates it using the source command:

module load miniforge3
conda create --prefix ./myenv python=3.12
source activate ./myenv

Once this is done, you are able to use Python normally, as you would on a personal computer with commands such as:

conda install -y numpy scipy matplotlib
pip install pillow
Info

We recommend NOT using conda init. This way, the .bashrc file in your home directory is not updated and the environment is not loaded at each login, reducing unnecessary load on the systems and making your login much faster. It also prevents future problems with incompatible conda versions being loaded on a new system.

If you explicitly want to initialize conda, you can use the following command instead after loading the miniforge3 module:

source $CONDASH

This will allow you to use shell integration features like conda activate.

Warning

Do not use the conda install command after you have installed a package using pip! This will create inconsistencies in the environment. Better install all conda packages first and only run pip install at the end.

If you do not need to install any packages from conda-forge, bioconda, etc., we recommend not using conda at all and instead switching to uv.

Loading the Environment in a Batch Script

To load an environment in your batch script you repeat the above steps for activation.

This includes the source command and the path:

module load miniforge3
source activate ./myenv
# OR
module load miniforge3
source $CONDASH
conda activate ./myenv

Make sure you are in the right directory, or better yet, use the absolute path.

Info

To find the absolute path from a path relative to the current working directory ., you can for example use:

realpath ./myenv
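Putting it together, a minimal batch script sketch could look like this (the partition, resources, and paths are placeholders that need to be adapted):

#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --partition=YOUR_PARTITION

module load miniforge3
source activate /absolute/path/to/myenv
python myscript.py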

Using a Different Base Operating System

Sometimes you might want to use a different operating system as a base for your Python environments, in particular if you require a newer glibc version (e.g. version `GLIBC_2.29' not found). For this you can use an Apptainer container.

Info

The best solution for the glibc problem would actually be to recompile the software from source so that it links against the older glibc provided on our system. But this is not always possible.

We also provide a basic miniforge container in /sw/container/python that can be used to create such virtual environments, with the environments themselves stored outside of the container (e.g. for testing purposes). This means that the system-level libraries are provided from within the container (which currently uses Ubuntu 24.04), while the Python packages are installed directly on the HPC storage system.

To use this container and create a new Conda environment with PyTorch you can run the following commands:

# We create the Conda environment in a user-specific sub-directory of $PROJECT.
# If your project uses a different convention, please adjust the paths accordingly.
mkdir -p $PROJECT/$USER
module load apptainer
apptainer run --nv --bind /mnt,/user,/projects \
	/sw/container/python/miniforge-latest.sif
conda create -p $PROJECT/$USER/pytorch-venv python=3.12 pip

conda activate $PROJECT/$USER/pytorch-venv
pip3 install torch torchvision torchaudio

exit

As mentioned above, these commands should ideally be run on a node with a hardware configuration similar to the one on which the Python code will run later.

You can then run a script (e.g. test.py) in the current working directory inside of the container and environment in the following manner:

module load apptainer
apptainer run --nv --bind /mnt,/user,/projects \
	/sw/container/python/miniforge-latest.sif bash -c  \
	'conda activate $PROJECT/$USER/pytorch-venv; python test.py'

For example, to check that the CUDA support is working correctly, you can run the following simple script in a compute job on a GPU node:

#!/usr/bin/env python
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        print(torch.cuda.get_device_properties(i).name)

R

R here refers to the reference and most popular interpreter/environment for the R language. The interpreter program’s name is simply R. In all software stacks, the module name is r. To load a specific version, run

module load r/VERSION

To load the default version, run

module load r

rustc

rustc is the reference and most popular Rust compiler. The compiler program’s name is simply rustc. In all software stacks, the module name is rust. To load a specific version, run

module load rust/VERSION

To load the default version, run

module load rust

Targeting CPU Architecture

By default, rustc will compile code targeting a generic version of the CPU the compiler is run on. On an x86-64 node, this means being compatible with the original 64-bit AMD and Intel processors from 2003 and thus no AVX/AVX2/AVX-512 without SSE fallback. The compiler uses the -C target-cpu=ARCH option to control the target architecture.

If you use cargo, you can use the RUSTFLAGS environment variable instead:

export RUSTFLAGS="-C target-cpu=ARCH"

The ARCH values for the different CPU architectures (Spack naming) we provide are

| Architecture/Target (Spack naming) | ARCH value for rustc |
|---|---|
| Most generic version of the node the compiler is running on | generic |
| The CPU of the node the compiler is running on | native |
| haswell | haswell |
| broadwell | broadwell |
| skylake_avx512 | skylake-avx512 |
| cascadelake | cascadelake |
| sapphirerapids | sapphirerapids |
| zen2 | znver2 |
| zen3 | znver3 |
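For example, to build a cargo project optimized for the Zen 3 nodes, a sketch could look like this (the project itself is assumed to exist):

module load rust
export RUSTFLAGS="-C target-cpu=znver3"
cargo build --release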

Subsections of Parallelization

Hybrid MPI + OpenMP

Often, Message Passing Interface (MPI) and OpenMP (Open Multi-Processing) are used together to create hybrid jobs, using MPI to parallelize between nodes and OpenMP within nodes. This means the code must be compiled with both and carefully launched with Slurm so that the number of tasks and cores per task are set correctly.

Example Code

Here is an example code using both:

#include <stdio.h>

#include <omp.h>
#include <mpi.h>


int main(int argc, char** argv)
{
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    int nthreads, tid;

    // Fork a team of threads giving them their own copies of variables
    #pragma omp parallel private(nthreads, tid)
    {

        // Obtain thread number
        tid = omp_get_thread_num();
        printf("Hello World from thread = %d, processor %s, rank %d out of %d processors\n", tid, processor_name, world_rank, world_size);

        // Only primary thread does this
        if (tid == 0)
        {
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }

    }  // All threads join primary thread

    // Finalize the MPI environment.
    MPI_Finalize();
}

Compilation

To compile it, load the compiler and MPI module and then essentially combine the MPI compiler wrappers (see MPI) with the OpenMP compiler options (see OpenMP). For the example above, you would do

# GCC + OpenMPI
module load gcc
module load openmpi
mpicc -fopenmp -o hybrid_hello_world.bin hybrid_hello_world.c

# Intel OneAPI Compilers + Intel MPI
module load intel-oneapi-compilers
module load intel-oneapi-mpi
mpiicx -qopenmp -o hybrid_hello_world.bin hybrid_hello_world.c

# Intel OneAPI Compilers + OpenMPI
module load intel-oneapi-compilers
module load openmpi
mpicc -qopenmp -o hybrid_hello_world.bin hybrid_hello_world.c

Batch Job

When submitting the batch job, you have to decide how many separate MPI processes (tasks) you want to run per node and how many cores to give each one (usually the number of cores on the node divided by the number of tasks per node). The best way to do this is to explicitly set

  • -N <nodes> for the number of nodes
  • --tasks-per-node=<tasks-per-node> for the number of separate MPI processes you want on each node
  • -c <cores-per-task> if you want to specify the number of cores per task (if you leave it out, it will evenly divide them)

and then run the code in the jobscript with mpirun, which will receive all the required information from Slurm (we do not recommend using srun). To run the example above on two nodes, with 2 tasks per node that together use all cores (but not hypercores), one would use one of the following job scripts:

#!/bin/bash

#SBATCH --time=00:10:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=2
#SBATCH --partition=standard96:test

module load gcc
module load openmpi

export OMP_NUM_THREADS=$(( $SLURM_CPUS_ON_NODE / $SLURM_NTASKS_PER_NODE / 2 ))

mpirun ./hybrid_hello_world.bin
#!/bin/bash

#SBATCH --time=00:10:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=2
#SBATCH --partition=standard96:test

module load intel-oneapi-compilers
module load intel-oneapi-mpi

export OMP_NUM_THREADS=$(( $SLURM_CPUS_ON_NODE / $SLURM_NTASKS_PER_NODE / 2 ))

mpirun ./hybrid_hello_world.bin
#!/bin/bash

#SBATCH --time=00:10:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=2
#SBATCH --partition=standard96:test

module load intel-oneapi-compilers
module load openmpi

export OMP_NUM_THREADS=$(( $SLURM_CPUS_ON_NODE / $SLURM_NTASKS_PER_NODE / 2 ))

mpirun ./hybrid_hello_world.bin

MPI

Message Passing Interface (MPI) is a popular standardized API for parallel processing both within a node and across many nodes. When using MPI, each task in a Slurm job runs the program in its own separate process, and the processes communicate with each other via MPI (generally using an MPI library).

Warning

Because MPI parallelization is between separate processes, variables, threads, file handles, and various other pieces of state are NOT shared between the processes even on the same node, in contrast to OpenMP, which runs over many threads in a single process. Nothing is shared except that which is explicitly sent and received via the MPI API.

However, it is possible to use MPI together with OpenMP to parallelize using multiple threads in the MPI processes to use the best of both frameworks (see Hybrid MPI + OpenMP for more information).

Implementations

Two MPI implementations are provided in the software stacks, OpenMPI and Intel MPI. Both of them have two variants. OpenMPI has the official variant and the Nvidia HPC SDK variant, which is more tuned and built for Nvidia GPUs, Nvidia NV-Link (used between the GPUs on some nodes), and Mellanox Infiniband. Intel MPI has the classic variant and the newer OneAPI variant. Their module names in each software stack are in the table below. Note that in the event that there are multiple versions of a module, it is important to specify which one by module load module/VERSION. For example, the impi module in the HLRN Modules (hlrn-tmod) software stack has 10 different versions, 9 of them the classic Intel MPI and 1 of them Intel OneAPI MPI.

| Implementation | GWDG Modules name | NHR Modules name | SCC Modules name | HLRN Modules name |
|---|---|---|---|---|
| OpenMPI (official) | openmpi (CUDA support on Grete) | openmpi (CUDA support on Grete) | openmpi | openmpi |
| OpenMPI (Nvidia HPC SDK) | nvhpc | nvhpc | nvhpc | nvhpc-hpcx |
| Intel MPI (OneAPI) | intel-oneapi-mpi | intel-oneapi-mpi | intel-oneapi-mpi | impi/2021.6 and newer |
| Intel MPI (classic) | | | intel-mpi or intel/mpi | impi/2019.9 and older |
Warning

Do not mix up OpenMPI (an implementation of MPI) and OpenMP, which is a completely separate parallelization technology (it is even possible to use both at the same time). They just coincidentally have names that are really similar.

Compiling MPI Code

All MPI implementations work similarly for compiling code. You first load the module for the compilers you want to use and then the module for the MPI implementation (see table above).

load MPI:

For a specific version, run

module load openmpi/VERSION

and for the default version, run

module load openmpi

Substitute nvhpc with nvhpc-hpcx for the HLRN Modules (hlrn-tmod) software stack.

For a specific version, run

module load nvhpc/VERSION

and for the default version, run

module load nvhpc

Substitute intel-oneapi-mpi/VERSION with impi/2021.6 for the HLRN Modules (hlrn-tmod) software stack.

For a specific version, run

module load intel-oneapi-mpi/VERSION

and for the default version (not available for the HLRN Modules (hlrn-tmod) software stack), run

module load intel-oneapi-mpi

For a specific version (2019.9 or older), run

module load impi/VERSION

and for the default version, run

module load impi

In the rev/11.06 revision, substitute intel-mpi with intel/mpi

module load intel-mpi/VERSION

and for the default version, run

module load intel-mpi

The MPI modules provide compiler wrappers that wrap around the C, C++, and Fortran compilers to set up the compiler and linker options that the MPI library needs. As a general rule, it is best to use these wrappers for compiling. One major exception is if you are using HDF5 or NetCDF, which provide their own compiler wrappers that wrap over the MPI compiler wrappers. If the code uses a build system and is MPI-naïve, you might have to manually set environmental variables to make the build system use the wrappers. The compiler wrappers and the environmental variables you might have to set are given in the table below:

| Language | Wrapper (for GCC on Intel MPI) | Intel OneAPI Compiler wrappers on Intel MPI | Intel Classic Compiler wrappers on Intel MPI | Env. variable you might have to set |
|---|---|---|---|---|
| C | mpicc | mpiicx | mpiicc | CC |
| C++ | mpicxx | mpiicpx | mpiicpc | CXX |
| Fortran (modern) | mpifort or mpifc | mpiifx | mpiifort | FC |
| Fortran (legacy) | mpif77 | mpiifx | mpiifort | F77 |
Warning

Intel MPI provides wrappers for GCC, Intel OneAPI Compilers, and Intel Classic Compilers with different names. The default MPI wrappers are for GCC (mpicc, mpicxx, mpifort, mpifc, and mpif77). You must take the name of the Intel Compiler’s executable you want to use and prefix it with an “mpi” (e.g. ifx becomes mpiifx).

Note that in the current gwdg-lmod software stack, the Intel OneAPI Compilers and Intel Classic Compilers are in separate modules which are intel-oneapi-compilers and intel-oneapi-compilers-classic respectively. This also means that they have separate compiled software packages that become visible after loading the compiler to avoid any compatibility problems with code compiled by the two different compiler suites. In the older nhr-lmod and scc-lmod software stacks, the intel-oneapi-compilers module contains both.
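As a sketch, compiling a simple C MPI program with the Intel OneAPI Compilers and Intel MPI could look like this (file names are placeholders):

module load intel-oneapi-compilers
module load intel-oneapi-mpi
mpiicx -O2 -o hello_mpi.bin hello_mpi.c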

MPI-naïve build systems can usually be convinced to use the MPI compiler wrappers like this:

passing wrappers to build system:
CC=mpicc CXX=mpicxx FC=mpifort F77=mpif77 BUILD_SYSTEM_COMMAND [OPTIONS]
CC=mpicc CXX=mpicxx FC=mpifort F77=mpif77 cmake [OPTIONS]
CC=mpicc CXX=mpicxx FC=mpifort F77=mpif77 ./configure [OPTIONS]

Running MPI Programs

All MPI implementations work similarly for running in Slurm jobs, though they have vastly different extra options and environmental variables to tune their behavior. Each provides a launcher program mpirun to help run an MPI program. Both OpenMPI and Intel MPI read the environmental variables that Slurm sets and communicate with Slurm via PMI or PMIx in order to set themselves up with the right processes on the right nodes and cores.

First load the module for the MPI implementation you are using:

load MPI:

For a specific version, run

module load openmpi/VERSION

and for the default version, run

module load openmpi

Substitute nvhpc with nvhpc-hpcx for the HLRN Modules (hlrn-tmod) software stack.

For a specific version, run

module load nvhpc/VERSION

and for the default version, run

module load nvhpc

Substitute intel-oneapi-mpi/VERSION with impi/2021.6 for the HLRN Modules (hlrn-tmod) software stack.

For a specific version, run

module load intel-oneapi-mpi/VERSION

and for the default version (not available for the HLRN Modules (hlrn-tmod) software stack), run

module load intel-oneapi-mpi

For a specific version (2019.9 or older), run

module load impi/VERSION

and for the default version, run

module load impi

In the rev/11.06 revision, substitute intel-mpi with intel/mpi

module load intel-mpi/VERSION

and for the default version, run

module load intel-mpi

Then, run your program using the mpirun launcher provided by your MPI implementation, like so:

mpirun [MPI_OPTIONS] PROGRAM [OPTIONS]

where PROGRAM is the program you want to run, OPTIONS are the options for PROGRAM, and MPI_OPTIONS are options controlling MPI behavior (these are specific to each implementation).
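A minimal batch script for a pure MPI program could look like the following sketch (the partition, resources, and program name are placeholders):

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=96
#SBATCH --partition=standard96:test

module load gcc
module load openmpi

mpirun ./my_mpi_program.bin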

Info

In some cases, it can make sense to use Slurm’s srun as the launcher instead of mpirun in batch jobs. Examples would include when you want to use only a subset of the tasks instead of all of them. Historically, there have been many bugs when launching MPI programs this way, so it is best avoided unless needed.

Subsections of MPI

OpenMPI

OpenMPI is a widely used MPI library with good performance over shared memory and all fabrics present on the clusters. There are two variants, the official variant from OpenMPI and the one from the Nvidia HPC SDK. The Nvidia HPC SDK variant is always built and optimized to support Nvidia GPUs, Nvidia NV-Link (used between the GPUs on some nodes), and Mellanox Infiniband. The official variant is built to support Nvidia GPUs in the GWDG Modules (gwdg-lmod) and NHR Modules (nhr-lmod) software stacks on the Grete nodes.

Warning

Do not mix up OpenMPI (an implementation of MPI) and OpenMP, which is a completely separate parallelization technology (it is even possible to use both at the same time). They just coincidentally have names that are really similar.

In all software stacks, the official variant’s module name is openmpi. The Nvidia HPC SDK variant’s module name is nvhpc.

To load OpenMPI, follow the instructions below.

Load OpenMPI:

For a specific version, run

module load openmpi/VERSION

and for the default version, run

module load openmpi

Some software might need extra help to find the OpenMPI installation in a non-standard location even after you have loaded the module. For example the python framework Neuron:

export MPI_LIB_NRN_PATH="${OPENMPI_MODULE_INSTALL_PREFIX}/lib/libmpi.so"

If the software does not have any method to specify the location of the MPI installation and cannot find libmpi.so, you can use LD_LIBRARY_PATH:

export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${OPENMPI_MODULE_INSTALL_PREFIX}/lib"

Please set this variable only when absolutely necessary to avoid breaking the linkage of other applications.

For a specific version, run

module load nvhpc/VERSION

and for the default version, run

module load nvhpc

Intel MPI

Intel MPI is a widely used MPI library with good performance over shared memory and many fabrics. Note that, contrary to what its name might suggest, Intel MPI performs quite well on AMD processors.

Warning

Intel MPI provides wrappers for GCC, Intel OneAPI Compilers, and Intel Classic Compilers with different names. The default MPI wrappers are for GCC (mpicc, mpicxx, mpifort, mpifc, and mpif77). You must take the name of the Intel Compiler’s executable you want to use and prefix it with an “mpi” (e.g. ifx becomes mpiifx).

Note that in the current gwdg-lmod software stack, the Intel OneAPI Compilers and Intel Classic Compilers are in separate modules which are intel-oneapi-compilers and intel-oneapi-compilers-classic respectively. This also means that they have separate compiled software packages that become visible after loading the compiler to avoid any compatibility problems with code compiled by the two different compiler suites. In the older nhr-lmod and scc-lmod software stacks, the intel-oneapi-compilers module contains both.

There are two version families, classic Intel MPI (sometimes called just “Intel MPI” or “IMPI”) and its successor Intel OneAPI MPI. Unless you need the classic family for compatibility reasons, we recommend using the OneAPI family.

Load Intel MPI:

First, you have to load the intel-oneapi-compilers module (or intel-oneapi-compilers-classic if you want the Intel Classic Compilers) before the intel-oneapi-mpi module becomes visible by running

module load intel-oneapi-compilers

For a specific version, run

module load intel-oneapi-mpi/VERSION

and for the default version, run

module load intel-oneapi-mpi

First, you have to load the intel-oneapi-compilers module before the intel-oneapi-mpi module becomes visible by running

module load intel-oneapi-compilers

For a specific version, run

module load intel-oneapi-mpi/VERSION

and for the default version, run

module load intel-oneapi-mpi

For a specific version, run

module load intel-oneapi-mpi/VERSION

and for the default version, run

module load intel-oneapi-mpi

There is only one version available and the default version loads the Classic Intel MPI. To load it, run

module load impi/2021.6

For a specific version (2019.9 or older), run

module load impi/VERSION

and for the default version

module load impi

In the rev/11.06 revision, substitute intel-mpi with intel/mpi

module load intel-mpi/VERSION

and for the default version

module load intel-mpi

OpenACC

OpenACC (Open ACCelerators) is a standardized compiler extension to C, C++, and Fortran to offload compute tasks to various accelerators (GPUs, etc.) to allow parallelization in heterogeneous systems (e.g. CPU + GPU). It is deliberately similar to OpenMP. Note that in contrast to MPI, OpenACC is only for parallelization within a node, NOT between nodes.

The following compilers that support OpenACC for C, C++, and/or Fortran and the compiler options to enable it are given in the table below.

| Compiler | Option | CPU Offload Option | GPU Offload Option |
|---|---|---|---|
| Nvidia HPC Compilers (successor to the PGI compilers) | -acc | -acc=multicore | -acc=gpu |

OpenCL

Open Computing Language (OpenCL) is a popular API for running compute tasks to a variety of devices (other CPUs, GPUs, FPGAs, etc.) in a standardized way. OpenCL does heterogeneous parallelization where one process runs tasks on other devices (or even threads in the same process depending on the platform).

Loader

Unlike other parallelization APIs, OpenCL in principle lets one load and use devices from multiple platforms simultaneously. It does this using an ICD loader, which looks for ICD files provided by platforms telling the loader how to load each platform.

Warning

Due to limitations in how the GWDG Modules and NHR Modules software stacks are built, only one platform can be used at a time.

Rather than relying on a loader from the OS, the ocl-icd loader is provided in the software stack itself as a module. It is needed to use any of the platforms, compile OpenCL code, check available devices, etc. It is loaded by

load OpenCL Loader:

For a specific version, run

module load ocl-icd/VERSION

and for the default version, run

module load ocl-icd

For a specific version, run

module load ocl-icd/VERSION

and for the default version, run

module load ocl-icd

Compiling OpenCL Code

While it is possible to build OpenCL code against a single platform, it is generally best to compile it against the ocl-icd loader so that it is easy to change the particular platform used at runtime. Simply load the module for it as in the above section and the OpenCL headers and library become available. The path to the headers directory is automatically added to the INCLUDE, C_INCLUDE_PATH, CPLUS_INCLUDE_PATH, and CPATH environmental variables used by some C/C++ compilers. If you need to pass the path manually for some reason, they are in the $OPENCL_C_HEADERS_MODULE_INSTALL_PREFIX/include directory. The path to the directory containing the libOpenCL.so library is automatically added to the LD_RUN_PATH environmental variable (you might have to add the -Wl,-rpath -Wl,$LD_RUN_PATH argument to your C/C++ compiler if it can’t find it).
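As a sketch, compiling a single-file OpenCL host program against the loader could look like this (the file names are placeholders):

module load gcc
module load ocl-icd
gcc -o my_opencl_code.bin my_opencl_code.c -lOpenCL -Wl,-rpath -Wl,$LD_RUN_PATH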

Platforms

The available OpenCL platforms are listed in the table below. Simply load the module for a platform to use it, or, to use the Nvidia driver platform, have no other platform modules loaded (you don’t even have to load the cuda module). Note that PoCL is provided in two variants, a CPU-only variant and a CPU + Nvidia GPU variant.

| Platform | Devices | NHR Modules name |
|---|---|---|
| Nvidia driver | Nvidia GPU | no other platform loaded |
| PoCL | CPU | pocl |
| PoCL | CPU, Nvidia GPU | pocl/VERSION_cuda-CUDAMAJORVERSION |

Checking OpenCL Devices

You can use clinfo to walk through the OpenCL platforms that ocl-icd can find; it then walks through their devices, printing information about each one it finds.

It is loaded by

load clinfo:

For a specific version, run

module load clinfo/VERSION

and for the default version, run

module load clinfo

For a specific version, run

module load clinfo/VERSION

and for the default version, run

module load clinfo

And then you can just run it as

clinfo

Quick Benchmark to Check Performance

One of the major choices in OpenCL codes is what vector size to use for each type (integers, float, double, half, etc.), which can vary from platform to platform and from device to device. You can get what the vendor/platform thinks is the best size from clinfo (the “preferred size”). But in many cases, one must resort to empirical testing. While it is best to test it with the actual code to be used, you can get a crude guess using clpeak which runs quick benchmarks on the different vector sizes. It also benchmarks other important things like transfer bandwidth, kernel latency, etc.

It is loaded by

load clpeak:

For a specific version, run

module load clpeak/VERSION

and for the default version, run

module load clpeak

For a specific version, run

module load clpeak/VERSION

and for the default version, run

module load clpeak

And then to benchmark all platforms and devices it can find, run it as

clpeak

or for a specific platform and device

clpeak -p PLATFORM -d DEVICE

where you have gotten the PLATFORM and DEVICE numbers from clinfo.

Example jobs to benchmark the platforms and their results are

opencl benchmarks:

A job to benchmark of PoCL on the CPUs of an Emmy Phase 2 node is

#!/usr/bin/env bash

#SBATCH --job-name=clpeak-emmyp2
#SBATCH -p standard96:el8
#SBATCH -t 00:15:00
#SBATCH -N 1
#SBATCH -n 1

module load clpeak
module load pocl

clpeak

and the result is

Platform: Portable Computing Language
  Device: cpu-cascadelake-Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz
    Driver version  : 5.0 (Linux x64)
    Compute units   : 192
    Clock frequency : 3800 MHz

    Global memory bandwidth (GBPS)
      float   : 5.06
      float2  : 10.69
      float4  : 18.82
      float8  : 28.35
      float16 : 30.73

    Single-precision compute (GFLOPS)
      float   : 106.05
      float2  : 215.18
      float4  : 430.83
      float8  : 854.67
      float16 : 1660.30

    Half-precision compute (GFLOPS)
      half   : 27.61
      half2  : 48.37
      half4  : 93.31
      half8  : 185.87
      half16 : 344.81

    Double-precision compute (GFLOPS)
      double   : 109.07
      double2  : 208.74
      double4  : 429.89
      double8  : 822.74
      double16 : 1438.83

    Integer compute (GIOPS)
      int   : 210.39
      int2  : 165.35
      int4  : 327.13
      int8  : 629.76
      int16 : 1107.31

    Integer compute Fast 24bit (GIOPS)
      int   : 216.43
      int2  : 165.50
      int4  : 320.48
      int8  : 634.08
      int16 : 1101.22

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 4.80
      enqueueReadBuffer               : 4.64
      enqueueWriteBuffer non-blocking : 4.77
      enqueueReadBuffer non-blocking  : 5.12
      enqueueMapBuffer(for read)      : 4234.00
        memcpy from mapped ptr        : 4.57
      enqueueUnmap(after write)       : 9485.35
        memcpy to mapped ptr          : 5.19

    Kernel launch latency : 797.62 us

from which we could conclude that a vector size of 16 is a good initial guess for the optimum vector size. Also notice how poor half-precision performance is in comparison to single and double precision. For doing half-precision computation, it is generally better to use the GPUs (like Grete) or CPUs with better builtin half-precision support (like Emmy Phase 3).

A job to benchmark of PoCL on the CPUs of an Emmy Phase 3 node is

#!/usr/bin/env bash

#SBATCH --job-name=clpeak-emmyp3
#SBATCH -p medium96s
#SBATCH -t 00:15:00
#SBATCH -N 1
#SBATCH -n 1

module load clpeak
module load pocl

clpeak

and the result is

Platform: Portable Computing Language
  Device: cpu-sapphirerapids-Intel(R) Xeon(R) Platinum 8468
    Driver version  : 5.0 (Linux x64)
    Compute units   : 192
    Clock frequency : 3800 MHz

    Global memory bandwidth (GBPS)
      float   : 72.84
      float2  : 85.54
      float4  : 88.44
      float8  : 97.76
      float16 : 105.01

    Single-precision compute (GFLOPS)
      float   : 126.41
      float2  : 254.00
      float4  : 512.32
      float8  : 1028.37
      float16 : 1839.51

    Half-precision compute (GFLOPS)
      half   : 106.61
      half2  : 217.59
      half4  : 449.15
      half8  : 878.24
      half16 : 1842.21

    Double-precision compute (GFLOPS)
      double   : 125.22
      double2  : 249.81
      double4  : 502.12
      double8  : 881.17
      double16 : 1515.57

    Integer compute (GIOPS)
      int   : 247.44
      int2  : 167.32
      int4  : 333.63
      int8  : 687.28
      int16 : 1229.16

    Integer compute Fast 24bit (GIOPS)
      int   : 250.40
      int2  : 167.86
      int4  : 340.70
      int8  : 679.93
      int16 : 1228.72

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 9.89
      enqueueReadBuffer               : 9.83
      enqueueWriteBuffer non-blocking : 8.45
      enqueueReadBuffer non-blocking  : 8.76
      enqueueMapBuffer(for read)      : 973.92
        memcpy from mapped ptr        : 8.85
      enqueueUnmap(after write)       : 2446.44
        memcpy to mapped ptr          : 9.91

    Kernel launch latency : 822.36 us

from which we could conclude that a vector size of 16 is a good initial guess for the optimum vector size.

A job to benchmark of the Nvidia driver on the Nvidia GPUs and PoCL on the CPUs and Nvidia GPUs of a Grete node is

#!/usr/bin/env bash

#SBATCH --job-name=clpeak-grete
#SBATCH -p grete
#SBATCH -t 00:15:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -G A100:4

module load clpeak

# Just Nvidia driver

echo '################################################################################'
echo '#'
echo '# Platform: Nvidia driver'
echo '#'
echo '################################################################################'
echo ''
echo ''

clpeak -p 0 -d 0

# PoCL

module load pocl/5.0_cuda-11

echo ''
echo ''
echo ''
echo ''
echo ''
echo '################################################################################'
echo '#'
echo '# Platform: PoCL - CUDA'
echo '#'
echo '################################################################################'
echo ''
echo ''

POCL_DEVICES=cuda clpeak -p 0 -d 0

echo ''
echo ''
echo ''
echo ''
echo ''
echo '################################################################################'
echo '#'
echo '# Platform: PoCL - CPU'
echo '#'
echo '################################################################################'
echo ''
echo ''

POCL_DEVICES=cpu clpeak

and the result is

################################################################################
#
# Platform: Nvidia driver
#
################################################################################



Platform: NVIDIA CUDA
  Device: NVIDIA A100-SXM4-40GB
    Driver version  : 535.104.12 (Linux x64)
    Compute units   : 108
    Clock frequency : 1410 MHz

    Global memory bandwidth (GBPS)
      float   : 1305.59
      float2  : 1377.32
      float4  : 1419.30
      float8  : 1443.96
      float16 : 1464.56

    Single-precision compute (GFLOPS)
      float   : 19352.94
      float2  : 19386.81
      float4  : 19351.17
      float8  : 19274.51
      float16 : 19104.15

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 9721.90
      double2  : 9706.06
      double4  : 9681.32
      double8  : 9615.65
      double16 : 9533.18

    Integer compute (GIOPS)
      int   : 19276.27
      int2  : 19318.02
      int4  : 19260.69
      int8  : 19341.64
      int16 : 19333.69

    Integer compute Fast 24bit (GIOPS)
      int   : 19302.15
      int2  : 19297.12
      int4  : 19294.55
      int8  : 19217.17
      int16 : 19033.39

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 14.48
      enqueueReadBuffer               : 12.99
      enqueueWriteBuffer non-blocking : 14.16
      enqueueReadBuffer non-blocking  : 12.76
      enqueueMapBuffer(for read)      : 20.25
        memcpy from mapped ptr        : 19.63
      enqueueUnmap(after write)       : 26.76
        memcpy to mapped ptr          : 20.63

    Kernel launch latency : 9.07 us






################################################################################
#
# Platform: PoCL - CUDA
#
################################################################################



Platform: Portable Computing Language
  Device: NVIDIA A100-SXM4-40GB
    Driver version  : 5.0 (Linux x64)
    Compute units   : 108
    Clock frequency : 1410 MHz

    Global memory bandwidth (GBPS)
      float   : 1301.55
      float2  : 1368.14
      float4  : 1405.72
      float8  : 1438.37
      float16 : 1459.01

    Single-precision compute (GFLOPS)
      float   : 19369.37
      float2  : 19358.33
      float4  : 19357.20
      float8  : 19278.51
      float16 : 19135.89

    Half-precision compute (GFLOPS)
      half   : 19368.83
      half2  : 73221.23
      half4  : 66732.34
      half8  : 60351.88
      half16 : 62031.69

    Double-precision compute (GFLOPS)
      double   : 9700.11
      double2  : 9687.95
      double4  : 9675.77
      double8  : 9644.12
      double16 : 9565.58

    Integer compute (GIOPS)
      int   : 12937.51
      int2  : 12943.77
      int4  : 13225.68
      int8  : 12975.10
      int16 : 13058.68

    Integer compute Fast 24bit (GIOPS)
      int   : 12937.18
      int2  : 12943.10
      int4  : 13225.43
      int8  : 12975.01
      int16 : 13032.36

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 20.21
      enqueueReadBuffer               : 20.04
      enqueueWriteBuffer non-blocking : 20.22
      enqueueReadBuffer non-blocking  : 20.01
      enqueueMapBuffer(for read)      : 142689.94
        memcpy from mapped ptr        : 20.62
      enqueueUnmap(after write)       : 16.03
        memcpy to mapped ptr          : 20.65

    Kernel launch latency : -83.58 us






################################################################################
#
# Platform: PoCL - CPU
#
################################################################################



Platform: Portable Computing Language
  Device: cpu-znver3-AMD EPYC 7513 32-Core Processor
    Driver version  : 5.0 (Linux x64)
    Compute units   : 128
    Clock frequency : 3681 MHz

    Global memory bandwidth (GBPS)
      float   : 20.33
      float2  : 43.19
      float4  : 45.07
      float8  : 57.86
      float16 : 43.87

    Single-precision compute (GFLOPS)
      float   : 138.16
      float2  : 280.69
      float4  : 584.32
      float8  : 1093.85
      float16 : 1866.94

    Half-precision compute (GFLOPS)
      half   : 33.58
      half2  : 57.93
      half4  : 118.63
      half8  : 241.08
      half16 : 418.97

    Double-precision compute (GFLOPS)
      double   : 143.13
      double2  : 275.95
      double4  : 484.71
      double8  : 951.55
      double16 : 1663.51

    Integer compute (GIOPS)
      int   : 213.32
      int2  : 455.37
      int4  : 942.36
      int8  : 1711.27
      int16 : 2909.55

    Integer compute Fast 24bit (GIOPS)
      int   : 194.02
      int2  : 520.80
      int4  : 851.99
      int8  : 1827.18
      int16 : 2845.90

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 12.72
      enqueueReadBuffer               : 14.27
      enqueueWriteBuffer non-blocking : 11.82
      enqueueReadBuffer non-blocking  : 10.53
      enqueueMapBuffer(for read)      : 3315.04
        memcpy from mapped ptr        : 11.59
      enqueueUnmap(after write)       : 3206.16
        memcpy to mapped ptr          : 19.58

    Kernel launch latency : 146.01 us

from which we could conclude that the GPUs (regardless of platform) vastly outperform the CPUs in this benchmark, and that the vector size makes little difference on the Nvidia GPUs except for half precision, where a size of 2 is a good initial guess; on the CPUs, a vector size of 16 is a good initial guess for the optimum vector size. Notice in particular how half precision on the GPUs is considerably faster than single and double precision, but the opposite (slower) on the CPUs. For doing half-precision computation, it is generally better to use the GPUs or CPUs with better builtin half-precision support (like Emmy Phase 3).

Subsections of OpenCL

PoCL

Portable Computing Language (PoCL) is a widely used OpenCL platform particularly well known for providing the host CPUs as an OpenCL device. But PoCL does support other devices, such as Nvidia GPUs via CUDA. The module pocl is available in the GWDG Modules and NHR Modules software stacks.

Warning

In the NHR Modules, the pocl module is built without Nvidia GPU support. If you want Nvidia GPU support, you must instead use the pocl/VERSION_cuda-CUDAMAJORVERSION module. For example, pocl/5.0 would be the non-GPU variant and pocl/5.0_cuda-11 would be a GPU variant using CUDA 11.x.

Warning

Due to limitations in how the GWDG Modules and NHR Modules software stacks are built, only one platform can be used at a time.

To load a specific version, run

module load pocl/VERSION

and for the default version (non-GPU), run

module load pocl

Controlling Runtime Behavior

PoCL uses a variety of environmental variables to control its runtime behavior, which are described in the PoCL Documentation. Two important environmental variables are POCL_DEVICES and POCL_MAX_CPU_CU_COUNT.

By default, PoCL provides access to the cpu device and all non-CPU devices it was compiled for. Setting POCL_DEVICES to a space separated list of devices limits PoCL to providing access to only those kinds of devices. The relevant device names are in the table below. Setting POCL_DEVICES=cuda would limit PoCL to only Nvidia GPUs, while setting it to POCL_DEVICES="cpu cuda" would limit PoCL to the host CPUs (with threads) and Nvidia GPUs.

| Name for POCL_DEVICES | Description |
|---|---|
| cpu | All CPUs on the host using threads |
| cuda | Nvidia GPUs using CUDA |

At the present time, PoCL is unable to determine how many CPUs it should use based on the limits set by Slurm. Instead, it tries to use one thread for every core it sees on the host including hyperthread cores, even if the Slurm job was run with say -c 1. To override the number of CPUs that PoCL sees (and therefore threads it uses for the cpu device), use the environmental variable POCL_MAX_CPU_CU_COUNT. This is particularly bad when running a job that doesn’t use all the cores on a shared node, in which case it usually makes the most sense to either first run

export POCL_MAX_CPU_CU_COUNT="$SLURM_CPUS_PER_TASK"

if one wants to use all hyperthreads, or

export POCL_MAX_CPU_CU_COUNT="$(( $SLURM_CPUS_PER_TASK / 2))"

if one wants only one thread per physical core (not using all hyperthreads).

OpenMP

OpenMP (Open Multi-Processing) is a shared-memory parallelization extension to C, C++, and Fortran built into supporting compilers. OpenMP parallelizes across threads in a single process rather than across processes, in contrast to MPI. This means that variables, file handles, and other program state are shared in an OpenMP program, with the downside that parallelization across multiple nodes is IMPOSSIBLE. However, it is possible to use OpenMP together with MPI to also parallelize across processes, which can be on other nodes, to use the best of both frameworks (see Hybrid MPI + OpenMP for more information).

Compilers and Compiler Options

The following compilers that support OpenMP for C, C++, and/or Fortran and the compiler options to enable it are given in the table below.

| Compiler | Option | SIMD Option | GPU Offload Option |
|---|---|---|---|
| GCC | -fopenmp | -fopenmp-simd | |
| AMD Optimizing Compilers | -fopenmp | -fopenmp-simd | |
| Intel Compilers | -qopenmp | -qopenmp-simd | |
| LLVM | -fopenmp | -fopenmp-simd | |
| Nvidia HPC Compilers (successor to the PGI compilers) | -mp | -mp | -mp=gpu |

Setting Number of Threads

OpenMP will use a number of threads equal to the value in the environmental variable OMP_NUM_THREADS (or number of cores if it doesn’t exist). Set it with

export OMP_NUM_THREADS=VALUE

Since Slurm pins tasks to the cores requested, it is generally best to set it to some multiple or division of the number of cores in the task. On shared partitions (more than one job can run at a time), the environmental variable SLURM_CPUS_PER_TASK holds the number of cores per task in the job. On non-shared partitions (jobs take up whole node), the environmental variable SLURM_CPUS_ON_NODE holds the number of hypercores on the node and SLURM_TASKS_PER_NODE holds the number of tasks per node. Common values to set it to on a node with hyper-threading would be

| Number of Threads | VALUE on shared partition | VALUE on non-shared partition |
|---|---|---|
| one per core in task | $SLURM_CPUS_PER_TASK | $(( $SLURM_CPUS_ON_NODE / $SLURM_TASKS_PER_NODE / 2 )) |
| one per hypercore in task | $(( 2 * $SLURM_CPUS_PER_TASK )) | $(( $SLURM_CPUS_ON_NODE / $SLURM_TASKS_PER_NODE )) |
| one per pair of cores in task | $(( $SLURM_CPUS_PER_TASK / 2 )) | $(( $SLURM_CPUS_ON_NODE / $SLURM_TASKS_PER_NODE / 4 )) |

Notice that you can use the $(( MATH )) syntax for doing math operations in POSIX shells like Bash and Zsh.

Warning

OMP_NUM_THREADS is set to 1 by default on the whole HPC cluster in order to not overload login nodes. If you want to take advantage of OpenMP in compute jobs, you must change its value to some greater value.
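For example, a sketch of a job on a shared partition that compiles and runs an OpenMP program with one thread per core could look like this (the partition and file names are placeholders):

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --partition=YOUR_SHARED_PARTITION

module load gcc
gcc -fopenmp -o my_openmp_code.bin my_openmp_code.c
export OMP_NUM_THREADS="$SLURM_CPUS_PER_TASK"
./my_openmp_code.bin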

Apache Spark

Introduction

Apache Spark is a distributed general-purpose cluster computing system.

Instead of the classic Map Reduce Pipeline, Spark’s central concept is a resilient distributed dataset (RDD) which is operated on with the help of a central driver program making use of the parallel operations and the scheduling and I/O facilities which Spark provides. Transformations on the RDD are executed by the worker nodes in the Spark cluster. The dataset is resilient because Spark automatically handles failures in the Worker nodes by redistributing the work to other nodes.

In the following sections, we give a short introduction on how to prepare a Spark cluster and run applications on it in the context of the GWDG HPC system.

Running a Spark Cluster in Slurm

Creating the cluster

Info

We assume that you have access to the HPC system already and are logged in to one of the frontend nodes. If that’s not the case, please check out our introductory documentation first.

Currently, Apache Spark versions 3.1.1 and 3.5.1 are available. The shell environment is prepared by loading the module spark:

[u12345@glogin11 ~]$ module load spark/3.5.1

In the old scc-lmod software stack, Apache Spark is installed in version 3.4.0. The shell environment is prepared by loading the module spark/3.4.0:

[jdoe@gwdu101 ~]$ export MODULEPATH=/opt/sw/modules/21.12/scc/common:$MODULEPATH
[jdoe@gwdu101 ~]$ module load spark/3.4.0

We’re now ready to deploy a Spark cluster. Since the resources of the HPC system are managed by Slurm, the entire setup has to be submitted as a job. This can be conveniently done by running the deploy script, which accepts the same arguments as the sbatch command used to submit generic batch jobs.

The default job configuration the script will use is:

# Defaults of scc_spark_deploy.sh (SCC):
#SBATCH --partition scc-cpu
#SBATCH --time=0-02:00:00
#SBATCH --qos=2h
#SBATCH --nodes=4
#SBATCH --job-name=Spark
#SBATCH --output=scc_spark_job-%j.out
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24

# Defaults of nhr_spark_deploy.sh (NHR):
#SBATCH --partition standard96s
#SBATCH --time=0-02:00:00
#SBATCH --qos=2h
#SBATCH --nodes=2
#SBATCH --job-name=Spark
#SBATCH --output=nhr_spark_job-%j.out
##SBATCH --error=nhr_spark_job-%j.err
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=96
#SBATCH --exclusive

If you would like to override these default values, you can do so by handing over the Slurm parameters to the script:

[u12345@glogin11 ~]$ scc_spark_deploy.sh --nodes=2 --time=02:00:00
Submitted batch job 872699

In particular, if you do not want to share the nodes’ resources, you need to add --exclusive.

[u12345@glogin12 ~]$ nhr_spark_deploy.sh --nodes=2 --time=02:00:00
Submitted batch job 3484640

In this case, the --nodes parameter has been set to specify a total amount of two worker nodes and --time is used to request a job runtime of two hours. If you would like to set a longer runtime, beside changing --time, remove the --qos=2h parameter or change it to another QOS.

Using the cluster

Once we have started deployment of our cluster, the job ID is reported back. We can use it to inspect if the job is running yet and if so, on which nodes:

[u12345@glogin11 ~]$ squeue --jobs=872699
             JOBID PARTITION     NAME     USER ST       TIME  NODES  NODELIST(REASON)
            872699   scc-cpu    Spark   u12345  R       1:59      2  c[0284-0285]

The first node reported in the NODELIST column is running the Spark master. Its hostname is used to form a URL like spark://host:port that Spark applications, such as spark-submit and spark-shell, need to connect to the master:

[u12345@glogin11 ~]$ spark-shell --master spark://c0284:7077

Here, the Spark shell is started on the frontend node glogin11 and connects to the master on node c0284 and the default port 7077.

Apache Spark Shell Setup

Scala code that is entered in this shell and parallelized with Spark will be automatically distributed across all nodes that have been requested initially. N.B.: The port that the application’s web interface is listening on (port 4040 by default) is also being reported in the startup message.

Stopping the cluster

Once the Spark cluster is not needed anymore, it can be shut down gracefully by using the provided shutdown script and specifying the job ID as an argument:

On the SCC:

[u12345@glogin11 ~]$ scc_spark_shutdown.sh 872699

and on the NHR system:

[u12345@glogin12 ~]$ nhr_spark_shutdown.sh 3484640

Running Spark without a cluster

In case a single node is sufficient, Spark applications can be started inside a Slurm job without setting up a cluster first - the --master parameter can simply be omitted in that case. If you want to quickly test your application on a frontend node or inside an interactive job, this approach is not advisable, since by default all available CPU cores are utilized, which would disturb the work of other users on the system. However, you can specify the URL local[CORES], where CORES is the number of cores that the Spark application should utilize, to limit your impact on the local system, for example:
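A minimal sketch (prompt and core count are illustrative):

[u12345@glogin11 ~]$ spark-shell --master local[4]

This starts a Spark shell that uses at most four cores on the current node; spark-submit accepts the same local[CORES] master URL.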

Access and Monitoring

Once your Spark cluster is running, information about the master and workers is printed to a file of the form $CLUSTER_spark_job-$JOBID.out in the working directory you deployed the cluster from. For example, in the case at hand, the MasterUI is listening on the master node on port 8082. This built-in web interface lets us check the master for connected workers, the resources they provide, as well as running applications and the resources they consume.

Apache Spark Master WebUI

An SSH tunnel allows us to open the web interface in our browser via http://localhost:8080, by forwarding the remote port 8082 from the compute node running Spark to our local machine’s port 8080. You can set this up by starting OpenSSH with the -L parameter:

ssh -N -L 8080:c0284:8082 -l u12345 glogin-p3.hpc.gwdg.de

If this doesn’t work, you might have to use a jump host instead:

ssh -N -L 8080:localhost:8082 -J glogin-p3.hpc.gwdg.de -l u12345 c0284

Example: Approximating Pi

To showcase the capabilities of the Spark cluster set up thus far, we enter a short Scala program into the shell we started before.

Apache Spark Example

The local dataset containing the integers from 1 to 1E9 is distributed across the executors using the parallelize function and filtered according to the rule that a random point (x,y) with 0 < x, y < 1, sampled from a uniform distribution, lies inside the unit circle. Consequently, the ratio of the points conforming to this rule to the total number of points approximates the area of one quarter of the unit circle and allows us to extract an estimate for the number Pi in the last line.
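The program shown in the screenshot corresponds to the classic Monte Carlo estimator. A minimal sketch that can be pasted into the spark-shell (reduce n for a quick test) could look like this:

val n = 1000000000                            // 1E9 samples, as described above
val count = sc.parallelize(1 to n).filter { _ =>
  val x = scala.util.Random.nextDouble()      // sample a point (x, y) uniformly from (0, 1)
  val y = scala.util.Random.nextDouble()
  x * x + y * y < 1                           // keep the point if it lies inside the unit circle
}.count()
println(s"Pi is roughly ${4.0 * count / n}")  // hits/total approximates Pi/4, so multiply by 4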

Configuration

By default, Spark’s scratch space is created in /tmp/$USER/spark. If you find that the 2G size of the partition where this directory is stored is insufficient, you can configure a different directory, for example in the scratch filesystem, for this purpose before deploying your cluster as follows:

export SPARK_LOCAL_DIRS=/scratch/users/$USER

Further reading

You can find a more in-depth tour on the Spark architecture, features and examples (based on Scala) in the HPC wiki.

Numeric Libraries

Many numeric libraries and packages are provided. Additional libraries can be installed via Spack.

Note

Our module system refrains from using LD_LIBRARY_PATH (using it is risky) in favor of LD_RUN_PATH. This means that when using the Intel compilers you have to pass the flags -Wl,-rpath -Wl,$LD_RUN_PATH, or the dynamic linker will not find the libraries at runtime. Without the flags, running the program will fail with errors such as this one:

error while loading shared libraries: libxxx.so.yyy: cannot open shared object file: No such file or directory
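As a sketch, a link command that stores the library search path in the binary could look like the following (program and library names are placeholders; add an appropriate -L flag if the loaded module does not already set the link-time search path):

ifx example.f90 -o example -lexample -Wl,-rpath -Wl,$LD_RUN_PATH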

BLAS / LAPACK

FFT

Subsections of Numeric Libraries

Intel MKL

The Intel® Math Kernel Library (Intel® MKL) provides optimized parallel math routines, particularly tuned for Intel CPUs. Among other things, the MKL provides BLAS, LAPACK, ScaLAPACK, sparse solvers, FFT routines (including an FFTW3 interface) and vector math functions.

The Intel MKL is provided by the module intel-oneapi-mkl, which can be loaded (after first loading the compiler and MPI modules you want to use) with:

module load intel-oneapi-mkl

Compiling And Linking

To link a library, the compiler and linker need at minimum

  • name of and path to the include files
  • name of and path to the library
Info

The FFTW3 library’s include files have the standard names, but they are located in a different place than the other MKL-related include files. The FFTW3 functions themselves are included in the MKL libraries, so searching for the typical FFTW3 library files by name gives a negative result. This may be of importance for cmake or configure scripts trying to explore the system to generate the appropriate compiler and linker options.

The module automatically sets the MKLROOT environment variable. Additional useful environment variables are:

export MKLPATH=$MKLROOT/lib
export MKLINCLUDE=$MKLROOT/include
export MKLFFTWINCLUDE=$MKLROOT/include/fftw

Intel MKL provides a convenient online tool for determining the compilation and linker flags to use: the [Intel oneAPI Math Kernel Library Link Line Advisor](https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor)

The following should link a Fortran program myprog.f calling FFTW3 routines:

ifx myprog.f -I$MKLINCLUDE -I$MKLFFTWINCLUDE -L$MKLPATH -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm

For a configure script, one has to set its environment variables to pass the compiler and linker flags to the configuration process. For example:

export CFLAGS=" -fPIC -O3 -I$MKLINCLUDE -I$MKLFFTWINCLUDE -Wl,-rpath=$LD_RUN_PATH"
export LIBS="-L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm"

Note,

-Wl,-rpath=$LD_RUN_PATH

ensures that the path to the MKL is stored in the binary. This way, no compiler module is needed at runtime and the originally specified library version is used.

Services

Subsections of Services

Arcana/RAG

The Arcana service works together with our Chat AI Service. Compatible models will use RAG (Retrieval-Augmented Generation) in the background to process files and use them as the basis and reference for answers, which often results in better and more factually correct responses. In order to use it, you need to activate your account for the Arcana page and set up an Arcana. Once this is done, you can share it with colleagues or the public. Details about this process can be found in the usage examples.

This is very useful for making large manuals or legal text accessible to an audience that is only interested in smaller aspects of the larger document. An example would be a list of study or exam regulations documents. These can be uploaded and made accessible to students using the Chat AI service. Students can ask questions and get responses that contain information from these documents.

To get support, please check out our Support page, where you will also find a short FAQ.

If all you need are some public Arcana links, please check out our list of published Arcanas.

Info

We need your help: if you have created an Arcana which you would like to promote to a larger audience, please reach out to us. We can list it here and communicate it to other users.

Why Use RAG?

The RAG service is designed for businesses and applications that require AI-generated responses to be accurate, explainable, and adaptable to real-world knowledge. Whether it is used for customer support, knowledge management, research, technical guidance, or expert systems, RAG ensures that AI remains intelligent, trustworthy, and useful in dynamic environments.

By integrating real-time data retrieval with AI-powered language generation, RAG transforms AI from a static knowledge tool into a dynamic, continuously learning system, ensuring that responses are always up to date, relevant, and reliable.

Key Terms

Service Overview

This service has two sides. The first is the Chat AI service, where you access a specific Arcana via its ID. You can share this ID or the access link with others so they can use RAG to query the same documents directly.

The second side is the Arcana manager, where the RAG service can be set up. There you can create Arcanas and prepare documents for use with the Arcana feature in Chat AI. Additionally, the indexed material can be fine-tuned for specific retrievals if the existing indexing is not good or accurate enough.

It is possible to upload PDF, Text, and Markdown files to be used as base material.

Usage examples

We have written a getting started guide as well as a usage guide. Please refer to these sections directly if you need information about the interface.

The getting started guide details the Arcana manager interface. This includes registering for the service with your existing Academiccloud account as well as setting up an Arcana. It also contains some information about how to generate the ID and tokens as well as how to use the access link.

The usage guide focuses on the Chat AI interface and how to access Arcanas there. It also gives some general guidance on how to do prompt engineering with Arcanas.

Warning

You need to use the Meta Llama 3.1 8B RAG model for Arcana to work. It is at the bottom of the model list.

Subsections of Arcana/RAG

AI Transparency Statement

Goal

Typical LLMs have the advantage, but also the problem, of having been trained on an incredible amount of data. This means that they know a lot, but are often unable to answer very specific questions. In these cases, LLMs are very prone to hallucination, which means that they basically make things up. One way to improve the performance of LLMs for very specific questions is to use Retrieval-Augmented-Generation (RAG). Here, users provide custom documents that contain the knowledge base they want to ask questions about later. Before an LLM responds to a user’s query, the most relevant documents previously provided by the user are retrieved and provided to the LLM as additional context.

In our innovative approach, we provide a reference section at the bottom of the answer where you can find the actual part of the document our RAG associated with your question. As these references are quoted directly from the documents provided by the user, there is no possibility of hallucination.

General Functionality

This section briefly outlines the individual steps required to process a RAG request.

Ingesting Documents

In order to use RAG, a user must first provide a knowledge base, i.e. a set of documents. These documents are uploaded, converted to Markdown and then indexed in a database. During indexing they are chunked, and each chunk is transformed into a vector by a special embedding model. These vectors are then stored in a vector database. In this way, each document provided by a user must be processed to build the knowledge base. This knowledge base, also known as “Arcana”, is stored at the GWDG until the user explicitly deletes it!

Submitting Requests

Once the general knowledge base has been built, a user can submit queries. Before these queries are sent to the LLM, they are also transformed into a vector representation using the same embedding model that was used to create the knowledge base in the previous step. This vector is then used in a similarity search on the previously created vector database to look for parts of the previously ingested documents that have a similar meaning. A configurable number of similar vectors are then returned and passed to the LLM along with the user’s original request.

Generating an Answer

The LLM uses the additional information to provide a more specific response to the user’s request. This is already much less susceptible to hallucinations, but they are still possible. In our approach, we provide explicit references to the documents containing the chunks used by the LLM to formulate the answer. These references are provided in a special reference box at the bottom, which contains the actual citations from the original documents provided by the user. Therefore, no hallucination is possible on these references.

Further Considerations

Access to the Stored Documents

The ingested knowledge base can be freely shared with other users. To do this, the Arcana ID must be provided. IMPORTANT: This gives users full access to your documents and cannot (yet) be revoked individually!

Storage of your Documents

Your indexed data will remain on GWDG systems at all times. We will not share your documents with third parties.

Processing of your Requests

The processing of your requests, including generating the embeddings, retrieving the relevant documents and performing the inference, is all done on GWDG hardware. Neither your requests nor your indexed documents are shared with third parties.

Datenschutzhinweis

Verantwortlich für die Datenverarbeitung

Der Verantwortliche für die Datenverarbeitung im Sinne des Art. 4 Nr. 7 DSGVO und anderer nationaler Datenschutzgesetze der Mitgliedsstaaten sowie sonstiger datenschutzrechtlicher Bestimmungen ist die:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de
Website: www.gwdg.de

Vertreten durch den Geschäftsführer. Verantwortliche Stelle ist die natürliche oder juristische Person, die allein oder gemeinsam mit anderen über die Zwecke und Mittel der Verarbeitung von personenbezogenen Daten entscheidet.

Ansprechpartner / Datenschutzbeauftragter

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de

Allgemeines zur Datenverarbeitung

Geltungsbereich im Falle individueller Vereinbarungen

Im Falle eines Konflikts zwischen diesen Datenschutzbestimmungen und den Bedingungen einer bzw. mehrerer Vereinbarung(en), z.B. einem mit der GWDG geschlossenen Auftragsverarbeitungsvertrag, sind stets die Bedingungen dieser Vereinbarung(en) ausschlaggebend. Kardinalpflichten genießen stets Vorrang vor diesen allgemeinen Bestimmungen.

Sie können im Zweifelsfall bei Ihrem Institut in Erfahrung bringen, welche Datenschutzrichtlinien für Sie gelten.

Übersicht über den Service

Der Service, einschließlich der einzelnen Verarbeitungsschritte, ist detailliert beschrieben in unserem Transparency Statement.

Umfang der Verarbeitung personenbezogener Daten

Wir verarbeiten personenbezogene Daten unserer Nutzer grundsätzlich nur soweit dies zur Bereitstellung einer funktionsfähigen Website sowie unserer Inhalte und Leistungen erforderlich ist. Die Verarbeitung personenbezogener Daten unserer Nutzer erfolgt regelmäßig nur nach Einwilligung des Nutzers (Art. 6 Abs. 1 lit. a DSGVO). Eine Ausnahme gilt in solchen Fällen, in denen eine vorherige Einholung einer Einwilligung aus tatsächlichen Gründen nicht möglich ist und die Verarbeitung der Daten durch gesetzliche Vorschriften gestattet ist.

Rechtsgrundlage für die Verarbeitung personenbezogener Daten

Soweit wir für Verarbeitungsvorgänge personenbezogener Daten eine Einwilligung der betroffenen Person einholen, dient Art. 6 Abs. 1 lit. a EU-Datenschutzgrundverordnung (DSGVO) als Rechtsgrundlage. Bei der Verarbeitung von personenbezogenen Daten, die zur Erfüllung eines Vertrages, dessen Vertragspartei die betroffene Person ist, erforderlich ist, dient Art. 6 Abs. 1 lit. b DSGVO als Rechtsgrundlage. Dies gilt auch für Verarbeitungsvorgänge, die zur Durchführung vorvertraglicher Maßnahmen erforderlich sind. Soweit eine Verarbeitung personenbezogener Daten zur Erfüllung einer rechtlichen Verpflichtung erforderlich ist, der unser Unternehmen unterliegt, dient Art. 6 Abs. 1 lit. c DSGVO als Rechtsgrundlage. Für den Fall, dass lebenswichtige Interessen der betroffenen Person oder einer anderen natürlichen Person eine Verarbeitung personenbezogener Daten erforderlich machen, dient Art. 6 Abs. 1 lit. d DSGVO als Rechtsgrundlage. Ist die Verarbeitung zur Wahrung eines berechtigten Interesses unseres Unternehmens oder eines Dritten erforderlich und überwiegen die Interessen, Grundrechte und Grundfreiheiten des Betroffenen das erstgenannte Interesse nicht, so dient Art. 6 Abs. 1 lit. f DSGVO als Rechtsgrundlage für die Verarbeitung.

Nutzung der Chat-AI/RAG Webseite (Frontend)

Beschreibung und Umfang der Datenverarbeitung

Bei jedem Aufruf von https://chat-ai.academiccloud.de/ erfasst das System automatisiert Daten und Informationen vom Computersystem des aufrufenden Rechners. Folgende Daten werden hierbei in jedem Fall erhoben:

  • Datum des Zugriffs
  • Name des auf dem zugreifenden Gerät installierten Betriebssystems
  • Name des verwendeten Browsers
  • Quellsystem, über welches der Zugriff erfolgt ist
  • Die IP-Adresse des zugreifenden Geräts

Die Daten werden ebenfalls in den Logfiles unseres Systems gespeichert. Eine Speicherung dieser Daten zusammen mit anderen personenbezogenen Daten des Nutzers findet nicht statt.

Info

Sämtliche im Browser angezeigten Daten von Chat-AI werden nur Clientseitig im Browser der Nutzenden vorgehalten und nur bei der Benutzer-gewünschten Anfrage für die notwendige Verarbeitung an die Server übermittelt, d.h. während die Daten von den Backend-Modellen verarbeitet werden. Nach dem Ende einer Session im Browser sind keine Eingaben des Nutzers mehr vorhanden.

Falls https://chat-ai.academiccloud.de/arcanas/arcana genutzt wird, um neue Dokumente hochzuladen und zu indizieren, werden zusätzlich diese explizit zur Verfügung gestellten Daten verarbeitet. Diese hochgeladenen Dokumente bleiben bis zur expliziten Löschung des Arcanas auf unseren Systemen gespeichert. Der Vorgang des hochladens neuer Dokumente wird ebenfalls in den Logfiles unserer Systems gespeichert.

Allgemeine Nutzung von Modellen

Beschreibung und Umfang der Datenverarbeitung

Zu Abrechnungszwecken werden bei jeder Anfrage auf dem Server der GWDG die folgenden Daten abgespeichert und gelogged:

  • Datum der Anfrage
  • NutzerID
  • Länge der Anfrage und Antwort

Diese Daten werden außerdem in den Logfiles unseres Systems gespeichert. Diese Daten werden nicht zusammen mit anderen personenbezogenen Daten des Nutzenden gespeichert. Je nachdem ob lokal gehostete Modelle oder externe Modelle genutzt werden, gelten leicht unterschiedliche Datenschutzbestimmungen. Für die automatisiert generierten Antworten kann keinerlei Haftung übernommen werden. Antworten können gänzlich falsch sein, falsche Teilinformationen beinhalten oder können unrechtmäßigen Inhalt besitzen.

Dauer der Speicherung

Die Abrechnungsdaten werden ein Jahr gespeichert. Die Arcanas werden bis zur expliziten Löschung auf unseren Systemen gespeichert. Auch nach der expliziten Löschung können bis zur vollständigen Löschung noch Backups der Arcanas auf anderen Systemen vorhanden sein.

Nutzung von selbstgehosteten Modellen

Beschreibung und Umfang der Datenverarbeitung

Um die bei der GWDG gehosteten Modelle zu nutzen, werden die Eingaben/Anfragen von den Nutzenden auf den Systemen der GWDG verarbeitet. Der Schutz der Privatsphäre von Nutzeranfragen ist für uns von grundlegender Bedeutung. Aus diesem Grund speichert unser Dienst in Kombination mit den selbstgehosteten Modellen weder die Inhalte ihrer Anfragen (Chatverlauf) noch werden Aufforderungen oder Antworten zu irgendeinem Zeitpunkt auf einem dauerhaften Speicher abgelegt.

Dauer der Speicherung

Die Eingaben werden auf dem Server der GWDG nur während des Verarbeitung durch die Large Language Modelle selbst vorgehalten, d.h. während die Daten auf den eigenem Systemen verarbeitet werden.

Nutzung des Server-Sided RAG Systems

Beschreibung und Umfang der Datenverarbeitung

Grundsätzlich wird zwischen der Rolle der Entwickler:In und der Nutzer:In unterschieden. Die Entwickler:In stellt Kontextdaten bereit, die genutzt werden um serverseitig einen Index zu bauen. Dieser Index is persistent und kann über mehrere Sessions und von mehreren Nutzer:Innen genutzt werden. Dieser Index wird genutzt, um Nutzer:Innen Zugriff auf ein Large Language Model zu geben, welches spezifisches Wissen aus den bereitgestellten Kontextdaten nutzen kann um die individuellen Anfragen der Nutzenden zu beantworten. Der Entwickler:In lädt dazu die Daten via des RAGManagers hoch, wo diese in verschiedenen Datensätzen, Arcanas genannt, vorgehalten und indiziert werden. Der Zugriff auf ein Arcana ist mithilfe eines Passworts gesichert. Ein Arcana kann mit beliebig vielen Personen geteilt werden. Dabei muss jede:r Nutzer:In den Namen des Arcanas, bzw. dessen ID und das zugehörige Passwort kennen. Dabei ist wichtig klarzustellen, dass jede Person die Einblick in das Passwort nehmen konnte sich Zugriff das Arcana und somit auf das darin enthaltene Wissen verschaffen kann. Ein Arcana kann nur in Kombination mit einem bei der GWDG-gehosteten Open Source Modell genutzt werden. In den externen Modellen, die als solche explizit im Namen kenntlich gemacht sind, steht dieses Feature nicht zur Verfügung.

Beschreibung und Umfang der Datenverarbeitung

Die von den Entwickler:Innen bereitgestellten Kontextdaten werden serverseitig zu einem Arcana indiziert und mit einem Passwort gesichert. Das Arcana wird dann, falls eine ID und das korrekte Passwort von den Nutzenden bereitgestellt wurden, in den Kontext der Open-Source-Modelle in Chat AI exportiert.

Dauer der Speicherung

Die von den Entwickler:Innen bereitgestellte Kontextdaten werden dauerhaft bis zur expliziten Löschung durch die Entwickler:Innen gespeichert. Die Anfragen und Antworten für die Nutzenden werden weiterhin nur lokal auf den Clientsystemen der Nutzenden gespeichert, wie in der Sektion “Nutzung von selbstgehosteten Modellen” beschrieben. Die Anfrage ist ausschließlich während der Bearbeitung eben dieser auf den Servern der GWDG vorhanden.

Rechte der betroffenen Personen

Ihnen stehen verschiedene Rechte in Bezug auf die Verarbeitung ihrer personenbezogenen Daten zu. Nachfolgend sind diese aufgeführt, zusätzlich sind Verweise auf die Artikel (DSGVO) bzw. Paragraphen (BDSG (2018)) mit detaillierteren Informationen angegeben.

Auskunftsrecht (DSGVO Art. 15, BDSG §34)

Sie können von dem Verantwortlichen eine Bestätigung darüber verlangen, ob personenbezogene Daten, die Sie betreffen, von uns verarbeitet werden. Dies schließt das Recht ein, Auskunft darüber zu verlangen, ob die Sie betreffenden personenbezogenen Daten in ein Drittland oder an eine internationale Organisation übermittelt werden.

Recht auf Berichtigung (DSGVO Art. 16)

Sie haben ein Recht auf Berichtigung und/oder Vervollständigung gegenüber dem Verantwortlichen, sofern die verarbeiteten personenbezogenen Daten, die Sie betreffen, unrichtig oder unvollständig sind. Der Verantwortliche hat die Berichtigung unverzüglich vorzunehmen.

Recht auf Löschung / „Recht auf Vergessen werden“ / Recht auf Einschränkung der Verarbeitung (DSGVO Art. 17, 18, BDSG §35)

Sie haben das Recht, die unverzügliche Löschung ihrer personenbezogenen Daten vom Verantwortlichen zu verlangen. Alternativ können Sie die Einschränkung der Verarbeitung vom Verantwortlichen verlangen. Einschränkungen sind in der DSGVO und dem BDSG unter den genannten Artikeln bzw. Paragraphen genannt.

Recht auf Unterrichtung (DSGVO Art. 19)

Haben Sie das Recht auf Berichtigung, Löschung oder Einschränkung der Verarbeitung gegenüber dem Verantwortlichen geltend gemacht, ist dieser verpflichtet, allen Empfängern, denen die Sie betreffenden personenbezogenen Daten offengelegt wurden, diese Berichtigung oder Löschung der Daten oder Einschränkung der Verarbeitung mitzuteilen, es sei denn, dies erweist sich als unmöglich oder ist mit einem unverhältnismäßigen Aufwand verbunden. Ihnen steht gegenüber dem Verantwortlichen das Recht zu, über diese Empfänger unterrichtet zu werden.

Recht auf Datenübertragbarkeit (DSGVO Art. 20)

Sie haben das Recht, die Sie betreffenden personenbezogenen Daten, die Sie dem Verantwortlichen bereitgestellt haben, in einem strukturierten, gängigen und maschinenlesbaren Format zu erhalten. Ergänzend zur DSGVO ist festzustellen, dass sich die Datenübertragbarkeit bei Massendaten / Nutzerdaten ausschließlich auf die technische Lesbarkeit beschränkt. Das Recht auf Datenübertragbarkeit umfasst nicht, dass die vom Nutzer in einem proprietären Format erstellen Daten vom Verantwortlichen in ein “gängiges”, d.h. standardisiertes Format konvertiert werden.

Widerspruchsrecht (DSGVO Art. 21, BDSG §36)

Sie haben das Recht, Widerspruch gegen die Verarbeitung einzulegen, wenn diese ausschließlich auf Basis einer Abwägung des Verantwortlichen geschieht (vgl. DSGVO Art. 6 Abs. 1 lit f).

Recht auf Widerruf der datenschutzrechtlichen Einwilligungserklärung (DSGVO Art. 7 Abs. 3)

Sie haben das Recht, Ihre datenschutzrechtliche Einwilligungserklärung jederzeit zu widerrufen. Durch den Widerruf der Einwilligung wird die Rechtmäßigkeit der aufgrund der Einwilligung bis zum Widerruf erfolgten Verarbeitung nicht berührt.

Recht auf Beschwerde bei einer Aufsichtsbehörde (DSGVO Art. 77)

Unbeschadet eines anderweitigen verwaltungsrechtlichen oder gerichtlichen Rechtsbehelfs steht Ihnen das Recht auf Beschwerde bei einer Aufsichtsbehörde, insbesondere in dem Mitgliedstaat ihres Aufenthaltsorts, ihres Arbeitsplatzes oder des Orts des mutmaßlichen Verstoßes, zu, wenn Sie der Ansicht sind, dass die Verarbeitung der Sie betreffenden personenbezogenen Daten gegen die DSGVO verstößt.

Terms of use

§1 Allgemeine Bestimmungen

Es gelten die [AGB der Academic Cloud](https://academiccloud.de/terms-of-use/).

§2 Registrierung und Zugriff

Der Zugriff auf diesen Dienst erfordert eine AcademicCloud-ID. Die Nutzung einer AcademicCloud-ID unterliegt der Annahme der Nutzungsbedingungen der “Academic Cloud”.

§3 Autorisierte Nutzung

Benutzer sind verpflichtet, die Technologie oder Dienste ausschließlich für autorisierte und rechtmäßige Zwecke zu nutzen, indem sie alle geltenden Gesetze, Vorschriften und Rechte anderer einhalten, einschließlich nationaler, bundesstaatlicher, staatlicher, lokaler und internationaler Gesetze.

§4 Entwicklung

Sie erkennen an, dass wir ähnliche Software, Technologie oder Informationen von anderen Quellen entwickeln oder beschaffen können. Diese Anerkennung begründet keine Einschränkungen unserer Entwicklungs- oder Wettbewerbsbemühungen.

§5 Updates

Die GWDG wird durch regelmäßige Updates, die innerhalb und außerhalb des dedizierten Wartungsfensters der GWDG innerhalb vertretbarer Fristen stattfinden können, den Service aktuell halten.

§6 Verbote

(1) Benutzern ist es untersagt, diesen Dienst zur Übertragung, Erzeugung und Verbreitung von Inhalten (Eingabe und Ausgabe) zu verwenden, die:

  • Kinderpornographie oder sexuellen Missbrauch darstellen, auch die Fälschung, Täuschung oder Nachahmung desselben;
  • sexuell explizit sind und für nicht-bildende oder nicht-wissenschaftliche Zwecke eingesetzt werden;
  • diskriminierend sind, Gewalt, Hassreden oder illegale Aktivitäten fördern;
  • Datenschutzgesetze verletzen, einschließlich der Sammlung oder Verbreitung personenbezogener Daten ohne Zustimmung;
  • betrügerisch, irreführend, schädlich oder täuschend sind;
  • Selbstverletzung, Belästigung, Mobbing, Gewalt und Terrorismus fördern;
  • illegale Aktivitäten fördern oder geistige Eigentumsrechte und andere rechtliche und ethische Grenzen im Online-Verhalten verletzen;
  • versuchen, unsere Sicherheitsmaßnahmen zu umgehen oder Handlungen zu veranlassen, die etablierte Richtlinien vorsätzlich verletzen;
  • Einzelpersonen, insbesondere in Bezug auf sensible oder geschützte Merkmale, ungerechtfertigt oder nachteilig beeinflussen könnten;

(2) Benutzern ist es untersagt, folgende Aktivitäten durchzuführen:

  • Reverse-Engineering, Dekompilierung oder Disassemblierung der Technologie;
  • Unautorisierte Aktivitäten wie Spamming, Malware-Verbreitung oder störende Verhaltensweisen, die die Dienstqualität beeinträchtigen;
  • Modifizierung, Kopie, Vermietung, Verkauf oder Verbreitung unseres Dienstes;
  • Nachverfolgung oder Überwachung von Einzelpersonen ohne deren ausdrückliche Zustimmung.

(3) Geschützte Daten oder vertrauliche Information zu verarbeiten, insofern nicht die rechtliche Rahmenbedingungen erfüllt sind. Der Dienst ist zwar technisch dafür konzipiert sensible Daten verarbeiten zu können, d.h. Daten die u.a.

  • vertrauliche oder sensible Informationen enthalten;
  • sensible oder kontrollierte Daten beinhalten, etwa besonders geschützte Daten wie in Artikel 9(1) DSGVO gelistet;
  • oder Forschung mit menschlichen Probanden;

Allerdings muss der notwendige Rechtsrahmen bestehen oder geschaffen werden, um die rechtmäßige Verarbeitung sicherzustellen. Dies kann beispielsweise die Schließung eines Auftragsverarbeitungsvertrags gemäß Art. 28 DSGVO erfordern.

(4) Benutzern ist es untersagt, Daten in ein Arcana hochzuladen, die:

  • urheberrechtlich geschützt sind und die Benutzer nicht das Recht haben diese Inhalte mit den Nutzenden ihres Arcanas zu teilen;
  • Informationen enthalten, die verboten oder verfassungswidrig sind, z.B. aber nicht ausschließlich gemäß §86 StGB;
  • deren primärer Zweck es ist Straftaten zu ermöglichen oder zu begünstigen;
  • Informationen enthalten, die unter (1) ausgeschlossen wurden;

(5) Wenn Sie diesen Dienst im Auftrag einer Organisation und nicht als Privatperson nutzen, so sollte ein Auftragsverarbeitungsvertrag zwischen ihrer Organisation und der GWDG geschlossen werden. Sollten Unsicherheiten bzgl. Sicherheit oder Datenschutz des Dienstes vorhanden sein, so bitten wir um Kontakt des Datenschutzbeauftragen bei dem Mailpostfach support@gwdg.de mit dem Titel “Datenschutz ChatAI”.

(6) Für Forschungszwecke könnten in bestimmten Fällen die Nutzungsszenarien in (1) oder (4) gestattet sein. Hierbei müssen schriftliche Absprachen zwischen den Nutzern und der GWDG für den Einsatzzweck getroffen werden.

(7) Insbesondere bei der Nutzung der serverseitigen RAG Systems dürfen in die Arcanas keine Kontextdaten hochgeladen werden, die unter (1), (3) und (4) aufgeführt sind. Falls ein gültiger Rechtsrahmen besteht, der eine solche Nutzung zulässt, so haben individuelle Vertragsabreden vorrang.

§7 Beendigung und Aussetzung

(1) Sie können die Nutzung des Services “Chat AI” und Ihre rechtlichen Beziehungen zu uns jederzeit beenden, indem Sie den Dienst nicht mehr nutzen. Wenn Sie ein Verbraucher in der EU sind, haben Sie das Recht, diese Bedingungen innerhalb von 14 Tagen nach Annahme durch Kontaktaufnahme mit dem Support zu widerrufen.

(2) Wir behalten uns das Recht vor, Ihren Zugriff auf den Dienst “Chat AI” auszusetzen oder zu beenden oder Ihr Konto zu deaktivieren, wenn Sie gegen diese Nutzungsrichtlinien verstoßen, wenn dies zur Einhaltung gesetzlicher Vorschriften erforderlich ist oder wenn Sie durch die Nutzung unseres Dienstes ein Risiko oder Schaden für uns, unsere Nutzer oder Dritte darstellen.

(3) Wir werden Ihnen vor der Deaktivierung Ihres Kontos eine Benachrichtigung senden, es sei denn, dies ist nicht möglich oder gesetzlich erlaubt. Wenn Sie glauben, dass Ihr Konto irrtümlich gesperrt oder deaktiviert wurde, können Sie sich an den Support wenden, um dies anzufechten.

(4) Wir behalten uns das Recht vor, rechtliche Schritte zu ergreifen, um unsere geistigen Eigentumsrechte und die Sicherheit unserer Nutzer zu schützen. Bei Verletzung dieser Bedingungen oder bei der Ausübung illegaler Aktivitäten durch die Nutzung unseres Dienstes können zivilrechtliche Sanktionen, Schadensersatz, Verwaltungsstrafen, Strafverfolgung oder andere rechtliche Optionen verfolgt werden.

§8 Korrektheit der Ergebnisse

Die von unseren Diensten generierten Ausgaben sind nicht immer einzigartig, korrekt oder präzise. Sie können Ungenauigkeiten enthalten, selbst wenn sie detailliert erscheinen. Benutzer sollten sich nicht ausschließlich auf diese Ergebnisse verlassen, ohne ihre Genauigkeit unabhängig zu überprüfen. Darüber hinaus kann die von unseren Diensten bereitgestellte Information unvollständig oder veraltet sein, und einige Ergebnisse können nicht mit unseren Perspektiven übereinstimmen. Daher sollten Benutzer Vorsicht walten lassen und die Dienste nicht für wichtige Entscheidungen verwenden, insbesondere in Bereichen wie Medizin, Recht, Finanzen und anderen professionellen Bereichen, in denen Fachwissen unerlässlich ist. Es ist wichtig zu verstehen, dass KI und maschinelles Lernen ständig weiterentwickelt werden, und obwohl wir uns bemühen, die Genauigkeit und Zuverlässigkeit unserer Dienste zu verbessern, sollten Benutzer immer die Genauigkeit der Ergebnisse bewerten und sicherstellen, dass sie ihren spezifischen Anforderungen entsprechen, indem sie sie vor der Verwendung oder Verbreitung manuell überprüfen. Darüber hinaus sollten Benutzer davon absehen, Ergebnisse im Zusammenhang mit Einzelpersonen für Zwecke zu verwenden, die erhebliche Auswirkungen auf diese haben könnten, wie z.B. rechtliche oder finanzielle Entscheidungen. Schließlich sollten Benutzer sich bewusst sein, dass unvollständige, ungenaue oder anstößige Ergebnisse auftreten können, die nicht die Ansichten der GWDG oder ihrer verbundenen Parteien widerspiegeln.

§9 Haftungsbeschränkung

(1) Allgemeine Haftungsbeschränkung: Die GWDG übernimmt keinerlei Haftung für Schadensersatzansprüche von Nutzer:Innen basierend auf der Inanspruchnahme des Services “Chat AI” und “RAG/Arcana”. Die hier dargelegte und in den weiteren Abschnitten genauer erklärte Haftungsbeschränkung leitet sich daraus ab, dass die GWDG ausschließlich eine Platform zur Nutzung von Sprachmodellen sowie zum Erstellen und Nutzen von indizierten Dokumenten (Arcanas) bereitstellt. Die GWDG kann auf dieser Platform keinerlei technische Maßnahmen bereit stellen, um in die durch die Sprachmodelle generierten Antworten dergestalt einzugreifen, dass ein Schaden bei unsachgemäßer Nutzung durch die Nutzenden ausgeschlossen werden kann. Daher verbleibt die vollständige Haftung für Schadensersatzansprüche durch unsachgemäße Nutzung dieser Platform bei den Nutzenden. Ebenso übernimmt die GWDG keinerlei Haftung für Schadensersatzansprüche aufgrund von unrechtmäßig hochgeladenen Dokumenten in ein Arcana, siehe §6, oder durch deren Verbreitung aufgrund einer unsachgemäßen Nutzung bzw. Verbreitung der ArcanaID’s und assozierten Zugangsschlüsseln. Hiervon ausgenommen sind Ansprüche basierend auf der Verletzung von Leben, Körper, Gesundheit, durch grobem Verschulden oder durch vorsätzliche oder grob fahrlässige Pflichtverletzung. Ebenfalls ist die Verletzung von Kardinalpflichten vom grundsätzlichem Haftungsausschluss ausgeschlossen.

(2) Urheberrecht: Die Nutzer:Innen des Service “Chat AI” und “RAG/Arcana” haben die vollständige und alleinige Verantwortung die geltenden Bestimmmungen des Urheberrechts zu beachten und einzuhalten. Die GWDG weist die Nutzer:Innen explizit darauf hin, dass die bereitgestellten Sprachmodelle von Dritten trainiert wurden und der GWDG kein Erklärung vorliegt, die die verwendeten Materialien auf freie Lizenzen einschränkt. Es kann folglich von der GWDG nicht ausgeschlossen werden, dass die bereitgestellten Sprachmodelle mithilfe urheberrechtlich geschützter Inhalte trainiert wurden. Antworten, die die Sprachmodelle den Nutzer:Innen geben, können folglich urheberechtlich geschützte Inhalte beinhalten. Die GWDG weißt explizit die Nutzenden darauf hin, dass ein direktes Weiterverwenden von erhaltenen Antworten nicht empfohlen ist. Die Prüfung des Urheberrechts für solche Fälle liegt alleine bei den Nutzer:Innen. Die GWDG übernimmt keinerlei Hauftung für etwaige Schadensersatzansprüche aus Urheberrechtsverletzungen. Antworten eines LLM’s welches zusätzlichen Kontext aus dem RAG System zur Verfügung bestellt bekommen hat können urheberrechtlichgeschützte Inhalte haben, wenn diese in den urspünglichen Dokumenten enthalten sind. Die Prüfung liegt hierbei allein in der Verantwortung der Nutzer:Innen. Die Verantwortung keine Dokumente hochzuladen und das Arcana mit Nutzr:Innen zu teilen, die keinen Zugriff auf diesen Inahlt haben dürfen liegt alleine bei den Nutzer:Innen der Arcanas. Die GWDG übernimmt keinrlei Haftung für Schäden die durch eine Weitergabe dieser Informationen entsteht.

(3) Vertrauliche Informationen: Wir können keine Haftung für Verlust/Veröffentlichung von Daten übernehmen, die die Nutzenden in ihren Anfragen bereitstellen. Ebenso können wir keine Haftung für Verlust/Veröffentlichung von hochgeladenen und in einem Arcana indizierten Dokumenten übernehmen. Ausgenommen hiervon sind grob fahrlässiges Handeln sowie die Verletzung von Kardinalpflichten.

(4) Patentrecht: Die Nutzer:Innen des Service “Chat AI” und “RAG/Arcana” haben die vollständige und alleinige Verantwortung die geltenden Bestimmmungen des Patentrechts zu beachten und einzuhalten. Antworten der bereitgestellten Sprachmodelle können in ihrer konzeptionellen Idee patentgeschützte Inhalte beinhalten. Die GWDG weißt explizit die Nutzenden darauf hin, dass ein direktes Weiterverwenden von Ideen und Konzepten die in den erhaltenen Antworten vermittelt werden, nicht empfohlen ist. Die Verantowrtung zur Prüfung des Patentschutzes für solche Fälle liegt alleine bei den Nutzer:Innen. Die GWDG übernimmt keinerlei Hauftung für etwaige Schadensersatzansprüche aus Patentverletzungen. Dies kann insbesondere dann passieren, wenn z.B. dem LLM via des RAG Systems Informationen einer Patentdatenbank bereitgestellt werden.

(5) Fehlinformationen: Die GWDG weist die Nutzer:Innen des Service “Chat AI” und “RAG/Arcana” darauf hin, dass es eine intrinsische Eigenschaft der bereitgestellten Sprachmodelle ist, sich Inhalte frei auszudenken - dies firmiert bei Sprachmodellen unter dem Stichwort “Halluzination”. Die Informationen, die in den Antworten enthalten sind können veraltet, frei erfunden, unpassend, aus dem Kontext genommen, oder falsch sein. Dies stellt keine Fehlfunktionen der bereitgestellten Platform dar, da dies technisch von dem Werkzeug der Sprachmodelle zu erwarten ist. Die unabhängige und kritische Überprüfung der erhaltenen Informationen obliegt einzig und alleine der Nutzer:Innen. Die GWDG übernimmt keinerlei Haftung für die Informationen, die in den Antworten der Sprachmodelle steckt. Weitere Information hierzu in “§8 Genauigkeit”. Die GWDG kann keine Haftung für die Richtigkeit und Aktualität der in einem Arcana bereitgestellten Dokumente übernehmen. Die Nutzer:Innen der Arcanas sind in der alleinigen Verantwortung dies sicherzustellen. Die Nutzer:Innen der LLM’s, die einen Arcana Kontext bereitgestellt bekommen, sind verpflichtet ebenfalls alle Informationen kritisch zu prüfen.

(6) Bei der Nutzung des serverseitigen RAG Systems tragen die Nutzer:Innen die alleinige Verantwortung für die Rechtmäßigkeit der Dokumente, die sie in die Arcanas hochladen, abspeichern und indizieren. Die GWDG übernimmt keinerlei Haftung für die Dokumente, die von den Nutzer:Innen in das RAG System hochgeladen wurden.

§10 Dienstleistungen von Drittanbietern

Unsere Dienstleistungen können die Integration von Software, Produkten oder Dienstleistungen von Drittanbietern umfassen, die als “Dienstleistungen von Drittanbietern” bezeichnet werden. Diese können Ausgaben liefern, die von diesen Dienstleistungen stammen, die als “Ausgaben von Drittanbietern” bekannt sind. Es ist wichtig zu verstehen, dass Dienstleistungen von Drittanbietern unabhängig arbeiten und von ihren eigenen Nutzungsbedingungen und -bestimmungen geregelt werden, die von unseren getrennt sind. Daher sollten Nutzer sich bewusst sein, dass wir nicht für Dienstleistungen von Drittanbietern oder deren zugehörige Nutzungsbedingungen und -bestimmungen verantwortlich sind. Wir kontrollieren diese Dienstleistungen nicht und sind daher nicht haftbar für Verluste oder Schäden, die durch deren Nutzung entstehen können. Nutzer entscheiden sich selbst dafür, mit Dienstleistungen von Drittanbietern zu interagieren, und übernehmen die volle Verantwortung für alle Ergebnisse, die sich daraus ergeben können. Darüber hinaus bieten wir keine Zusicherungen oder Garantien hinsichtlich der Leistung oder Zuverlässigkeit von Dienstleistungen von Drittanbietern.

§11 Feedback

Wir schätzen Ihr Feedback zu unseren Dienstleistungen und Produkten und ermutigen Sie, Ihre Gedanken zu teilen, um uns bei der Verbesserung zu helfen. Durch die Bereitstellung von Feedback verstehen Sie, dass wir es offenlegen, veröffentlichen, ausnutzen oder verwenden können, um unsere Angebote zu verbessern, ohne dass wir Ihnen eine Entschädigung schulden. Wir behalten uns das Recht vor, Feedback für jeden Zweck ohne Einschränkung durch Vertraulichkeitspflichten zu verwenden, unabhängig davon, ob es als vertraulich gekennzeichnet ist oder nicht.

§12 Datenschutz

Die Privatsphäre der Nutzeranfragen ist für uns von grundlegender Bedeutung. Weitere Informationen finden Sie in der [Datenschutzerklärung](https://datenschutz.gwdg.de/services/chatai).

§13 Schlussbestimmungen

Die Allgemeinen Geschäftsbedingungen bleiben auch bei rechtlicher Unwirksamkeit einzelner Punkte in ihren übrigen Teilen verbindlich und wirksam. Anstelle der unwirksamen Punkte treten, soweit vorhanden, die gesetzlichen Vorschriften. Soweit dies für eine Vertragspartei eine unzumutbare Härte darstellen würde, wird der Vertrag jedoch im Ganzen unwirksam.

Migration Guide

Warning

This service is currently in beta phase and is updated regularly. The same applies to the documentation.

With the introduction of our new RAG Manager, we have improved the UI and indexing process. Due to architectural changes, the new interface is only partially backward compatible, particularly in terms of file storage and indexing.

This guide explains:

  • What still works between the old and new interface
  • What does not work
  • How to migrate your existing Arcanas from the old manager to the new System
Details

The old version is still accessible during the transition period and will be removed within the next months. You can continue to use it here

What Does Work

  • All existing Arcanas created by the old manager continue to work
  • You can chat with a legacy Arcana via the chat interface.
  • You can still modify a legacy Arcana in the old interface.
  • The old interface remains fully operational.
  • You can delete an Arcana created in the old interface using the new interface.

What Does Not Work

  • Modifying an Arcana across interfaces is not fully supported:
    • You can only delete an Arcana or files created in the old interface using the new interface.
    • You cannot see or modify an Arcana created in the new interface using the old interface.
  • Re-indexing a legacy Arcana using the new interface does not work, because the file storage architecture differs.

Migration Steps

To migrate a legacy Arcana to the new RAG Manager:

1. Identify Legacy Arcanas

In the new interface, legacy Arcanas are marked with a “Legacy Arcana” tag in the Arcana details view.

Picture 1: Arcana Details View showing “Legacy Arcana” tag

Clicking the tag will show you more information and provide a link to this migration guide (see Picture 2).

Picture 2: Pop-up after clicking the tag, with migration guidance

2. Choose Your Migration Approach

You have two options for migrating a legacy Arcana:

Option A: Re-create the Arcana

  • Delete the Arcana entirely.
  • Create a new Arcana in the new interface with the same name.
  • Upload your files again.
  • Click “Generate Index” and follow the usual steps.

Option B: Replace Files in Existing Arcana

  • Delete all files in the existing legacy Arcana.
  • Upload the files to the same Arcana using the new interface.
  • Click “Index Generation”, then click “Delete Index” and finally click “Generate Index”.
Details

This option preserves the original Arcana name and ID but removes all prior file content.

Setting up an Arcana

Warning

This service is currently in beta phase and is updated regularly. The same applies to the documentation.

The process screenshots contain blank blocks; these cover the positions where the username would otherwise be shown.

First Login

Go to the main Arcana page and click on register. You will need to log in with your Academiccloud account first.

Welcome to RAG-Manager! Form to acknowledge terms and access service after profile verification.
The register page of the RAG Manager welcomes the user, shows the Academiccloud profile and shows a button for registering to the service by acknowledging the details.

Once this is done, you are now able to go to the dashboard which looks like this:

RAG-Manager new version announcement with steps to create, upload, generate index, and start chatting.
The Arcana manager dashboard has a main part and a menu bar. The menu bar contains the Home tab, Arcanas tab, a Profile tab and a Documentation tab. The main part contains a 'Quick Access' section and the 'Quick Start' guide. The bottom of the page has links to the Data privacy and terms of use.

You can find an overview of your profile under the profile tab. This is also the place, where you can delete your profile data.

Account settings page with username, join date, arcanas/files count, and a red 'Delete My Account' button with a warning.
The user profile tab shows the account information and a red box that reads danger zone and contains a button to delete the account.

Creating an Arcana

Once you have the dashboard open, you can navigate to the Arcanas section by either clicking on it in the navigation bar at the top or clicking on the “My Arcanas” box. In the Arcanas section, clicking on “+ Add New Arcana” will open the create popup. First, you need to specify a name for the Arcana.

Dialog box to create a new Arcana with fields for name, security toggle, and privacy settings, with 'Cancel' and 'Save' buttons.
The main Arcana manager, once pressed on the new Arcana button, displays a window in which a name can be set and a new Arcana created.

Enabling the “Secure Arcana Name” slider will append a random string to the Arcana’s name. This ensures that only users who know the exact name of the Arcana can chat with it.

Note: All of your Arcanas are public by default! This means that anyone who knows the exact Arcana name can chat with it. However, this does not mean that they can directly access your files.

Once created, it should appear in your Arcana list.

Webpage displaying a list of Arcanas with options to create new ones, search, and view details like number of files, size, and index status.
The Arcana manager now contains a new Arcana which is not indexed and has no files. The actions section contains a button for open folder, delete, and access link.

By clicking on an Arcana, you can open the view of that Arcana content.

Webpage for RAG-Manager with 'Arcanas/Test 1' displayed, showing options to upload files, access links, index generation, and delete. No files are currently found.
The menu for this new Arcana has four buttons up top: upload a new file, access link, index generation, and delete Arcana. The table below does currently not contain any files.

Uploading Files

On the Arcana page, click on “Upload Files” to start adding your files to the Arcana.

Upload Files dialog for 'Test 1' with a drag-and-drop area and a list of selected files, including 'ImageNet_CVPR2009.pdf', with 'Close' and 'All Done' buttons.
The upload a new file window has a button for choosing a file, as well as a cancel and done button.

Select one or multiple files to upload. You can upload files of various types. The following file formats are supported:

  • Text (.txt)
  • Markdown (.md)
  • Word (.docx, .dotx, .docm, .dotm)
  • PowerPoint (.pptx, .potx, .ppsx, .pptm, .potm, .ppsm)
  • PDF (.pdf)
  • HTML (.html, .htm, .xhtml)

Once uploaded, your files should appear.

Screenshot of a file management interface. It shows a list of files with columns for name, size, conversion status, and index status. A single PDF file is listed as 'ImageNet_CVPR2009.pdf' and is marked as 'Not Converted' and 'Not Indexed'.
Once uploaded, the menu for this Arcana now shows the file, which is not indexed and not converted.

You can check the file details by clicking on the “File Info” icon next to each file.

Screenshot of file details: “ImageNet_CVPR2009.pdf” is not converted or indexed, 3.35 MB, created 06/26/2025.
The File Details menu shows the file information as well as the status. Additionally, it gives the option to download the processed Markdown Plus file as well as a JSON file. Also, a new Markdown Plus file can be uploaded.

This is also the page where, after generating the index, you can download the Docling JSON file or the annotated Markdown file. You also have the option to upload a new version of the file. More importantly, you can download the Markdown file and update the markers mentioned in the Docling process. This is very useful to fine-tune the splitting and marking of the file for better indexing and retrieval. Once these changes are made, uploading the file will start a check process to make sure the content still matches the uploaded material.

Now that the file or files are uploaded, you can start the index generation and file conversion by clicking on “Index Generation”.

Screenshot of a vector database interface showing the index status as 'Not Indexed' with options to 'Generate Index' or 'Delete Index'.
Clicking on "Generate Index" opens a new small window prompting the user to confirm that the index should be generated.

This will change the status to a blue “pending” and finally a green “indexed”.

Screenshot of a file list within 'Arcanas/Test 1', showing 1 file ('ImageNet_CVPR2009.pdf') with both conversion and index status as 'Pending'. Options to upload, access, generate index, and delete are visible.
The index status has changed from not indexed to pending.

Screenshot of a file list in 'Arcanas/Test 1' showing 'ImageNet_CVPR2009.pdf' is completed and indexed, with options to upload, access, generate index, and delete.
The index status has changed to Indexed.

Once your Arcana is indexed, you will be able to generate the Access Link and get the ID for this Arcana. Click on “Access Link” in the Actions tab of the respective Arcanas to generate it.

Screenshot of an 'Arcana Access Link' window displaying a masked Arcana ID and a long URL starting with 'https://chat-ai.academiccloud.de/chat?arcana='. A 'copy link' icon is present next to each.
Once the access link is created, it shows the link, which can be copied, as well as the Arcana ID.

You can click on the link or copy it into a new tab, which will open Chat AI with the Arcana ID preset into the advanced options.

Screenshot of a chat interface (ChatAI). The top panel shows model settings (Meta Llama 3.1 8B RAG) and on the right panel options for temperature and top_p. A text input field with 'Ask me' prompt is visible on the bottom.
The Chat-AI window has Arcana ID filled in.

Now you are able to share the link as well as the IDs for this Arcana.

Warning

You need to use RAG compatible models for Arcana to work. All RAG compatible models are marked with a little book icon, for example Meta Llama 3.1 8B RAG.

Updating files in an Arcana

There is an option to download and customize annotations for uploaded files, or modify previous manual or generated annotations. After the file has been uploaded and processed, you can update its annotation if necessary.

The details dialog for a PDF looks like this:

Shown are the file details in the RAG manager. Docling file information: JSON version with automatic annotations, download options available.
The File Details dialog for a PDF file lists information about the file as well as the index and conversion status. Below is the option to download the JSON output of the Docling process. At the bottom is the option to download the annotated Markdown file, upload an updated file, and an option to reset the annotated Markdown file.

For all file types, there is the option to download the annotated Text/Markdown file. In case of a PDF, this annotated Markdown file – also called Markdown Plus – is automatically generated by the Docling process. There is also the option to upload a manually annotated file (text or markdown). This is very useful to manually set the annotations or to update them. Additionally, many files have a JSON file from the Docling process that can be downloaded and viewed.

The available annotations are explained in the Docling process under the heading Markdown Plus Annotations.

Info

Partially annotated files do not work. They either have to be fully annotated or not annotated at all.

How to use Arcana

Warning

This service is currently in beta phase and is updated regularly. The same applies to the documentation.

In order to use an Arcana, you either need to set one up or use a publicly available Arcana. Once you have entered the Arcana ID in the respective field under the advanced options, you will be able to use it. The screenshot below shows the Chat AI interface; for the currently selected model, Meta Llama 3.1 8B RAG, there is the option to set the Arcana ID.

Screenshot of a chat interface (ChatAI). The top panel shows model settings (Meta Llama 3.1 8B RAG) and on the right panel options for temperature and top_p. A text input field with 'Ask me' prompt is visible on the bottom.
The Chat AI window shows the open advanced options for a model that supports Arcanas, and the field Arcana ID is currently empty.

This field will be set automatically if you use a link to access an Arcana. Similarly, the system prompt can also be adjusted using a link. This is most likely the case for an important Arcana such as study information or manuals, since setting the system prompt will enhance the phrasing and clarity of the response.

In order to interact with an Arcana, you need to use a model that supports the feature and put the ID into the field in the “Settings” tab. Once this is done, you can interact with the material using normal prompting. There are a few important points to keep in mind. The material is indexed using the Docling process, and therefore it is very important for the prompt to contain keywords and phrases that are or might be present in the document. This is crucial for retrieving the correct information. Otherwise, the model has a higher likelihood of hallucinating.

Understanding the output of a query

Once the model has responded to a prompt, you can immediately see that information retrieved from the indexed material is present in the response. Additionally, the actual reference is attached below the response. The reference is directly generated from the material that is indexed with this Arcana. Originally, the information might have been in a PDF, which got converted into an annotated Markdown file as described in the Docling process. Therefore, the response will contain the reference in Markdown format, and it will also be rendered as Markdown by the Chat AI interface.

Integration with chat AI

Currently, only a few models can interact with the Arcana middleware process. These are marked with a book icon.

Example queries and responses

We will now ask the chatbot a question:

  • Prompt: What is ImageNet?
  • Example Response: ImageNet is a large-scale ontology of images built upon the backbone of the WordNet structure. It aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full-resolution images, resulting in tens of millions of annotated images organized by the semantic hierarchy of WordNet.

The screenshot shows the ImageNet dataset interface. It features a chatbot window powered by Meta Llama 3.1 8B RAG, with options for temperature and top_p settings. Below, there are listed references to ImageNet PDFs with their file names and percentage scores. At the bottom is a text input field labeled 'Ask me' and icons for importing personas, uploading images, and other functions. The overall layout suggests a tool for exploring and interacting with the ImageNet dataset using a large language model
The Chat AI window shows a selected Arcana for the ImageNet paper with an example prompt. In the screenshot you can also see the response contains references from the original document.

Docling process

Warning

This service is currently in beta phase and is updated regularly. The same applies to the documentation.

The RAG Service provides users with an efficient way to upload and process PDF documents using Docling. The system converts uploaded PDFs into Markdown format while also automatically annotating them. This enhanced Markdown output, referred to as Markdown Plus, includes metadata and structural annotations for improved document parsing and customization.

This service is using a fork of the Docling API with many modifications, which will be published in the future.

Process Flow

  1. File Upload: Users upload their PDF documents through the RAG interface.
  2. Conversion Process: Once the PDF file is uploaded, the user can convert it into the Markdown Plus format by clicking on Index Generation -> Generate Index. This will also convert the files.
  3. Annotation & Metadata: The Markdown Plus output is automatically annotated with structural markers.
  4. By clicking the “File Info” icon on the right, users can download the annotated Markdown file, adjust split markers, and re-upload it for further processing.

The File Details window in the RAG Manager shows the Docling file information and automatic annotations, along with options to download either the JSON version or the Markdown Plus file and to upload an updated Markdown Plus file.

5. Validation: Upon re-upload, a validation process ensures that the paging structure remains intact.

Markdown Plus Annotations

Docling generates a structured and annotated Markdown file using the following markers:

1. Page Marker

  • Format: [Page (number)]: #
  • Purpose: Indicates the beginning of a new page in the original PDF document.
  • Example:
    [Page 1]: #

2. Vertical Position Marker

  • Format: [Y: (number)]: #
  • Purpose: Represents the approximate vertical position of an item on the page (The height of the pages is scaled to 1000 lines).
  • Details:
    • Each page is divided into five sections.
    • If no continuous item (such as a table or image) exists in a given section, a Y marker is assigned.
  • Example:
    [Y: 300]: #

3. Split Marker

  • Format: [SPLIT]: #
  • Purpose: Defines segmentation points in the document for later processing.
  • Usage:
    • The split markers guide document chunking for downstream applications.
    • Users can manually adjust split markers in the Markdown Plus file before re-uploading.
  • Example:
    [SPLIT]: #

Metadata Header

Each annotated Markdown file contains a header section with metadata about the document, including:

  1. Author
  2. Title
  3. Description
  4. Filename
  5. Extension
  6. Number of Pages
  7. Version

For example:

---
Author: jkunkel1
Title: Title of the file
Description: ''
Filename: file name
Extension: pdf
Number of Pages: 20
Version: 1.0
---

User Interaction with Annotated Markdown

  1. Download Markdown Plus: Users can export the annotated Markdown file for review.
  2. Modify Split Markers: If desired, users can manually edit [SPLIT]: # markers to customize segmentation (a minimal local example is sketched after this list).
  3. Re-Upload Modified File: The system verifies that the paging structure remains undisturbed before processing the document further.
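For illustration, here is a minimal Python sketch of how a downloaded Markdown Plus file could be handled locally: it reads the metadata header and splits the body at the [SPLIT]: # markers described above. The file name and the parsing logic are illustrative assumptions and are not part of the RAG service itself.

```python
# Illustrative only: inspect a downloaded Markdown Plus file locally.
# "paper.md" is a placeholder file name, not something the service provides.

def parse_markdown_plus(path):
    with open(path, encoding="utf-8") as f:
        text = f.read()

    # The metadata header sits between a pair of '---' lines at the top.
    metadata, body = {}, text
    if text.startswith("---"):
        header, _, body = text[3:].partition("---")
        for line in header.strip().splitlines():
            key, _, value = line.partition(":")
            metadata[key.strip()] = value.strip()

    # Chunk the body at the split markers; page and Y markers stay inside the chunks.
    chunks = [chunk.strip() for chunk in body.split("[SPLIT]: #") if chunk.strip()]
    return metadata, chunks

metadata, chunks = parse_markdown_plus("paper.md")
print(metadata.get("Title"), metadata.get("Number of Pages"))
print(f"{len(chunks)} chunks found")
```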

Public Arcana links

Warning

This service is currently in beta phase and is updated regularly. The same applies to the documentation.

Here is a list of public links for Arcanas. These are sorted into categories and contain public material. Please refer to the How to use section if you need help working with these.

  • GWDG Services
    • Contains knowledge about the GWDG Services and Documentation.
    • Example question: Welche Dienste könnte ich als Forscher bei der GWDG nutzen?
  • Institute for Computer Science
    • Contains knowledge about the computer science study track.
    • Example question: Wie funktioniert der Studiengang angewandte Informatik?

In case you find a broken link, please let us know via our support contacts.

RAG Service

Warning

This service is currently in beta phase and is updated regularly. The same applies to the documentation.

RAG (Retrieval-Augmented Generation) is an advanced AI technique designed to improve the accuracy, reliability, and contextual relevance of AI-generated responses. Traditional AI models, such as large language models (LLMs), rely solely on pre-trained data to generate answers. While these models can provide insightful responses, they are limited by the information they were trained on, which may become outdated or may not cover specific topics in detail.

RAG overcomes these limitations by integrating an external retrieval process before generating a response. Instead of relying purely on static knowledge, RAG actively searches for relevant data from external sources, such as document databases, APIs, or knowledge repositories. The retrieved data is then fed into the AI model along with the original user query, ensuring that the response is fact-based, up to date, and contextually relevant.

This approach is particularly valuable for applications requiring real-time information access, such as customer support, research, healthcare, legal advice, and financial services. By leveraging external data, RAG enhances AI’s ability to provide more precise answers, reduces misinformation, and improves trust in AI-driven decision-making.

How the RAG Service Works

The RAG service follows a three-stage process: Retrieval, Augmentation, and Generation.

1. Retrieval Phase

The system maintains a structured external knowledge base, stored in Arcana-based ChromaDB. This database contains documents, articles, technical manuals, and other relevant data sources. When a user submits a query, the ChromaDB engine performs a search to find the most relevant documents. The retrieval process is powered by vector-based similarity matching, which identifies information that closely matches the meaning and context of the user’s query. This approach ensures that even if the exact wording of the query differs from stored data, the system can still locate relevant information.

2. Augmentation Phase

Once the system retrieves relevant documents, they are combined with the original user query to form an enriched input. This augmented input helps the AI model understand specialized or proprietary information that may not have been part of its original training data. The retrieved documents serve as a knowledge injection, ensuring that responses are grounded in verified, real-world data rather than relying on the model’s internal assumptions.

3. Generation Phase

The AI model processes the combined input (original query + retrieved documents). It then generates a response that integrates both its pre-trained knowledge and the newly retrieved information. This response is more accurate, relevant, and fact-based compared to responses generated by traditional AI models that lack external retrieval. The system can also provide source references, increasing transparency and allowing users to verify the information provided.
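To make the three stages concrete, here is a minimal, self-contained sketch. It is not the implementation behind the Arcana service; it only uses the open-source chromadb library for the vector search and a placeholder generate() function in place of the actual LLM call. The example documents and names are invented for illustration.

```python
# Conceptual sketch of Retrieval -> Augmentation -> Generation (not the GWDG implementation).
import chromadb

client = chromadb.Client()                          # in-memory vector store
docs = client.create_collection(name="knowledge_base")
docs.add(                                           # toy knowledge base
    ids=["doc1", "doc2"],
    documents=[
        "ImageNet is a large-scale image ontology built on the WordNet hierarchy.",
        "The Grete system provides GPU nodes for AI workloads.",
    ],
)

def generate(prompt: str) -> str:
    # Placeholder for the actual LLM call (e.g. an OpenAI-compatible endpoint).
    return f"[the model would answer based on]\n{prompt}"

def answer(query: str) -> str:
    # 1. Retrieval: vector-based similarity search for the most relevant passages.
    hits = docs.query(query_texts=[query], n_results=2)
    context = "\n".join(hits["documents"][0])
    # 2. Augmentation: combine the retrieved passages with the user query.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    # 3. Generation: the enriched prompt is sent to the model.
    return generate(prompt)

print(answer("What is ImageNet?"))
```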

Key Benefits of the RAG Service

  1. Improved Accuracy

    • Since the AI retrieves real-world data before generating responses, it significantly reduces errors and outdated information.
    • This ensures that responses are more reliable, precise, and factually correct.
  2. Reduction of AI Hallucinations

    • Traditional AI models sometimes generate responses that sound plausible but are actually incorrect or misleading.
    • RAG minimizes this risk by ensuring that responses are anchored in retrieved, verifiable data, rather than being purely speculative.
  3. Domain-Specific Customization

    • Organizations can integrate proprietary databases, making the AI highly specialized for their industry or use case.
    • Whether for healthcare, legal, finance, engineering, research, or customer support, RAG can be tailored to provide highly relevant responses.
  4. Enhanced Explainability and Transparency

    • Unlike traditional AI models, which provide answers without explaining their reasoning, RAG can cite sources for its responses.
    • Users can trace back the information to the retrieved documents, improving trust and accountability in AI-generated content.
  5. Access to Real-Time and Dynamic Knowledge

    • Unlike static AI models that rely only on pre-trained knowledge, RAG can fetch and integrate the latest available information.
    • This is especially useful for industries where information changes frequently, such as market trends, regulatory compliance, technical troubleshooting, and scientific research.
  6. Better User Experience

    • By retrieving and integrating the most relevant information, RAG allows AI to provide more complete and meaningful answers to user queries.
    • This leads to better decision-making, improved efficiency, and a more user-friendly AI interaction.

Support and FAQ

Warning

This service is currently in beta phase and is updated regularly. The same applies to the documentation.

If you run into problems using this service, please contact our KISSKI support.

We also need your help providing publicly available Arcana links. If you have one that you think is relevant to a larger group of users, please also reach out to us so we can publish it.

FAQ

Which model is currently supported?

Only the Meta Llama 3.1 8B RAG model is able to use the Arcana feature. More will be made available soon. If you cannot access it, please contact us.

Why does my model output a very long and unreadable reference?

This is currently the case due to the output of the Docling conversion. We are working on parsing this output and presenting it in a more readable way.

Chat AI

Chat AI Logo

Chat AI is a stand-alone LLM (large language model) web service that we provide, hosting multiple LLMs on a scalable backend. It runs on our cloud virtual machine with secure access to the LLMs running on our HPC systems. It is our secure alternative to commercial LLM services: none of your data gets used by us or stored on our systems.

The service can be reached via its web interface. To use the models via the API, refer to API Request, and to use the models via Visual Studio Code, refer to CoCo AI.

If all you need is a quick change of your persona, this is the page you are looking for.

Tip

You need an Academic Cloud account to access the AI Services. Use the federated login or create a new account. Details are on this page.

Current Models

Chat AI currently hosts a large assortment of high-quality, open-source models. All models except ChatGPT are self-hosted with the guarantee of the highest standards of data protection. These models run completely on our hardware and don’t store any user data.

For more detailed information about all our models, please refer to available models.

Web interface and usage

If you have an AcademicCloud account, the web interface can also easily be reached here. All Chat AI models are free to use for all users, with the exception of the ChatGPT models, which are only freely available to public universities and research institutes in Lower Saxony and the Max Planck Society.

Web Interface Example

Choose a model suitable to your needs from the available models. After learning the basic usage, learn about Chat AI’s advanced features here.

From the web interface, there are built-in actions that can make your prompts easier or better. These include:

  • Attach (+ button): Add files that the model will use as context for your prompts.
  • Listen (microphone button): Speak to the model instead of typing.
  • Import/Export (upload/download button): If you have downloaded conversations from a previous ChatAI session or another service, you can import that session and continue it.
  • Footer (bottom arrow): Change the view to include the footer, which includes “Terms of use”, “FAQ”, etc. and the option to switch between English and German.
  • Light/Dark mode (sun/moon button): Toggle between light and dark mode.
  • Options: Further configuration options for tailoring your prompts and model more closely. These include:
    • System prompt, which can be considered the role that the model should assume for your prompts. The job interview prompt below is an example.
    • Completion options such as temperature and top_p sampling.
    • Share button, which generates a shareable URL for Chat AI that loads your current model, system prompt, and other settings. Note that this does not include your conversation.
    • Clear button, which deletes the entire data stored in your browser, removing your conversations and settings.
    • Memory Settings, the system supports three memory modes to control conversational context:
      • None: Disables memory—each conversation is treated independently.
      • Recall: Adds previous messages to the system prompt for contextual continuity.
      • Learn: Extends Recall by updating memory with relevant parts of the current conversation, enabling a more natural dialogue flow.

Note: Memory is stored locally in the browser and does not affect external (OpenAI) models.

We suggest setting a system prompt before starting your session in order to define the role the model should play. A more detailed system prompt is usually better. Examples include:

  • “I will ask questions about data science, to which I want detailed answers with example code if applicable and citations to at least 3 research papers discussing the main subject in each question.”
  • “I want the following text to be summarized with 40% compression. Provide an English and a German translation.”
  • “You are a difficult job interviewer at the Deutsche Bahn company and I am applying for a job as a conductor.”

Completion options

Two important concepts to understand among completion options are temperature and top_p sampling.

  • temperature is a slider from 0 to 2 that adjusts creativity: values closer to 0 make the output more predictable, values closer to 2 more creative. It does this by sharpening or flattening the probability distribution over the next token (the building block of the response).
  • top_p is a slider from 0 to 1 that adjusts the population of candidates considered for the next token. A top_p of 0.1 means that only the tokens making up the top 10% of cumulative probability are considered. Varying top_p has a similar effect on predictability and creativity as temperature, with larger values generally increasing creativity.

Predictable results, for tasks such as coding, require low values for both parameters, and creative results, for tasks such as brainstorming, require high values. See the table in the available models section for value suggestions.
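The same two parameters can also be set when calling the models through the API instead of the web interface. The sketch below uses the OpenAI-compatible Python client; the base URL, the environment variable for the API key, and the exact model identifier are assumptions for illustration, so please check the API Request page for the authoritative values.

```python
# Hedged sketch: setting temperature and top_p via an OpenAI-compatible client.
# Base URL, key handling, and model name are assumptions; see the API Request docs.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://chat-ai.academiccloud.de/v1",   # assumed Chat AI endpoint
    api_key=os.environ["CHAT_AI_API_KEY"],            # your personal API key
)

# Low values favour predictable output (e.g. coding); raise them for brainstorming.
response = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",               # assumed model identifier
    messages=[{"role": "user", "content": "Write a haiku about HPC."}],
    temperature=0.2,
    top_p=0.1,
)
print(response.choices[0].message.content)
```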

Features

More comprehensive documentation for all features is found here.

Chat AI Tools

The settings window shows you an option to activate tools. Once activated, these tools are available:

  • Web Search
  • Image generation
  • Image editing
  • Text to speech (TTS)

Also, if you want to use the Toolbox, i.e. the image, video, and audio features of the models, you need to activate the tools in the settings.

Info

These tools only work using the models hosted by GWDG and KISSKI. The external models from OpenAI do not work with the tools.

The web search tool works by creating a search query that can be used with search engines. The entire chat history is processed to create a short search query that yields a processable response. Once the response is retrieved from the search engine, it is used together with the model to write a reply to the prompt.

Most importantly, this allows the response to contain recent information instead of the outdated information present in the model that was selected.
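Conceptually, the web search tool follows the flow sketched below. This is not the service’s actual code: condense_to_query, search_web, and generate are placeholders standing in for the internal steps (an LLM call that shortens the chat history into a query, the search-engine request, and the final model call).

```python
# Conceptual sketch of the web search tool flow; all helpers are placeholders.

def condense_to_query(history: list[dict]) -> str:
    # In the service, an LLM condenses the whole chat history into a short query;
    # here we simply reuse the last user message.
    return history[-1]["content"]

def search_web(query: str) -> str:
    # A real implementation would call a search engine and return result snippets.
    return f"(search results for: {query})"

def generate(prompt: str) -> str:
    # Placeholder for the selected Chat AI model.
    return f"[reply grounded in]\n{prompt}"

def answer_with_web_search(history: list[dict]) -> str:
    query = condense_to_query(history)    # 1. build a short search query
    results = search_web(query)           # 2. retrieve recent information
    prompt = f"{results}\n\nUser question: {history[-1]['content']}"
    return generate(prompt)               # 3. write the grounded reply

print(answer_with_web_search([{"role": "user", "content": "Latest HPC news?"}]))
```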

For more information, check out the Tools Documentation.

Acknowledgements

We thank Priyeshkumar Chikhaliya for the design and implementation of the web interface.

We thank all colleagues and partners involved in this project.

Citation

If you use Chat AI in your research, services or publications, please cite us as follows:

@misc{doosthosseiniSAIASeamlessSlurmNative2025,
  title = {{{SAIA}}: {{A Seamless Slurm-Native Solution}} for {{HPC-Based Services}}},
  shorttitle = {{{SAIA}}},
  author = {Doosthosseini, Ali and Decker, Jonathan and Nolte, Hendrik and Kunkel, Julian},
  year = {2025},
  month = jul,
  publisher = {Research Square},
  issn = {2693-5015},
  doi = {10.21203/rs.3.rs-6648693/v1},
  url = {https://www.researchsquare.com/article/rs-6648693/v1},
  urldate = {2025-07-29},
  archiveprefix = {Research Square}
}

Further services

If you have questions, please browse the FAQ first. For more detail on how the service works, you can read our research paper here. If you have more specific questions, feel free to contact us at kisski-support@gwdg.de.

Subsections of Chat AI

Available Models

Chat AI provides a large assortment of state-of-the-art open-weight Large Language Models (LLMs) which are hosted on our platform with the highest standards of data protection. The data sent to these models, including the prompts and message contents, are never stored at any location on our systems. Additionally, Chat AI offers models hosted externally such as OpenAI’s GPT-5, GPT-4o, and o3.

Available models are regularly upgraded as newer, more capable ones are released. We select models to include in our services based on user demand, cost, and performance across various benchmarks, such as HumanEval, MATH, HellaSwag, MMLU, etc. Certain models are more capable at specific tasks and with specific settings, which are described below to the best of our knowledge.


List of open-weight models, hosted by GWDG

| Organization | Model | Open | Knowledge cutoff | Context window in tokens | Advantages | Limitations | Recommended settings |
|---|---|---|---|---|---|---|---|
| 🇺🇸 Meta | Llama 3.1 8B Instruct | yes | Dec 2023 | 128k | Fast overall performance | - | default |
| 🇺🇸 OpenAI | GPT OSS 120B | yes | Jun 2024 | 128k | Great overall performance, fast | - | default |
| 🇺🇸 Google | Gemma 3 27B Instruct | yes | Mar 2024 | 128k | Vision, great overall performance | - | default |
| 🇨🇳 OpenGVLab | InternVL2.5 8B MPO | yes | Sep 2021 | 32k | Vision, lightweight and fast | - | default |
| 🇨🇳 Alibaba Cloud | Qwen 3 235B A22B Thinking 2507 | yes | Apr 2025 | 222k | Great overall performance, reasoning | - | temp=0.6, top_p=0.95 |
| 🇨🇳 Alibaba Cloud | Qwen 3 32B | yes | Sep 2024 | 32k | Good overall performance, multilingual, global affairs, logic | - | default |
| 🇨🇳 Alibaba Cloud | Qwen QwQ 32B | yes | Sep 2024 | 131k | Good overall performance, reasoning and problem-solving | Political bias | default; temp=0.6, top_p=0.95 |
| 🇨🇳 DeepSeek | DeepSeek R1 0528 | yes | Dec 2023 | 32k | Great overall performance, reasoning and problem-solving | Censorship, political bias | default |
| 🇨🇳 DeepSeek | DeepSeek R1 Distill Llama 70B | yes | Dec 2023 | 32k | Good overall performance, faster than R1 | Censorship, political bias | default; temp=0.7, top_p=0.8 |
| 🇺🇸 Meta | Llama 3.3 70B Instruct | yes | Dec 2023 | 128k | Good overall performance, reasoning and creative writing | - | default; temp=0.7, top_p=0.8 |
| 🇺🇸 Google | MedGemma 27B Instruct | yes | Mar 2024 | 128k | Vision, medical knowledge | - | default |
| 🇩🇪 VAGOsolutions x Meta | Llama 3.1 SauerkrautLM 70B Instruct | yes | Dec 2023 | 128k | German language skills | - | default |
| 🇫🇷 Mistral | Mistral Large Instruct | yes | Jul 2024 | 128k | Good overall performance, coding and multilingual reasoning | - | default |
| 🇫🇷 Mistral | Codestral 22B | yes | Late 2021 | 32k | Coding tasks | - | temp=0.2, top_p=0.1; temp=0.6, top_p=0.7 |
| 🇺🇸 intfloat x Mistral | E5 Mistral 7B Instruct | yes | - | 4096 | Embeddings | API Only | - |
| 🇨🇳 Alibaba Cloud | Qwen 2.5 VL 72B Instruct | yes | Sep 2024 | 90k | Vision, multilingual | - | default |
| 🇨🇳 Alibaba Cloud | Qwen 2.5 Coder 32B Instruct | yes | Sep 2024 | 128k | Coding tasks | - | default; temp=0.2, top_p=0.1 |
| 🇩🇪 OpenGPT-X | Teuken 7B Instruct Research | yes | Sep 2024 | 128k | European languages | - | default |

List of external models, hosted by external providers

| Organization | Model | Open | Knowledge cutoff | Context window in tokens | Advantages | Limitations | Recommended settings |
|---|---|---|---|---|---|---|---|
| 🇺🇸 OpenAI | GPT-5 Chat | no | Jun 2024 | 400k | Great overall performance, vision | - | default |
| 🇺🇸 OpenAI | GPT-5 | no | Jun 2024 | 400k | Best overall performance, reasoning, vision | - | default |
| 🇺🇸 OpenAI | GPT-5 Mini | no | Jun 2024 | 400k | Fast overall performance, vision | - | default |
| 🇺🇸 OpenAI | GPT-5 Nano | no | Jun 2024 | 400k | Fastest overall performance, vision | - | default |
| 🇺🇸 OpenAI | o3 | no | Oct 2023 | 200k | Good overall performance, reasoning, vision | outdated | default |
| 🇺🇸 OpenAI | o3-mini | no | Oct 2023 | 200k | Fast overall performance, reasoning | outdated | default |
| 🇺🇸 OpenAI | GPT-4o | no | Oct 2023 | 128k | Good overall performance, vision | outdated | default |
| 🇺🇸 OpenAI | GPT-4o Mini | no | Oct 2023 | 128k | Fast overall performance, vision | outdated | default |
| 🇺🇸 OpenAI | GPT-4.1 | no | June 2024 | 1M | Good overall performance | outdated | default |
| 🇺🇸 OpenAI | GPT-4.1 Mini | no | June 2024 | 1M | Fast overall performance | outdated | default |

Open-weight models, hosted by GWDG

The models listed in this section are hosted on our platform with the highest standards of data protection. The data sent to these models, including the prompts and message contents, are never stored at any location on our systems.

Meta Llama 3.1 8B Instruct

The standard model we recommend. It is the most lightweight with the fastest performance and good results across all benchmarks. It is sufficient for general conversations and assistance.

OpenAI GPT OSS 120B

In August 2025 OpenAI released the gpt-oss model series, consisting of two open-weight LLMs that are optimized for faster inference with state-of-the-art performance across many domains, including reasoning and tool use. According to OpenAI, the gpt-oss-120b model achieves near-parity with OpenAI o4-mini on core reasoning benchmarks.

Meta Llama 3.3 70B Instruct

Achieves good overall performance, on par with GPT-4, but with a much larger context window and a more recent knowledge cutoff. Best at English comprehension and broader linguistic tasks such as translation, understanding dialects, slang and colloquialisms, and creative writing.

Google Gemma 3 27B Instruct

Gemma is Google’s family of light, open-weights models developed with the same research used in the development of its commercial Gemini model series. Gemma 3 27B Instruct is quite fast and thanks to its support for vision (image input), it is a great choice for all sorts of conversations.

Google MedGemma 27B Instruct

MedGemma 27B Instruct is a variant of Gemma 3 suitable for medical text and image comprehension. It has been trained on a variety of medical image data, including chest X-rays, dermatology images, ophthalmology images, and histopathology slides, as well as medical text, such as medical question-answer pairs, and FHIR-based electronic health record data. MedGemma variants have been evaluated on a range of clinically relevant benchmarks to illustrate their baseline performance.

Qwen 3 235B A22B Thinking 2507

Expanding on Qwen 3 235B A22B, one of the best-performing models of the Qwen 3 series, Qwen 3 235B A22B Thinking 2507 has a significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks. It is an MoE model with 235B total parameters and 22B activated parameters, and achieves state-of-the-art results among open-weights thinking models.

Qwen 3 32B

Qwen 3 32B is a large dense model developed by Alibaba Cloud released in April 2025. It supports reasoning and outperforms or is at least on par with other state-of-the-art reasoning models such as OpenAI o1 and DeepSeek R1.

Qwen QwQ 32B

Developed by Alibaba Cloud, QwQ is the reasoning model of the Qwen series of LLMs. Compared to non-reasoning Qwen models, it achieves significantly higher performance in tasks that require problem-solving. QwQ 32B is lighter and faster than DeepSeek R1 and OpenAI’s o1, but achieves comparable performance.

Qwen 2.5 VL 72B Instruct

A powerful Vision Language Model (VLM) with competitive performance in both language and image comprehension tasks.

Qwen 2.5 Coder 32B Instruct

Qwen 2.5 Coder 32B Instruct is a code-specific LLM based on Qwen 2.5. It has one of the highest scores on code-related tasks, on par with OpenAI’s GPT-4o, and is recommended for code generation, code reasoning and code fixing.

DeepSeek R1 0528

Developed by the Chinese company DeepSeek (深度求索), DeepSeek R1 was the first highly-capable open-weights reasoning model to be released. In the latest update, DeepSeek R1 0528, its depth of reasoning and inference capabilities have increased. Although very large and quite slow, it achieves one of the best overall performances among open models.

Warning

DeepSeek models, including R1, have been reported to produce politically biased responses, and censor certain topics that are sensitive for the Chinese government.

DeepSeek R1 Distill Llama 70B

Developed by the Chinese company DeepSeek (深度求索), DeepSeek R1 Distill Llama 70B is a dense model distilled from DeepSeek R1 but based on Llama 3.3 70B, in order to fit the capabilities and performance of R1 into a 70B-parameter model.

Llama 3.1 SauerkrautLM 70B Instruct

SauerkrautLM is trained by VAGOsolutions on Llama 3.1 70B specifically for prompts in German.

Mistral Large Instruct

Developed by Mistral AI, Mistral Large Instruct 2407 is a dense language model with 123B parameters. It achieves great benchmarking scores in general performance, code and reasoning, and instruction following. It is also multi-lingual and supports many European and Asian languages.

Codestral 22B

Codestral 22B was developed by Mistral AI specifically for the goal of code completion. It was trained on more than 80 different programming languages, including Python, SQL, bash, C++, Java, and PHP. It uses a context window of 32k, which makes it suitable for evaluating and generating larger pieces of code, and it fits on a single GPU of our cluster.

InternVL2.5 8B MPO

A lightweight, fast and powerful Vision Language Model (VLM), developed by OpenGVLab. It builds upon InternVL2.5 8B and Mixed Preference Optimization (MPO).

OpenGPT-X Teuken 7B Instruct Research

OpenGPT-X is a research project funded by the German Federal Ministry of Economics and Climate Protection (BMWK) and led by Fraunhofer, Forschungszentrum Jülich, TU Dresden, and DFKI. Teuken 7B Instruct Research v0.4 is an instruction-tuned 7B parameter multilingual LLM pre-trained with 4T tokens, focusing on covering all 24 EU languages and reflecting European values.


External models, hosted by external providers

Warning

These OpenAI models are hosted on Microsoft Azure, and Chat AI only relays the contents of your messages to their servers. Microsoft adheres to GDPR and is contractually bound not to use this data for training or marketing purposes, but they may store messages for up to 30 days. We therefore recommend the open-weight models, hosted by us, to ensure the highest security and data privacy.

OpenAI GPT-5 Series

Released in August 2025, OpenAI’s GPT-5 series models achieve state-of-the-art performance across various benchmarks, with a focus on coding and agentic tasks. The series consists of the following four models along with their intended use cases:

  • OpenAI GPT-5 Chat: Non-reasoning model. Designed for advanced, natural, multimodal, and context-aware conversations.
  • OpenAI GPT-5: Reasoning model. Designed for logic-heavy and multi-step tasks.
  • OpenAI GPT-5 Mini: A lightweight variant of GPT-5 for cost-sensitive applications.
  • OpenAI GPT-5 Nano: A highly optimized variant of GPT-5. Ideal for applications requiring low latency.

OpenAI o3

Released in April 2025, OpenAI’s o3-class models were developed to perform complex reasoning tasks across the domains of coding, math, science, visual perception, and more. These models have an iterative thought process and therefore take their time to reason internally before responding to the user. The thought process of o3 models is not shown to the user.

OpenAI GPT-4o

GPT 4o (“o” for “omni”) is a general-purpose model developed by OpenAI. This model improves on the relatively older GPT 4, and supports vision (image input) too.

OpenAI GPT-4.1

OpenAI’s GPT-4.1-class models improve on the older GPT-4 series. These models also outperform GPT-4o and GPT-4o Mini, especially in coding and instruction following. They have a large context window size of 1M tokens, with improved long-context comprehension, and an updated knowledge cutoff of June 2024.

OpenAI o3 Mini

This was developed as a more cost-effective and faster alternative to o3.

OpenAI GPT-4o Mini

This was developed as a more cost-effective and faster alternative to GPT 4o.

OpenAI GPT-4.1 Mini

This was developed as a more cost-effective and faster alternative to GPT-4.1.

OpenAI o1 and o1 Mini

OpenAI’s o1-class models were developed to perform complex reasoning tasks. These models have now been superseded by the o3 series and are therefore no longer recommended.

OpenAI GPT-3.5 and GPT-4

These models are outdated and not available anymore.

Chat AI FAQ

Data Privacy

Are my conversations or usage data used for AI training or similar purposes?

No, whether you use internal or external models, your conversations and data are not used to train any AI models.

When using internal models, are my messages and conversations stored on your servers at any stage?

No, user messages and AI responses are not stored at any stage on our servers. Once your message is sent and you receive the response, the conversation is only available in your browser.

What data does Chat AI keep when I access the service?

We do not keep any conversations or messages on our servers. We only record some usage statistics in order to monitor the load on our service and improve the user experience. This includes usernames, timestamps, and the models/services that were requested. Everything else the chatbot remembers (like the history of your last conversation, etc.) is only stored locally in your browser.

When using external models, are my messages and conversations stored on Microsoft’s servers at any stage?

While we do not keep any conversations or messages on our servers, Microsoft retains the right to store messages/conversations for up to 30 days in order to prevent abuse. Since the request is sent directly from GWDG’s servers, no user information is included in the requests to Microsoft. For more information, see: https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy

Availability

My institution is interested in using Chat AI. Can we advertise it to our users? Would you be able to handle an additional load for XXX users?

For large institutions, please contact us directly at info@kisski.de.

Are all the models available with Chat-AI for free?

All models accessible to a user with an AcademicCloud account are free of charge, with the exception of the OpenAI GPT… (external) models. These models are only freely available to public universities and research institutes in Lower Saxony and the Max Planck Society.

Usage

Why is model xxx taking longer than usual to respond?

There can be multiple reasons for this.

  1. Most likely, your conversation history became larger over time and you didn’t clear it. Note that each time you send a message, the entire conversation history has to be processed by the model, which means a longer conversation history takes longer to process, and also uses more input tokens.
  2. If the model responds slowly even when the conversation is empty, it could be due to high load, esp. during peak hours, or an issue with the hardware running the model on our infrastructure. You can wait a little or switch to a different model and see if the response time improves. Feel free to reach out to support if the problem persists.
  3. Check your internet connection. It’s possible that this is caused by a slow or high-latency connection, esp. if you notice no difference when changing the model.

Can Chat AI process my images?

Yes, as long as the model supports it. Simply select a model that supports image input, as illustrated with the camera icon, then attach an image using the picture button in the prompt textbox. Note that some models may not support attaching more than one image at a time.

Can Chat AI process my PDF files?

Yes! Simply use the “attach text” button in the prompt textbox and select your PDF file. You will see the file in your attachments list as well as a “process” button. Note that PDF files must be processed before you can send a message to the model. Depending on the size and contents of your PDF file, this may take a while, or even fail if the file is too large. Once the file is processed, you can simply send a message to the model and its contents will be attached to your message.

OpenAI models

Can I use my own system prompts with the OpenAI (external) models?

No, sorry. The system prompt used by the OpenAI models can’t be changed by end users. Please use our internal models if you need to set custom system prompts.

Why are o1 and o1-mini slower / why can’t I get responses from o1 and o1-mini?

The o1 and o1-mini models perform internal reasoning, meaning they need much more time to process a request. Furthermore, Microsoft’s API does not yet support streaming for these models, so Chat AI has to wait until the model has generated the entire response before any data is received. In some cases, especially when there is a long conversation history, this can take so long that the connection times out and the request fails with a “Service Unavailable” error.

Are the OpenAI GPT… (external) models the real ChatGPT/GPT-4/… by OpenAI?

They are. We have signed a contract with Microsoft to be able to access the models running in their Azure cloud. Since the service costs money, it is only available to users in Lower Saxony or from the Max Planck Society. Thank you for your understanding.

Why does GPT-4 refer to itself as GPT-3 when I ask what model it is?

This is a known issue when using GPT-4 via its API; see https://community.openai.com/t/gpt-4-through-api-says-its-gpt-3/286881. Nevertheless, the model is in fact GPT-4, even if it states otherwise.

Data Privacy Notice

Note that this document is provided to support English-speaking users; the legally binding document is the German version.

Data Processor

The responsible party for data processing within the meaning of Art. 4 No. 7 GDPR and other national data protection laws of the member states as well as other data protection regulations is the:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Germany
Tel: +49 (0) 551 39-30001
E-mail: support@gwdg.de
Website: www.gwdg.de

Represented by the managing director. The controller is the natural or legal person who alone or jointly with others determines the purposes and means of the processing of personal data.

Contact person / Data protection officer

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Germany
Phone: +49 (0) 551 39-30001
E-mail: support@gwdg.de

General information on data processing

Overview of the service

The ChatAI service consists of several components, particularly a web frontend and large language models in the backend. The frontend provides users with a web interface to directly enter user queries via a browser. Additionally, users can select their desired model and adjust certain settings. The frontend forwards all requests to the selected model backend. For data privacy reasons, a distinction is made between models hosted locally by the GWDG and external models provided by other vendors, with the latter being clearly marked as such. The backend is hosted via the GWDG’s SAIA platform, which receives all requests and forwards them to the appropriate model. In the case of external models, the requests—specifically, the history transmitted from the browser, including intermediate model texts, and any “memories” created by the user—are forwarded to the respective external provider. For self-hosted models, requests are processed solely on GWDG’s systems.

Additionally, users can activate so-called tools (“GWDG Tools” in the frontend, “Tools” in the OpenAI API) either through the frontend or via API. These tools intervene in user requests and provide a wide range of enhanced functionalities. Most of the offered tools utilize services provided by the GWDG. However, certain functionalities (e.g., web search) can only be delivered through external services, and these are marked in the frontend with a data privacy warning.

Service Components - Simplified Overview

Scope of the processing of personal data

We only process our users’ personal data to the extent necessary to provide a functional website and our content and services. The processing of personal data of our users takes place regularly only with the consent of the user (Art. 6 para. 1 lit. a GDPR). An exception applies in cases where prior consent cannot be obtained for factual reasons and the processing of the data is permitted by law.

Insofar as we obtain consent from the data subject for processing operations involving personal data, Article 6 (1) lit. (a) of the EU General Data Protection Regulation (GDPR) is the legal basis for personal data processing.

When processing personal data that is necessary for the performance of a contract to which the data subject is party, Article 6 (1) lit. (b) GDPR is the legal basis. This also applies to processing operations that are necessary for the performance of pre-contractual measures.

Insofar as the processing of personal data is necessary for compliance with a legal obligation to which our company is subject, Article 6 (1) lit. (c) GDPR is the legal basis.

Where the processing of personal data is necessary in order to protect the vital interests of the data subject or another natural person, the legal basis is Article 6 (1) lit. (d) GDPR.

If the processing is necessary to protect a legitimate interest of our company or a third party and the interests, fundamental rights and freedoms of the data subject do not outweigh the first-mentioned interest, Article 6 (1) lit. (f) GDPR is the legal basis for the processing.

Use of the Chat-AI website (frontend)

Description and scope of data processing

Each time https://chat-ai.academiccloud.de/ is accessed, the system automatically collects data and information from the computer system of the accessing computer. The following data is collected in each case:

  • Date of access
  • Name of the operating system installed on the accessing device
  • Name of the browser used
  • Source system via which the access was made
  • The IP address of the accessing device

The data is also stored in the log files of our system. This data is not stored together with other personal data of the user. All Chat-AI data displayed in the browser is only stored on the client side in the user’s browser and is only transmitted to the server for the necessary processing when the user requests it, i.e. while the data is being processed by the backend models. After the end of a session in the browser, no more user input is available.

General use of models

Description and scope of data processing

For billing purposes, the following data is stored and logged on the GWDG server for each request:

  • Date of the request
  • user ID
  • Length of the request and response

This data is also stored in the log files of our system. This data is not stored together with other personal data of the user. Depending on whether locally hosted models or external models are used, slightly different data protection provisions apply. No liability can be accepted for the automatically generated answers. Answers may be completely incorrect, contain incorrect partial information or may have unlawful content.

Duration of storage

The billing data is stored for one year.

Use of self-hosted models

Description and scope of data processing

In order to use the models hosted by the GWDG, the user’s input/requests are processed on the GWDG’s systems. Protecting the privacy of user requests is of fundamental importance to us. For this reason, our service in combination with the self-hosted models does not store the contents of the requests (chat history), nor are requests or responses stored on a permanent memory at any time.

Duration of storage

The entries are only stored on the GWDG server during processing by the Large Language Models themselves, i.e. while the data is being processed on GWDG’s own systems.

Use of external models from OpenAI

Description and scope of data processing

In order to use the OpenAI models, we send the respective request (user input) from our server to the Microsoft servers (external service provider). The following data is forwarded to fulfil the service:

  • User request

Information about the users themselves is not forwarded by GWDG. However, the user’s enquiry is forwarded unfiltered, i.e. personal information contained in the enquiry itself is forwarded to the external service provider. The GWDG service is based on the Data Processing Addendum ( https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy, https://www.microsoft.com/licensing/docs/view/Microsoft-Products-and-Services-Data-Protection-Addendum-DPA). This means that there is an adequacy decision in accordance with the European General Data Protection Regulation, but data transmission to third parties cannot be ruled out by GWDG.

The requests are anonymised by the GWDG servers and are sent to the external service provider, where, in accordance with the Microsoft Data Processing Addendum, they are only logged for up to 30 days in the event of a suspected attempt at misuse, e.g. to create hateful or sexualised content. This happens automatically if the backend detects an attempted misuse. It cannot be ruled out that legitimate requests may be incorrectly categorised and logged as attempted abuse.

Possibility of objection and removal

The recording of the user’s input and the processing by Microsoft is mandatory for the provision of the external models. Consequently, there is no possibility for the user to object.

Use of Research Partner Models

Description and Scope of Data Processing

GWDG has research partners who host models externally on their compute resources. For this purpose, GWDG forwards the corresponding user request to the research partners. Information about the users themselves is not forwarded. However, the user requests are forwarded unfiltered, i.e. personal data contained in the request itself is forwarded to the research partners. The GWDG has a research contract with the relevant research partners. Models that are hosted by research partners are marked within the ChatAI web interface by the suffix “(Research Partner)”. The use of these models is at your own risk.

Use of models hosted at LUIS

Leibniz University IT Services (LUIS) operates an inference cluster as part of the KISSKI project, which hosts some large language models. These models are labelled with “LUIS” in the model menu of the ChatAI service.

Description and scope of data processing

In order to use the models hosted by LUIS, the user’s input/requests are processed on the LUIS systems. Protecting the privacy of our users’ data is of fundamental importance to us. For this reason, our service, in combination with the models hosted at LUIS, does not store the chat history, nor are requests or answers stored on permanent storage at any time.

Duration of storage

The input is only stored on the servers hosted by LUIS during processing of the language models, i.e., while a response to the request is being generated.

Use of Tools / GWDG Tools

Tools extend the capabilities and power of ChatAI, for example through internet search, vector databases, external MCP servers, or the generation of images, audio, etc. Unless explicitly stated otherwise, the tools provided are internal to GWDG. Web search and MCP server services are external services. Tools must be selected by users via opt-in in the frontend or requested through the API. The tools provided by GWDG are typically made known to the selected model, which then decides—based on the user request—whether to use one or multiple tools (e.g., when the user instructs the model to generate an image). In such cases, the tools are invoked with parameters dependent on the user request. The vector database (Arcana) is a special case. All tool activities are transparently displayed in ChatAI for users.

GWDG-Internal Tools

Description and Scope of Data Processing

GWDG-internal tools such as image generation receive model requests (e.g., “generate an audio output: ‘Hello Data Protection Officer’”), process them, and return the results to the models. At no point are requests or responses (including artifacts such as images) permanently stored; instead, responses and artifacts are immediately returned to the users.

Vector Database / RAG System / Arcana

The GWDG’s Arcana system provides users with a database that makes datasets searchable within ChatAI and uses them as references. The entire system is provided internally by GWDG and consists of a web UI—the RAG Manager—and an integration into ChatAI.

Description and Scope of Data Processing

Generally, a distinction is made between the role of the developer and that of the user. The developer provides contextual data used to build an index server-side. This index is persistent and stored, and can be used across multiple sessions and by multiple users. The index enables users to access a large language model that can leverage specific knowledge from the provided contextual data to answer individual user queries. To do this, the developer uploads data via the RAG Manager, where it is stored and indexed in various datasets called “Arcanas.” An Arcana can be shared with any number of users. Each user must know the name or ID of the Arcana. It is crucial to emphasize that any person with access to an Arcana can access the knowledge it contains.

The contextual data provided by developers is server-side indexed into an Arcana and secured with a password. The Arcana is then imported into the context of open-source models in ChatAI or exported via API, provided the user supplies an ID.

Duration of Storage

The contextual data provided by developers is stored permanently until explicitly deleted by the developers. User requests and responses continue to be stored only locally on the users’ client systems, as described in the section “Use of Self-Hosted Models.” The request exists solely on GWDG’s servers during processing.

External Tools

External tools are used to forward parts of the user request to third-party services. GWDG assumes no liability for the use of external tools!

Description and Scope of Data Processing

The external service provider (e.g., Google) receives the search request initiated by the model and returns the results of the web search as references. For external MCP servers, function arguments are passed. No additional information about users is transmitted, nor are details about user browsers or similar shared. If the model decides to use personally identifiable information entered by users—such as when instructed to search for a person online—this is precisely the intended functionality.

Duration of Storage

Information is not stored by GWDG. However, a search engine or MCP server may store the performed request issued by the LLM.

Rights of data subjects

You have various rights with regard to the processing of your personal data. We list them in the following, but there are also references to the articles (GDPR) and/or paragraphs (BDSG (2018)) which provide even more detailed information.

Right of access by the data subject (Article 15 GDPR; § 34 BDSG)

You may request confirmation from the controller whether we process personal data related to you. This includes the right to obtain access to information as to whether the personal data concerning you is transferred to a third country or to an international organisation.

Right to rectification (Article 16 GDPR)

You have a right of rectification and / or completion vis-à-vis the controller if the personal data processed related to you is inaccurate or incomplete. The controller must perform rectification immediately.

Right to erasure / “Right to be forgotten” / Right to restriction of processing (Article 17/18 GDPR; § 35 BDSG)

You have the right to request the immediate erasure of your personal data from the controller. As an alternative, you may request that the controller restrict the processing; the applicable restrictions are set out in the GDPR/BDSG under the articles and sections mentioned.

Notification obligation regarding rectification or erasure of personal data or restriction of processing (“Right to be informed”) (Article 19 GDPR)

If you have asserted the right to rectification, erasure or restriction of processing vis-à-vis the controller, the controller is obligated to communicate such rectification or erasure of the data or restriction of processing to all recipients to whom the personal data concerning you has been disclosed, unless this proves impossible or involves disproportionate effort. You have the right vis-à-vis the controller to be informed about these recipients.

Right to data portability (Article 20 GDPR)

You have the right to receive the personal data concerning you, which you have provided to the controller, in a structured, commonly used and machine-readable format. In addition to the scenarios presented in and provisions of the GDPR, it must be noted that portability of mass data / user data is limited to technical readability. The right to data portability does not include that the data created by the user in a proprietary format is converted by the controller into a commonly used, i.e. standardised format.

Right of objection (Article 21 GDPR; § 36 BDSG)

You have the right to object to the processing if this is based only on the controller weighing any interests (see Article 6 (1) lit. (f) GDPR).

Right to withdraw consent in terms of data protection laws (Article 7 (3) GDPR)

You have the right to withdraw your consent under data protection laws at any time. The withdrawal of consent does not affect the lawfulness of processing based on such consent before its withdrawal.

Right to complain to a supervisory authority (Article 77 GDPR)

Without prejudice to any other administrative or judicial remedy, you have the right to lodge a complaint with a supervisory authority, in particular in the Member State of your habitual residence, place of work or place of the alleged infringement if you consider that the processing of personal data relating to you infringes the GDPR.

Datenschutzhinweis

Verantwortlich für die Datenverarbeitung

Der Verantwortliche für die Datenverarbeitung im Sinne des Art. 4 Nr. 7 DSGVO und anderer nationaler Datenschutzgesetze der Mitgliedsstaaten sowie sonstiger datenschutzrechtlicher Bestimmungen ist die:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de
Website: www.gwdg.de

Vertreten durch den Geschäftsführer. Verantwortliche Stelle ist die natürliche oder juristische Person, die allein oder gemeinsam mit anderen über die Zwecke und Mittel der Verarbeitung von personenbezogenen Daten entscheidet.

Ansprechpartner / Datenschutzbeauftragter

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de

Allgemeines zur Datenverarbeitung

Geltungsbereich im Falle individueller Vereinbarungen

Wenn Sie Zugriff auf diesen LLM-Dienst durch ihre Organisation erhalten, gelten die Richtlinien und Datenschutzhinweise Ihres Unternehmens. Im Falle eines Konflikts zwischen diesen Datenschutzbestimmungen und den Bedingungen einer bzw. mehrerer Vereinbarung(en) mit der GWDG, z.B. einem mit der GWDG geschlossenem Auftragsverarbeitungsvertrags, sind stets die Bedingungen dieser Vereinbarung(en) ausschlaggebend. Kardinalpflichten genießen stets Vorrang vor diesen allgemeinen Bestimmungen.

Sie können im Zweifelsfall bei Ihrem Institut in Erfahrung bringen, welche Datenschutzrichtlinien für Sie gelten.

Übersicht über den Service

Der Service ChatAI besteht aus mehreren Komponenten, insbesondere einem Web-Frontend und Large Language Modellen im Backend. Das Frontend bietet Nutzer:Innen ein Webinterface, um Benutzeranfragen direkt mittels Browser eingeben zu können. Weiterhin kann das gewünschte Modell ausgewählt und gewisse Einstellungen können vorgenommen werden. Das Frontend leitet alle Anfragen weiter an das ausgewählte Modell-Backend. Es wird hierbei aus Datenschutzgründen zwischen lokal von der GWDG gehosteten Modellen und externen Modellen von anderen Anbietern unterschieden, wobei die externen Modelle als solche gekennzeichnet sind. Das Backend, wird mittels der GWDG-Plattform SAIA gehostet, es nimmt alle Anfragen entgegen und leitet diese an das entsprechende Modell weiter. Im Falle von externen Modellen, werden die Anfragen, genauer, die im Browser übermittelte Historie inklusive der Zwischentexte der Modelle, sowie etwaige “Memories”, wie von der Nutzer:in getätigt, an den jeweiligen externen Anbieter weitergeleitet. Bei den selbst gehosteten Modellen werden die Anfragen nur auf den Systemen der GWDG verarbeitet. Zusätzlich können vom Frontend oder via API sogenannte Werkzeuge (“GWDG Tools” im Frontend genannt, “Tools” in der OpenAI API) von den Nutzenden aktiviert werden. Werkzeuge greifen in die Anfragen der Nutzenden ein und stellen vielfältig erweiterte Funktionalitäten bereit. Die meisten angebotenen Werkzeuge nutzen von der GWDG bereitstellte Dienste. Manche Funktionalitäten (bspw. Websuche) können jedoch nur durch externe Dienste bereitgestellt werden, diese Funktionalitäten werden dann im Frontend mit einer Datenschutz-Warnung versehen.

Komponenten des Dienstes - Vereinfachte Darstellung

Umfang der Verarbeitung personenbezogener Daten

Wir verarbeiten personenbezogene Daten unserer Nutzer grundsätzlich nur soweit dies zur Bereitstellung einer funktionsfähigen Website sowie unserer Inhalte und Leistungen erforderlich ist. Die Verarbeitung personenbezogener Daten unserer Nutzer erfolgt regelmäßig nur nach Einwilligung des Nutzers (Art. 6 Abs. 1 lit. a DSGVO). Eine Ausnahme gilt in solchen Fällen, in denen eine vorherige Einholung einer Einwilligung aus tatsächlichen Gründen nicht möglich ist und die Verarbeitung der Daten durch gesetzliche Vorschriften gestattet ist.

Rechtsgrundlage für die Verarbeitung personenbezogener Daten

Soweit wir für Verarbeitungsvorgänge personenbezogener Daten eine Einwilligung der betroffenen Person einholen, dient Art. 6 Abs. 1 lit. a EU-Datenschutzgrundverordnung (DSGVO) als Rechtsgrundlage. Bei der Verarbeitung von personenbezogenen Daten, die zur Erfüllung eines Vertrages, dessen Vertragspartei die betroffene Person ist, dient Art. 6 Abs. 1 lit. b DSGVO als Rechtsgrundlage. Dies gilt auch für Verarbeitungsvorgänge, die zur Durchführung vorvertraglicher Maßnahmen erforderlich sind. Soweit eine Verarbeitung personenbezogener Daten zur Erfüllung einer rechtlichen Verpflichtung erforderlich ist, der unser Unternehmen unterliegt, dient Art. 6 Abs. 1 lit. c DSGVO als Rechtsgrundlage. Für den Fall, dass lebenswichtige Interessen der betroffenen Person oder einer anderen natürlichen Person eine Verarbeitung personenbezogener Daten erforderlich machen, dient Art. 6 Abs. 1 lit. d DSGVO als Rechtsgrundlage. Ist die Verarbeitung zur Wahrung eines berechtigten Interesses unseres Unternehmens oder eines Dritten erforderlich und überwiegen die Interessen, Grundrechte und Grundfreiheiten des Betroffenen das erstgenannte Interesse nicht, so dient Art. 6 Abs. 1 lit. f DSGVO als Rechtsgrundlage für die Verarbeitung.

Nutzung der Chat-AI Webseite (Frontend)

Beschreibung und Umfang der Datenverarbeitung

Bei jedem Aufruf von https://chat-ai.academiccloud.de/ erfasst das System automatisiert Daten und Informationen vom Computersystem des aufrufenden Rechners. Folgende Daten werden hierbei in jedem Fall erhoben:

  • Zeitstempel des Zugriffs
  • Name des auf dem zugreifenden Gerät installierten Betriebssystems (User-Agent)
  • Name des verwendeten Browsers (User-Agent)
  • Die IP-Adresse des zugreifenden Geräts

Die Daten werden ebenfalls in den Logfiles unseres Systems gespeichert. Eine Speicherung dieser Daten zusammen mit anderen personenbezogenen Daten des Nutzenden findet nicht statt.

Info

Sämtliche im Browser angezeigten Daten von Chat-AI werden nur Clientseitig im Browser der Nutzenden vorgehalten und nur bei der Benutzer-gewünschten Anfrage für die notwendige Verarbeitung an die Server übermittelt, d.h. während die Daten von den Backend-Modellen verarbeitet werden. Nach dem Ende einer Session im Browser sind keine Eingaben des Nutzers mehr vorhanden.

General use of models

Description and scope of data processing

For billing purposes, the following data is stored and logged on the GWDG server for each request:

  • Timestamp of the request
  • User ID
  • Length of the request and length of the response

This data is also stored in the log files of our system. It is not stored together with other personal data of the user. Depending on whether locally hosted models or external models are used, slightly different privacy terms apply. No liability whatsoever can be assumed for the automatically generated responses. Responses can be entirely wrong, contain incorrect partial information, or contain unlawful content.

Duration of storage

The billing data is stored for one year.

Use of GWDG-hosted models

Description and scope of data processing

To use the models hosted at GWDG, the inputs/requests of the users are processed on GWDG systems. Protecting the privacy of user requests is of fundamental importance to us. For this reason, our service, in combination with the locally hosted models at GWDG, neither stores the content of your requests (chat history), nor are prompts or responses written to persistent storage at any time.

Duration of storage

The inputs are only held on the GWDG server while they are being processed by the large language models themselves, i.e. while the data is processed on our own systems.

Use of external models from OpenAI

Description and scope of data processing

To use the OpenAI models, we forward the respective request (the user's input) from our server to the servers of Microsoft (external service provider). To fulfil the service, the following data is forwarded:

  • The user's request

Information about the users themselves is not passed on by GWDG. However, the user's request is forwarded unfiltered, which means that personal information contained in the request itself is passed on to the external service provider. The GWDG service strictly follows the Data Processing Addendum (https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy, https://www.microsoft.com/licensing/docs/view/Microsoft-Products-and-Services-Data-Protection-Addendum-DPA). This means an adequacy decision pursuant to the European General Data Protection Regulation is in place, but a transfer of data to third parties cannot be ruled out by GWDG.

The requests anonymized by the GWDG servers and sent to the external service provider are logged, in accordance with the Microsoft Data Processing Addendum, only in the case of a suspected misuse attempt, e.g. the creation of hateful or sexualized content, for up to 30 days. This happens automatically when the backend determines that a misuse attempt has occurred. It cannot be ruled out that legitimate requests are mistakenly classified as misuse attempts and logged.

Possibility of objection and removal

Capturing the user's input and its processing by Microsoft are strictly required for providing the external models. Consequently, the user has no possibility to object.

Use of partner models from research

Description and scope of data processing

GWDG has research partners that host models externally on their own compute resources. For this, GWDG forwards the corresponding user request to the research partner. Information about the users themselves is not forwarded. However, the user requests are forwarded unfiltered, i.e. personal data contained in the request itself is forwarded to the research partners. The data processing is based on a joint controllership agreement between GWDG and the respective research partners. The corresponding models hosted by the research partners are clearly marked in the Chat AI web interface by an appended “(Research Partner)”. These models are used at your own risk.

Use of models at LUIS

The Leibniz University IT Services (LUIS) operate an inference cluster providing large language models as part of the KISSKI project. These models are marked with LUIS in the menu of the Chat AI service.

Description and scope of data processing

To use the models hosted at LUIS, the inputs/requests of the users are processed on LUIS systems. Protecting the privacy of user requests is of fundamental importance to us. For this reason, our service, in combination with the models hosted at LUIS, neither stores the chat history, nor are prompts or responses written to persistent storage at any time.

Duration of storage

The inputs are only held on the LUIS server while they are being processed by the language models themselves, i.e. while a response to the request is being generated.

Use of tools / GWDG tools

Tools extend the capabilities and power of Chat AI, for example through a web search, a vector database, external MCP servers, or the creation of images, audio, etc. Unless explicitly stated otherwise, the tools are provided internally by GWDG; web search and MCP server services are external services. Tools must be selected by users via opt-in in the frontend or requested via the API. The tools provided by GWDG are typically made known to the selected model, and the models then decide, based on the user request, whether to use one or more tools, for example when the user asks the model to generate an image. In that case, the tools are invoked with parameters that depend on the user request. The vector database (Arcana) is a special case. All tool activity is displayed transparently to users in Chat AI.

GWDG-internal tools

Description and scope of data processing

GWDG-internal tools such as image generation receive the request from the models (e.g. 'create an audio output “Hallo Datenschützer”'), process it, and make the result available to the models. At no time are requests or responses (including artifacts such as images) stored permanently; responses and artifacts are returned directly to the user.

Vector database / RAG system / Arcana

The GWDG Arcana system provides users with a database that makes datasets searchable in Chat AI and uses them as references. The entire system is provided internally by GWDG; it consists of a web UI, the RAGmanager, and an integration into Chat AI.

Description and scope of data processing

In principle, a distinction is made between the developer role and the user role. The developer provides context data that is used to build an index on the server side. This index is persistent/stored and can be used across multiple sessions and by multiple users. The index is used to give users access to a large language model that can use specific knowledge from the provided context data to answer the individual requests of users. For this, the developer uploads the data via the RAGManager, where it is held and indexed in different datasets called Arcanas. An Arcana can be shared with any number of people; each user must know the name of the Arcana or its ID. It is important to make clear that every person who has access to the Arcana can gain access to the knowledge it contains.

The context data provided by the developers is indexed server-side into an Arcana and secured with a password. If an ID is provided by the user, the Arcana is then made available in the context of the open-source models in Chat AI or via the API.

Duration of storage

The context data provided by the developers is stored permanently until it is explicitly deleted by the developers. The requests and responses of the users continue to be stored only locally on the users' client systems, as described in the section “Use of GWDG-hosted models”. A request is present on the GWDG servers only while it is being processed.

External tools

The very purpose of external tools is to pass on parts of the user request. GWDG cannot accept any liability for the use of external tools!

Description and scope of data processing

The external service provider (e.g. Google) receives the search query requested by the model, and the results of the web search are displayed as references. For external MCP servers, the function arguments are passed on. No additional information about the users is transmitted, nor is any information about the user's browser etc. passed on. If the model decides to use personal data entered by the user, for example because it is asked to search the internet for a person, this is exactly the intended functionality.

Duration of storage

The information is not stored at GWDG; a search engine or MCP server may store the submitted query.

Rights of the data subjects

You have various rights with regard to the processing of your personal data. They are listed below, together with references to the articles (GDPR) or sections (BDSG (2018)) containing more detailed information.

Right of access (GDPR Art. 15, BDSG §34)

You can request confirmation from the controller as to whether personal data concerning you is being processed by us. This includes the right to request information on whether the personal data concerned is transferred to a third country or to an international organization.

Right to rectification (GDPR Art. 16)

You have a right to rectification and/or completion vis-à-vis the controller if the processed personal data concerning you is inaccurate or incomplete. The controller must carry out the rectification without undue delay.

Right to erasure / “right to be forgotten” / right to restriction of processing (GDPR Art. 17, 18, BDSG §35)

You have the right to demand the immediate erasure of your personal data from the controller. Alternatively, you can demand the restriction of processing from the controller. The applicable limitations are set out in the GDPR and the BDSG under the articles and sections mentioned.

Right to notification (GDPR Art. 19)

If you have exercised your right to rectification, erasure or restriction of processing vis-à-vis the controller, the controller is obliged to notify all recipients to whom the personal data concerning you has been disclosed of this rectification, erasure or restriction of processing, unless this proves impossible or involves a disproportionate effort. You have the right to be informed by the controller about these recipients.

Right to data portability (GDPR Art. 20)

You have the right to receive the personal data concerning you that you provided to the controller in a structured, commonly used and machine-readable format. In addition to the GDPR, it should be noted that for bulk data / user data, data portability is limited exclusively to technical readability. The right to data portability does not include the conversion of data created by the user in a proprietary format into a “common”, i.e. standardized, format by the controller.

Right to object (GDPR Art. 21, BDSG §36)

You have the right to object to the processing if it is based solely on a balancing of interests by the controller (cf. GDPR Art. 6(1)(f)).

Right to withdraw the declaration of consent under data protection law (GDPR Art. 7(3))

You have the right to withdraw your declaration of consent under data protection law at any time. The withdrawal of consent does not affect the lawfulness of the processing carried out on the basis of the consent up to the point of withdrawal.

Right to lodge a complaint with a supervisory authority (GDPR Art. 77)

Without prejudice to any other administrative or judicial remedy, you have the right to lodge a complaint with a supervisory authority, in particular in the member state of your habitual residence, your place of work or the place of the alleged infringement, if you consider that the processing of personal data concerning you infringes the GDPR.

Terms of use

§1 General provisions

The Terms of Use of the Academic Cloud (https://academiccloud.de/terms-of-use/) apply.

§2 Registration and access

Access to this service requires an AcademicCloud ID. The use of an AcademicCloud ID is subject to acceptance of the “Academic Cloud” terms of use.

§3 Authorized use

Users are obliged to use the technology or services exclusively for authorized and lawful purposes, complying with all applicable laws, regulations and the rights of others, including national, federal, state, local and international laws.

§4 Development

You acknowledge that we may develop or procure similar software, technology or information from other sources. This acknowledgement does not create any restrictions on our development or competitive efforts.

§5 Updates

GWDG will keep the service up to date through regular updates, which may take place within and outside GWDG's dedicated maintenance window within reasonable time frames.

§6 Prohibitions

(1) Users are prohibited from using this service to transmit, generate or distribute content (input and output) that:

  • depicts child pornography or sexual abuse, including the forgery, deception or imitation thereof;
  • is sexually explicit and used for non-educational or non-scientific purposes;
  • is discriminatory or promotes violence, hate speech or illegal activities;
  • violates data protection laws, including the collection or distribution of personal data without consent;
  • is fraudulent, misleading, harmful or deceptive;
  • promotes self-harm, harassment, bullying, violence or terrorism;
  • promotes illegal activities or infringes intellectual property rights and other legal and ethical boundaries of online behaviour;
  • attempts to circumvent our security measures or to trigger actions that deliberately violate established policies;
  • could unjustifiably or adversely affect individuals, in particular with regard to sensitive or protected characteristics;

(2) Users are prohibited from carrying out the following activities:

  • reverse engineering, decompiling or disassembling the technology;
  • unauthorized activities such as spamming, distributing malware or disruptive behaviour that impairs the quality of the service;
  • modifying, copying, renting, selling or distributing our service;
  • tracking or monitoring individuals without their explicit consent.

(3) Processing protected or confidential information unless the legal requirements are met. The service is technically designed to be able to process sensitive data, i.e. data that, among other things,

  • contains confidential or sensitive information;
  • includes sensitive or controlled data, such as specially protected data as listed in Article 9(1) GDPR;
  • or research involving human subjects;

However, the necessary legal framework must exist or be created in order to ensure lawful processing. This may, for example, require the conclusion of a data processing agreement pursuant to Art. 28 GDPR.

(4) If you use this service on behalf of an organization rather than as a private individual, a data processing agreement should be concluded between your organization and GWDG. If there are any uncertainties regarding the security or data protection of the service, please contact the data protection officer via the mailbox support@gwdg.de with the subject “Datenschutz ChatAI”.

(5) For research purposes, the usage scenarios in (1) may be permitted in certain cases. For these cases, which are limited to the specific purpose, written agreements must be made between the users and GWDG.

(6) In particular, when using the server-side RAG system, no context data listed under (1) and (3) may be uploaded to the Arcanas. If a valid legal framework exists that permits such use, individual contractual agreements take precedence.

§7 Termination and suspension

(1) You can terminate your use of the “Chat AI” service and your legal relationship with us at any time by no longer using the service. If you are a consumer in the EU, you have the right to withdraw from these terms within 14 days of acceptance by contacting support.

(2) We reserve the right to suspend or terminate your access to the “Chat AI” service or to deactivate your account if you violate these terms of use, if this is necessary to comply with legal requirements, or if your use of our service poses a risk or harm to us, our users or third parties.

(3) We will send you a notification before deactivating your account, unless this is not possible or not permitted by law. If you believe that your account has been blocked or deactivated in error, you can contact support to contest this.

(4) We reserve the right to take legal action to protect our intellectual property rights and the safety of our users. In the event of a violation of these terms or of illegal activities carried out through the use of our service, civil sanctions, damages, administrative penalties, criminal prosecution or other legal options may be pursued.

§8 Correctness of results

The outputs generated by our services are not always unique, correct or precise. They may contain inaccuracies, even if they appear detailed. Users should not rely solely on these results without independently verifying their accuracy. Furthermore, the information provided by our services may be incomplete or outdated, and some results may not reflect our perspectives. Users should therefore exercise caution and not use the services for important decisions, especially in areas such as medicine, law, finance and other professional fields in which expert knowledge is essential. It is important to understand that AI and machine learning are constantly evolving, and although we strive to improve the accuracy and reliability of our services, users should always assess the accuracy of the results and ensure that they meet their specific requirements by manually reviewing them before use or distribution. In addition, users should refrain from using results relating to individuals for purposes that could have a significant impact on them, such as legal or financial decisions. Finally, users should be aware that incomplete, inaccurate or offensive results may occur that do not reflect the views of GWDG or its affiliated parties.

§9 Limitation of liability

(1) General limitation of liability: GWDG accepts no liability whatsoever for claims for damages by users based on the use of the “Chat AI” service. The limitation of liability set out here and explained in more detail in the following sections derives from the fact that GWDG merely provides a platform for using language models. On this platform, GWDG cannot provide any technical measures to intervene in the responses generated by the language models in such a way that damage caused by improper use by the users can be ruled out. Full liability for claims for damages resulting from improper use of this platform therefore remains with the users. Excluded from this are claims based on injury to life, limb or health, on gross fault, or on intentional or grossly negligent breaches of duty. The breach of cardinal obligations is likewise excluded from the general exclusion of liability.

(2) Copyright: Users of the “Chat AI” service bear full and sole responsibility for observing and complying with the applicable provisions of copyright law. GWDG explicitly points out to users that the provided language models were trained by third parties and that GWDG has no declaration restricting the materials used to free licenses. GWDG therefore cannot rule out that the provided language models were trained with copyright-protected content. Responses that the language models give to users may consequently contain copyright-protected content. GWDG explicitly advises users that directly reusing received responses is not recommended. The responsibility for checking copyright in such cases lies solely with the users. GWDG accepts no liability for any claims for damages arising from copyright infringements.

(3) Confidential information: We cannot accept any liability for the loss or publication of data that users provide in their requests. Excluded from this are grossly negligent conduct and breaches of cardinal obligations.

(4) Patent law: Users of the “Chat AI” service bear full and sole responsibility for observing and complying with the applicable provisions of patent law. Responses of the provided language models may contain patent-protected content in their conceptual ideas. GWDG explicitly advises users that directly reusing ideas and concepts conveyed in received responses is not recommended. The responsibility for checking patent protection in such cases lies solely with the users. GWDG accepts no liability for any claims for damages arising from patent infringements.

(5) Misinformation: GWDG points out to users of the “Chat AI” service that it is an intrinsic property of the provided language models to freely invent content; for language models this is known as “hallucination”. The information contained in the responses may be outdated, invented, inappropriate, taken out of context, or wrong. This does not constitute a malfunction of the provided platform, as it is technically to be expected from language models as a tool. The independent and critical verification of the received information is solely the responsibility of the users. GWDG accepts no liability for the information contained in the responses of the language models. Further information on this can be found in §8 (Correctness of results).

(6) When using the server-side RAG system, users bear sole responsibility for the lawfulness of the documents they upload, store and index in the Arcanas. GWDG accepts no liability for the documents uploaded to the RAG system by users.

§10 Third-party services

Our services may include the integration of third-party software, products or services, referred to as “third-party services”. These may provide outputs originating from those services, known as “third-party outputs”. It is important to understand that third-party services operate independently and are governed by their own terms and conditions, which are separate from ours. Users should therefore be aware that we are not responsible for third-party services or their associated terms and conditions. We do not control these services and are therefore not liable for any losses or damages that may arise from their use. Users choose to interact with third-party services themselves and assume full responsibility for any results that may arise from doing so. Furthermore, we make no representations or warranties regarding the performance or reliability of third-party services.

§11 Feedback

We value your feedback on our services and products and encourage you to share your thoughts to help us improve. By providing feedback, you understand that we may disclose, publish, exploit or use it to improve our offerings without owing you any compensation. We reserve the right to use feedback for any purpose without being restricted by confidentiality obligations, regardless of whether it is marked as confidential or not.

§12 Data protection

The privacy of user requests is of fundamental importance to us. Further information can be found in the privacy policy (https://datenschutz.gwdg.de/services/chatai).

§13 Final provisions

These general terms and conditions remain binding and effective in their remaining parts even if individual points are legally invalid. The invalid points are replaced by the statutory provisions, where these exist. However, if this would constitute an unreasonable hardship for one of the contracting parties, the contract as a whole becomes invalid.

Features

This section collects all functionality that extends and customizes Chat AI beyond basic text generation.
Features let you shape the assistant’s personality, integrate external knowledge, and configure model behavior for different use cases.
The Version history is also tracked here, showing which features were added or updated in each release.

Available Features

  • Memory Memories are pieces of information that the chatbot can learn from conversation to behave in a more personalized way.
  • Personas
    Define roles and tones for the assistant (e.g., interviewer, tutor, casual style).
    Personas make it easy to quickly switch the model’s behavior.
  • Arcana / Retrieval-Augmented Generation (RAG)
    Connect your own data sources (documents, notes, datasets) to ground responses in factual context.
  • Tools Chat AI models are given access to a variety of tools to accomplish non text-based tasks or improve responses.
    • Web search
    • Image Generation
    • Image Modification
    • Text to speech (tts)
  • MCP Support You can add custom public Model Context Protocol (MCP) servers to Chat AI.

Subsections of Features

Personas

Chat AI supports loading preset personas from configurations in the form of JSON files. Each JSON file includes the system prompt, settings, and conversations, allowing you to easily load a persona into Chat AI. While these files can be imported using the import function, Chat AI also supports directly importing public JSON files from the web, by specifying it in the URL.

Example personas

If all you need is a quick link to load a specific persona, this is your chapter. These are some of the interesting and useful personas the AI community came up with:

Info

This is where we need your help!

Check out our Chat AI Personas GitHub repository and help us create highly versatile and useful personas. The best ones will be featured on this page.

Using Chat AI Personas

We provide some recommended personas in our Chat AI Personas GitHub repository.

You can create a link that opens Chat AI with the desired persona from any publicly available persona JSON file. To do this, simply add the URL of the JSON file to the import parameter of the URL:

https://chat-ai.academiccloud.de/chat?import=<json_url>

Replace <json_url> with the URL to the JSON file.
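
For example, assuming a persona file were published at a raw GitHub URL (the repository path and file name below are only hypothetical placeholders), the complete link would look like:

https://chat-ai.academiccloud.de/chat?import=https://raw.githubusercontent.com/gwdg/chat-ai-personas/main/interviewer.json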

Custom Personas

You can also create your own custom personas to load directly in the Chat AI interface. These must be saved as a JSON file. (The # comments in the example below are explanatory only; remove them in an actual file, since JSON does not support comments.)

{
    "title": "Sample Persona", # Title of the persona
    "model-name": "Qwen 3 30B A3B Instruct 2507", # The model name that is displayed in the UI
    "model": "qwen3-30b-a3b-instruct-2507", # Model id in the API
    "temperature": 0.2, # Custom temperature setting
    "top_p": 0.2, # Custom top_p setting
    # "messages" defines the system prompt. "role" sets the message type: "system", "user", "assistant", or "info". A "system" message should always come first, optionally followed by an "info" message. "content" is where you enter your custom prompt.
    "messages": [
      {
        "role": "system",
        "content": "<Enter your custom prompt here>"
      },
      {
        "role": "info",
        "content": "< (Optional) Enter an info message to be displayed at the top of the conversation >"
      }
    ]
  }

The latest models are listed here. The API model names are listed here.

Info

Note that clicking on these links loads the persona's configuration, i.e., the system prompt, model, and other settings.

Model Context Protocol (MCP)

Chat AI supports adding public Model Context Protocol (MCP) servers as tool providers to your Chat AI experience.

This tool requires a model context protocol server URL, which should be a simple HTTPS address. In general this allows Chat AI to interact with additional tools, data sources, or further processing capabilities beyond what is built into Chat AI. Any data from your Chat AI context may be sent to the server you entered.

MCP servers are particularly useful for working with up-to-date information: they can provide more targeted results than a web search and are less static than RAG systems like Arcana. Additional tools can also be used, as in this example:

  • MCP Server: https://mcp.deepwiki.com/mcp
  • Prompt: Explain what this Github Repo: https://github.com/gwdg/chat-ai is about

Adding MCP servers running on your own computer, i.e. on http://localhost, is not supported.

Tools

Overview

This document describes a custom multimodal tool server hosted on GWDG infrastructure. It provides core AI capabilities: image generation, image editing, text-to-speech (TTS), and web search, accessible directly through the Chat AI UI. These tools are designed to enrich user interaction by enabling dynamic media creation and transformation within conversational workflows.

Prerequisites

To use the tool server, the following conditions must be met:

  • Default LLM: The system should ideally run Qwen3-30B-A3B-Instruct-2507 as the active language model for optimal performance.
  • Tool Activation: Tools must be enabled in the Chat AI UI by checking the “Enable Tools” box in the settings panel.

Web Interface Example

Web Search is enabled separately, because it can result in data being sent to external service providers. To enable it, check the “GWDG Tools” and “Web Search” checkboxes in the sidebar as shown below. Screenshot of Chat AI sidebar with checked “GWDG Tools” and “Web Search” checkboxes

Once activated, the agent can discover and invoke tools based on user intent.

Available Tools

Tool Name           Description
generate_image      Generates images from text prompts using the FLUX.1-schnell model
edit_image          Applies edits to existing images (e.g., inpainting, masking, style transfer) using Qwen-Image-Edit
speak_text          Converts text to speech using the XTTSv2 model
web_search_preview  Uses a web search provider, such as Google, to provide additional information on an LLM-provided query

The web search tool allows the AI to look up the latest information from the internet to improve its responses. When enabled, the AI can generate search queries based on your question and the full conversation history, send them to a search engine (such as Google), and use the retrieved results to provide more accurate and up-to-date answers. This is especially useful for topics where current or rapidly changing information is important. Web Search is not available for externally hosted models. You may need to explicitly ask the model to search the web for it to make such a tool call.

Usage Flow

Once tools are enabled, the agent follows a structured flow to interpret user input and invoke the appropriate tool:

  1. Tool Discovery
    The agent lists available tools and their capabilities.
    Example: “What tools can I use?” → Agent responds with generate_image, edit_image, speak_text.

Web Interface Example

  2. Tool Selection
    Based on user intent, the agent selects the relevant tool.
    Example: “Make an image of a glowing jellyfish in deep space” → Agent selects generate_image.

  3. Invocation
    The agent sends a structured input payload to the tool server.
    Example:

    {
      "prompt": "a glowing jellyfish floating in deep space",
      "size": "1024x1024"
    }
  4. Response Handling
    The agent receives the output and renders it in the UI or stores it for further use.

Web Interface Example

  5. More Examples

    • “Change this image to Van Gogh style” → edit_image → applies style transfer

Web Interface Example

  • “Make audio from this text: ‘Welcome to GWDG. Your research matters.’” → speak_text → plays audio

Web Interface Example

Versions

This page lists all Chat AI releases, starting with the newest.
Each entry describes new features, improvements, and fixes, with short explanations for how to use them.


v0.9.0 — September 2025

New Features

  • Redesigned UI
    A fresh interface with collapsible left and right sidebars for maximum chatting space.
    Optimized for handling arbitrarily large conversations and attachments smoothly.

  • New Model Selector
    Now located in the top center. Displays many more models compared to before.
    Default model switched to Qwen 3 30B A3B Instruct 2507, chosen for speed, tool compatibility, and strong performance.

  • GWDG Tools Integration
    Enable GWDG tools in the settings panel to unlock new capabilities:

    • Web search with Google
    • Image generation and modification
    • Speech generation (e.g., text-to-speech)
    • Arcana/RAG with any model
    • Custom MCP Server: specify the URL of any MCP server to access its tools in addition to GWDG tools.
      Chat AI displays real-time updates as tools are used.
      ⚠️ Tools are a new feature and may not yet work with all models. Recommended with Qwen 3 30B A3B Instruct 2507.
  • Export Data
    From your profile → settings, you can now back up all your data and save it as a single JSON file.

Improvements

  • Better code sanitization to prevent cross-site styling.
  • Smoother auto-scrolling in long conversations.
  • New navigation menu to other AI services.
  • Unified attach button for consistency.
  • Numerous small UI/UX fixes and optimizations.

v0.8.1 — July 3, 2025

New Features

  • Memory
    Chat AI can now remember relevant details across all conversations.
    ⚠️ Memory is stored locally in the browser only (not on servers).
    This allows more natural ongoing conversations, but clearing browser storage resets it.
  • Model search in selection menu
    Quickly find models by typing their name.
  • Configurable global timeout
    Prevents endless-loop responses by setting a max time per response.
  • LaTeX rendering option
    In addition to Markdown and Plaintext, responses can now render LaTeX properly.
  • Default settings in config file
    Define startup defaults (e.g., temperature, theme) in configuration.

Fixes

  • Fixed rendering bug in last line of text before references.
  • Generated arcana links now open in a new conversation.

v0.8.0 — June 4, 2025

New Features

  • Selectable Personas in UI
    Load personas directly in the interface from chat-ai-personas.
    Personas let you quickly change the assistant’s role.
  • Info messages in imported conversations/personas are now supported.
  • Standalone UI mode
    Simplified setup compatible with SAIA API key.

Improvements

  • Better import/export of conversations.

v0.7.4 — May 19, 2025

Fixes

  • Fixed syncing issues when multiple tabs are open.
  • Updated URL parsing for arcanas.

Improvements

  • References can now be rendered as links.

v0.7.3 — April 24, 2025

New Features

  • Sidebar settings panel replaces old settings popup.
  • Attachments as boxes with thumbnails in prompts.
  • Expanded attachment support for more file types.
  • Version number displayed in footer.

Changes

  • Removed key field from arcana settings.

v0.7.2 — April 15, 2025

New Features

  • Improved attachments display with resend, edit, undo support.
  • Added support for more file types.
  • Updated references format to align with RAG requirements.
  • Choose response rendering mode (Markdown, LaTeX, Plaintext).
  • Download responses as PDF.

Fixes

  • LaTeX, code, and Markdown rendering bugs fixed.
  • UI scrollbar issue fixed.
  • Multiple window/tab stability improved.

v0.7.1 — Feb 26, 2025

New Features

  • Video input support.
  • PDF processing via docling.
  • Image & text attachment previews.
  • Edit button for responses.

Improvements

  • Updated logo file.
  • Improved markdown/LaTeX rendering.
  • Smoother response display.
  • Better support for multiple conversations.

v0.7.0 — Feb 26, 2025

New Features

  • Arcanas supported.
  • Multiple conversation support.
  • Profile window added.
  • Scale-to-zero models supported.
  • Model status popups.
  • Code copy button for responses.

Fixes

  • Retry button path fixed.
  • LaTeX and Markdown handling improved.
  • Minor UI fixes.

v0.6.3 — Feb 26, 2025

New Features

  • No token limit – unlimited tokens.
  • Image upload – via clipboard or drag-and-drop.
  • CSV upload – supported directly.
  • Temporary model execution – run inactive models briefly.
  • Model status indicators (active, loading, etc.).

v0.6.2 — Feb 26, 2025

  • Updated models API endpoint to /models.

v0.6.1 — Feb 26, 2025

New Features

  • Share model & settings via base64-encoded URL.
  • Import external settings (e.g., personas).

UI Updates

  • More visible scrollbar in model selection.
  • Fixed header on tablets.
  • Cleaner design in options section.

CoCo AI

CoCo AI is our code completion service utilizing Chat AI. To use it you need a SAIA API Key. Many code editors feature LLM integration these days. We provide documentation for using the Continue extension for Visual Studio Code (VSCode), as well as for the Zed editor.

Continue

Setup

For all below commands, Ctrl can be substituted with Cmd for Mac users.

Visual Studio Code (or JetBrains) is required as your IDE (Integrated Development Environment) to use CoCo AI. Install the continue.dev extension from the extension marketplace in your VSCode. Continue is an open-source AI code assistant plugin that can query code snippets or even entire repositories for a chosen model. Continue provides a short, easy-to-follow introduction to its product upon installation. Continue creates a directory .continue in your home folder and, since v1, looks there first for config.yaml. If you do not have this directory or the file, create it. Open config.yaml in an editor of your choice and paste the following YAML:

name: Chat-AI
version: 0.0.1
schema: v1

models:
  - name: Meta Llama 3.1 8B Instruct
    provider: openai
    model: meta-llama-3.1-8b-instruct
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api-key>"
    roles:
      - chat          
      - summarize    

  - name: Meta Llama 3.3 70B Instruct
    provider: openai
    model: llama-3.3-70b-instruct
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api-key>"
    roles:
      - chat          
      - apply         
      - edit         

  - name: Codestral-22B
    provider: openai
    model: codestral-22b
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api-key>"
    defaultCompletionOptions:
      temperature: 0.2
      topP: 0.1
    roles:
      - autocomplete  
      - chat          
      - edit          

Note: roles tells Continue what each model may be used for (chat, autocomplete, edit, etc.). At least one model must advertise autocomplete if you want the Tab-completion feature later.

Note that only a subset of all available models is included above. Furthermore, the OpenAI GPT-3.5 and GPT-4 models are not available for API usage, and thus not available for CoCo AI. Other available models can be included in the same way. Make sure to replace <api_key> with your own API key (see here for API key request). All available models are:

  • “meta-llama-3.1-8b-instruct”
  • “meta-llama-3.3-70b-instruct”
  • “llama-3.1-sauerkrautlm-70b-instruct”
  • “codestral-22b”
  • “qwen2.5-coder-32b-instruct”

To access your data stored on the cluster from VSCode, see our Configuring SSH page or this GWDG news post for instructions. This is not required for local code.

Basic configuration

Two important completion options to understand are temperature and top_p sampling.

  • temperature is a slider from 0 to 2 that adjusts creativity: values closer to 0 are more predictable and values closer to 2 are more creative. It does this by sharpening or flattening the probability distribution over the next token (the response's building block).
  • top_p is a slider from 0 to 1 that adjusts how much of the probability mass is considered for the next token. A top_p of 0.1 means only the top 10 percent of cumulative probability is considered. Varying top_p has a similar effect on predictability and creativity as temperature, with larger values generally increasing creativity.

Predictable results, such as for coding, require low values for both parameters, and creative results, such as for brainstorming, require high values. See the table in the current models section for value suggestions.

Our suggestion is to set the above completion options for each model according to the table in Chat AI and switch between the models based on your needs. You can also store the model multiple times with different completion options and different names to refer to, such as below.

  - name: Creative writing model
    provider: openai
    model: meta-llama-3.3-70b-instruct
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api_key>"
    defaultCompletionOptions:
      temperature: 0.7
      topP: 0.8
    roles: [chat]

  - name: Accurate code model
    provider: openai
    model: codestral-22b
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api_key>"
    defaultCompletionOptions:
      temperature: 0.2
      topP: 0.1
    roles: [chat, autocomplete]

  - name: Exploratory code model
    provider: openai
    model: codestral-22b
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api_key>"
    defaultCompletionOptions:
      temperature: 0.6
      topP: 0.7
    roles: [chat, autocomplete]

Another completion option worth setting, particularly for long responses, is max_tokens. It is a value smaller than the context window that limits how many tokens the model may generate for the response to a prompt; the prompt tokens plus the generated tokens must together fit into the model's context window. Each model has a different context-window size (see the table in current models for sizes), and each model has a default max_tokens length. The default is suitable for most tasks, but can be changed for longer tasks, such as “Name the capital of each country in the world and one interesting aspect about it”.

The context window has to accommodate the system prompt, the chat history, the current prompt and the tokens already generated for the current response. max_tokens therefore caps the response length so that the response does not crowd out these other context sources and degrade in quality. For this reason it is recommended to split a large task into smaller tasks that require shorter responses. If that is difficult or impossible, max_tokens can be increased instead (at the risk of some degradation). See API Use Cases for an example of how to change max_tokens.
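
If you do need a different limit in Continue itself, it can be set per model in config.yaml via the maxTokens completion option (a minimal sketch; the option name follows Continue's configuration schema, and 4096 is just an example value):

  - name: Long response model
    provider: openai
    model: meta-llama-3.3-70b-instruct
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api_key>"
    defaultCompletionOptions:
      temperature: 0.2
      topP: 0.1
      maxTokens: 4096
    roles: [chat]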

Further configuration options can be found at the Continue configuration page.

Functionality

The three main abilities of the Continue plugin are analysing code, generating code and resolving errors. Another useful ability is tab autocompletion, which is still in beta. All code examples below use Codestral.

Analyse code

Highlight a code snippet and press the command Ctrl+L. This will open the Continue side bar with the snippet as context for the question of your choice. From this side bar you can also access any file in your repository, as well as provide different types of context, such as entire package or language documentations, problems, git, terminal, or even your entire codebase in your chosen repository. This can be done either by pressing @ or clicking the + Add Context button. Typical functionality is provided, such as opening multiple sessions, retrieving previous sessions and toggling full screen. Models can be changed easily to, say, a model with a creative configuration, to which prompts without context can be sent, the same way the web interface of Chat AI works.

Generating code

Highlight a code snippet and press Ctrl+I. This opens a dropdown bar where a prompt of your choice about this code can be entered. Once submitted, Continue generates further code based on the snippet, or edits the snippet itself. These edits range from correcting faulty code to generating in-line documentation, renaming functions, etc. The generated code and any deleted code are shown in a format reminiscent of a git merge conflict, with Accept and Reject options. Bear in mind that there is no clear indication within VSCode whether cluster resources are available for code generation or whether code generation is failing for some other reason. If nothing happens, we suggest waiting a short moment before trying again.

Before code generation with prompt:

Generation Prompt

After code generation:

Generation Result

Notice from the example that the code completion model is capable of more than just generating what looks like functional code. It also has the benefits and knowledge expected of an LLM: semantics, grouping and linguistic reasoning. This knowledge is still limited by the date up to which the model was trained, which for most current models is at least 2021.

Resolve errors

If errors have been encountered in your VS Code Problems, Output or Terminal panels, press Ctrl+Shift+R to place the errors in context in the Continue sidebar and prompt for a solution. The result explains the errors in detail and may provide corrected code for the identified faulty sections. The same can be done manually from the Continue sidebar by providing the error as context and asking for it to be fixed.

Tab Autocomplete

Continue continuously analyses the other code in your current file, regardless of programming language, and suggests code to fill in. To enable this function, ensure that at least one model in config.yaml includes roles: [autocomplete]. A common pattern is to duplicate the Codestral entry:

  - name: GWDG Code Completion
    provider: openai
    model: codestral-22b
    apiBase: https://chat-ai.academiccloud.de/v1
    apiKey: "<api_key>"
    defaultCompletionOptions:
      temperature: 0.2
      topP: 0.1
    roles: [autocomplete]

If the selected model is not particularly well-trained for code completion, Continue will prompt you accordingly. You should now receive code suggestions from the selected model and be able to insert the suggested code simply by pressing Tab, much like the default code suggestions VS Code provides based on the loaded packages. Both suggestions can appear simultaneously, in which case pressing Tab prioritises the VS Code suggestion over Continue. It may also happen that the tabAutocomplete hotkey conflicts with tab spacing, in which case the tab spacing hotkey needs to be disabled or remapped in your VS Code settings: in Settings, go to Keyboard Shortcuts, search 'tab', then disable or replace the keybinding of the tab command. You can disable tabAutocomplete with Ctrl+K Ctrl+A. Unfortunately, there is no way to change the keybinding of tabAutocomplete.

It is also possible to step through an autocompletion suggestion word by word by pressing Ctrl + RightArrow. Note that Ctrl + LeftArrow does NOT undo any steps. The code example below was almost entirely generated with tabAutocomplete just from initially typing def plus, aside from the need to correct some indentation.

Partial Tab Autocompletion

Partial Tab Autocompletion

Full Tab Autocompletion

Full Tab Autocompletion

More information about tabAutocomplete, including further configuration options, can be found at the Continue documentation.

Zed

Zed is a popular VSCode competitor with builtin AI integration. Since the Chat AI API is OpenAI compatible, we follow Zed’s documentation on that. Your settings.json should look similar to the following:

{
  "language_models": {
    "openai": {
      "api_url": "https://chat-ai.academiccloud.de/v1",
      "available_models": [
        { "name": "qwen2.5-coder-32b-instruct", "max_tokens": 128000 },
        { "name": "qwq-32b", "max_tokens": 131000 },
        { "name": "llama-3.3-70b-instruct", "max_tokens": 128000 }
      ],
      "version": "1"
    }
  }
}

The model names are taken from here, the context sizes from here. Your API key is configured via the UI: in the command palette, open agent: open configuration and set your API key in the “OpenAI” dialog. For Zed's AI editing functionality, check out their documentation.

MCP

The Model Context Protocol (MCP) is a common interface for LLM interfaces to call tools and receive additional context. Zed has builtin support for running MCP servers and letting LLMs call the exposed tools via the OpenAI API tool call requests automatically. Here is an example configuration to add a local MCP server to get you started:

{
  "context_servers": {
    "tool-server": {
      "command": {
        "path": "~/dev/tool-server/tool-server",
        "args": [
          "--transport",
          "stdio"
        ],
        "env": null
      },
      "settings": {}
    }
  }
}

Cline

Cline is an open-source coding agent that combines large-language-model reasoning with practical developer workflows. We outline Cline's main benefits, explain its Plan → Act interface, and walk through an installation that connects Cline to the AcademicCloud (CoCo AI) models.

The Plan → Act Loop

  • Plan mode: We can describe a goal, such as “add OAuth2 login”. Cline replies with a numbered plan outlining file edits and commands.

  • Review: Edit the checklist or ask Cline to refine it. Nothing changes in the workspace until we approve.

  • Act mode: Cline executes each step: editing files, running commands, and showing differences. We confirm or reject actions in real time.

This separation gives the agent autonomy without removing human oversight.

Emacs

Emacs is an extensible, customizable, free/libre text editor — and more.

With the help of gptel, a simple Large Language Model client, we make use of LLMs from within Emacs. gptel is available on MELPA and NonGNU-devel ELPA.

As SAIA implements the OpenAI API standard, the configuration is straight forward.

(setq gptel-model 'qwen3-30b-a3b-instruct-2507
      gptel-backend
      (gptel-make-openai "gwdg"
        :host "chat-ai.academiccloud.de"
        :endpoint "/v1/chat/completions"
        :stream t
        :key gptel-api-key
        :models '(meta-llama-3.1-8b-instruct
                  openai-gpt-oss-120b
                  qwen3-235b-a22b
                  qwen2.5-coder-32b-instruct
                  qwen3-30b-a3b-instruct-2507)))

The SAIA API key is stored in ~/.authinfo.

machine chat-ai.academiccloud.de login apikey password <api_key>

Now you can interact with LLMs from within Emacs.

Emacs gptel chat

Installation Guide (VS Code)

Please find the installation steps below:

  • Prerequisites

    • Visual Studio Code (v1.93 or newer)
    • AcademicCloud API key
    • Node 18+ for optional CLI use
  • Extension installation

    • Search for Cline in the VS Code marketplace and install it.
  • Connecting to CoCo AI

    • Open Cline (Command Palette → “Cline: Open in New Tab”).

    • Click the Setup with own API Key and choose “OpenAI Compatible”.

    • Fill the fields:

      Field      Value
      Base URL   https://chat-ai.academiccloud.de/v1
      API Key    your AcademicCloud key
      Model ID   codestral-22b (add others as needed)
    • Add additional models (e.g., meta-llama-3.3-70b-instruct) with the same URL and key if required.

    • Assign roles (if you want different models for Plan and Act): for example, set Codestral for Act and Llama for Plan.

Daily Workflow

Here is the daily workflow:

Plan → Approve plan → Act → Review differences → Iterate

Cline bridges the gap between chat-based assistants and full IDE automation. With a short setup that points to CoCo AI, it becomes a flexible co-developer for complex codebases while preserving the developer's control.

Data Pool

Visualization of a data pool showing the directory tree in a terminal window, two of the data images, a display of some gene sequences, and a diagram of the workflow in an image processing pipeline.

The GWDG data pool consists of datasets that are relevant to different user groups over a longer period of time. This includes datasets that are to be shared within a working group or with external parties. The data pool concept is a well-known and simple strategy for organising the sharing of curated datasets. A good example is the system implemented by DKRZ1.

This includes, for example:

  • training data sets for machine learning applications
  • open data sets of (inter-)governmental organizations
  • open data sets of any HPC users
  • project data for other projects to use
  • semi-public project data that should be only shared upon application
  • And many more!

Usage

Each data pool has a name, a version, content files (data and code), metadata files, and a README.md for other users to get started with the dataset. Pool data is either public (everyone on the cluster can access it) or non-public (access is granted to specific other projects and users), but pool metadata and the README.md are always public. A website listing all available data pools is planned.

All datasets are centrally located under /pools/data. The path to each pool follows the scheme

/pools/data/PROJECT/POOLNAME/POOLVERSION

where PROJECT is the project’s HPC Project ID (see Project Structure), POOLNAME is the name of the pool, and POOLVERSION is the specific version of the pool. The file structure inside each data pool is

Path                        Type        Description
public                      file        DRAFT ONLY: Optional empty file. If present, the pool will be public.
README.md                   file        Documentation for the pool
METADATA.json               file        Pool metadata
CITATION.bib                file        BibTeX file with references to cite if using everything in the pool
GENERATED_METADATA.json     file        GENERATED: Pool metadata that can't be generated before submission
CHECKSUMS.EXT               file        GENERATED: Tagged checksums of content/CHECKSUMS_*.EXT and all top-level files ¹
.git*                       dir/files   GENERATED: Git repo for the top-level files other than GENERATED_METADATA.json
content/                    directory   Directory holding pool content (non-public for non-public pools)
content/CHECKSUMS_code.EXT  file        GENERATED: Tagged checksums of every file in content/code ¹
content/CHECKSUMS_data.EXT  file        GENERATED: Tagged checksums of every file in content/data ¹
content/.git*               dir/files   GENERATED: Git repo for the content/code directory and content/CHECKSUMS_*.EXT files
content/code/               directory   Directory holding pool code
content/code/*              files/dirs  The actual code of the data pool
content/data/               directory   Directory holding pool data
content/data/*              files/dirs  The actual data of the data pool

[1]: EXT is an extension based on the checksum algorithm (e.g. sha256 for SHA2-256).

Creation

Pools have to go through several phases.

Data Pool Workflow

Overview of the data pool creation process.

0. Prerequisite: Registered project

Only projects in the HPC Project Portal (see Project Management) are eligible to create pools. See Getting An Account for information on how to apply for a project.

Warning

NHR/HLRN projects created before 2024/Q2 must migrate to the HPC Project Portal before being eligible to create pools. See the NHR/HLRN Project Migration page for information on migration.

1. Requesting data pool staging area

A project’s draft pools are created in a staging area under /pools/data-pool-staging/PROJECT. Initially, projects don’t have access to the staging area. A project PI can request access to the staging area via a support request (see Start Here for the email address to use). The request should include a rough estimate of how much disk space and how many files/directories will be used. If approved, a directory in the staging area is created for the project.

Each individual draft pool in preparation should use a separate subdirectory of the project’s staging directory, specifically /pools/data-pool-staging/PROJECT/POOL/VERSION where POOL is the pool name and VERSION is its version.

Info

The maximum number of files/directories in a pool is limited in order to preserve IO performance for everyone using the pool. For example, directories with a million files are not allowed because using such a pool would harm the performance of the filesystem for everyone. In many cases, it is possible to bundle together large sets of small files (see Reducing Filesystem Usage for tips).
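
For instance, a directory containing thousands of small images could be bundled into a single compressed archive before being placed in the pool. This is only a rough sketch; the directory name is a placeholder and it assumes a GNU tar with zstd support is available:

# Bundle a directory of many small files into one compressed archive
tar --zstd -cf images_bundle.tar.zst images/
# List the archive contents later without unpacking everything
tar --zstd -tf images_bundle.tar.zst | head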

2. Building pool draft

Project members set up draft pools in subdirectories of the staging directory (note that each individual pool is treated separately from this point in the workflow). The subdirectory must be POOL/VERSION so that the pool name and version can be deduced from the path. A template data pool is provided, which you can copy via:

cp -r /pools/data-pool-template/* /pools/data-pool-staging/PROJECT/POOL/VERSION/

The template contains the basic files that must be filled out for the pool and directories that must get files:

  • README.md
  • METADATA.json
  • CITATION.bib
  • content/
  • content/code/
  • content/data/

Make sure to create an empty public file if you want the pool to be public (and make sure it doesn’t exist if the pool should not be public). It can be created with

touch /pools/data-pool-staging/PROJECT/POOL/VERSION/public

and removed with

rm /pools/data-pool-staging/PROJECT/POOL/VERSION/public

Put the pool code and data into the content/code/ and content/data/ subdirectories respectively, whether copying it from elsewhere on the cluster or uploading it to the cluster (more details in our documentation on data transfer; a minimal copy sketch follows the list below). There are a few hard restrictions:

  • No files or additional directories in content/. Everything must go under the content/code/ and content/data/ subdirectories.
  • All symlinks must be relative links that stay entirely inside the content/ directory and must eventually terminate on a file/directory (no circular links)
  • File, directory, and symlink names must all meet the following requirements:
    • Hard requirements:
      • Must be UTF-8 encoded (ASCII is a subset of UTF-8)
      • Must not contain newline characters (wreaks havoc on unix shells)
      • Must be composed entirely of printable characters (only allowed whitespace is the space character)
      • Must not be a single dash - or a double dash -- (wreaks havoc when passed to command line utilities)
      • Must not be a tilde ~ (wreaks havoc on unix shells)
      • Must not be .git (git repos are forbidden in submitted pools so that the top-level and content git repos can work)
    • Recommendations
      • Do not start with a dash - (wreaks havoc when passed to command line utilities)
      • Do not start with a dot . (pools shouldn’t have hidden files, directories, and/or symlinks)
      • Do not start with .git (could cause problems for the content git repo)
      • Do not include autosave and backup files since they just waste space (files ending in ~, .asv, .backup, .bak, and .old)
      • Minimize binary files under content/code since the content git repo would include them (such files almost always belong under content/data)
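
As referenced above, copying existing data from elsewhere on the cluster into the draft pool can be done with standard tools; a minimal sketch (all paths are placeholders) is:

# Copy an existing dataset into the draft pool's data directory
rsync -a --info=progress2 /path/to/existing/dataset/ \
    /pools/data-pool-staging/PROJECT/POOL/VERSION/content/data/dataset/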

You should use the CLI pool validator at /pools/data-pool-tools/bin/data-pool-tools-validate to validate the various parts of your draft pools like

/pools/data-pool-tools/bin/data-pool-tools-validate [OPTIONS] PATH

where PATH is the part of your draft pool you want to validate or even the whole draft pool if you give the path to its directory. The validator autodetects what is being validated based on the specific PATH. See Validator for more information.

For a straightforward setup of the data pool, we will eventually provide tools (CLI / WebUI / …) to help you prepare the various files in your draft pool and to check the draft pool for problems.

3. Submitting pool for review

A project PI submits the draft pool to become an actual pool by creating a support request (see Start Here for the email address to use). The following additional information must be included in the support request:

  1. Project ID (if the project’s POSIX group is HPC_foo, then the Project ID is foo)
  2. Pool name
  3. Pool version
  4. What sorts of other projects would be interested in using the data.

The pool’s path must then be /pools/data-pool-staging/PROJECT/POOL/VERSION. Eventually, this will be replaced with a web form.

Once the submission is received, a read-only snapshot of the draft pool will be created and the various generated metadata files (checksum files and GENERATED_METADATA.json) generated for review. The validator is run. If it fails, the submitter is notified and the read-only snapshot is deleted so that they can fix the problems and resubmit. If the validator passes, the submitter is notified and the draft pool goes through the review process:

  1. All other PIs of the project are notified with the location of the draft pool snapshot and instructions on how to approve or reject the pool.
  2. If all other PIs have approved the pool, the draft pool goes to the Data Pool Approval Team.
  3. If the Data Pool Approval Team approves, the pool is accepted. Otherwise, they will contact the PIs.

4. Publishing Pool

Once a pool has been fully approved, it will be published in the following steps:

  1. The pool is copied to its final location and permissions configured.
  2. The pool metadata is added to the pool index within the data catalogue to appear in our Data Lake
  3. The draft pool’s read-only snapshot is deleted.

Finally, your data pool is available on our HPC system to all users with a high-speed connection. Anyone can access the data directly using the path to your data pool /pools/data/PROJECT/POOL/VERSION.

5. Editing Pool

Projects are allowed to edit a pool, either submitting a new version, a non-destructive revision, or a correction. Non-destructive revisions allow the following:

  • Changes to top-level metadata files when the history of old versions can be kept
  • Changes to content/code/* when the history of old versions can be kept
  • Adding new files, directories, and/or symlinks under content/data

A correction is when a top-level metadata file or a file under content/code must be changed and the history of the old versions destroyed, and/or when existing files, directories, and/or symlinks under content/data must be changed, removed, or renamed. These situations should ideally not happen, but sometimes they are necessary. For example, a pool could include an external image that appeared to be licensed CC-BY (and thus shareable), but it turns out that the person who claimed to be its owner actually stole it, and the original owner does not license it as CC-BY and will not allow it to be shared; the file must then be deleted outright and its content expunged (though not necessarily the record of its existence).

If you need to edit your data pool, copy it to the staging directory and follow the process from step 3 (Submitting pool for review), except that the following additional pieces of information must be given:

  1. The pool to be edited must be specified.
  2. It must be specified whether this is a new version, a non-destructive revision, or a correction for an existing version.
  3. Specify what is changed (e.g. changelog)
  4. If doing a non-destructive revision or a correction, explain why. This is particularly critical for corrections since they are destructive operations that undermine the reproducibility of the scientific results others derive from the pool. Such changes to a data pool version can mean that the citation others used to acknowledge the usage of your provided data is technically no longer correct.

6. Regular review

All data pools are periodically reviewed to determine whether the data pool should be retained or deleted (or optionally archived) when the requested availability window expires.

Managing Access to Non-Public Pools

For pools with non-public data, access to files under /pools/data/PROJECT/POOL/VERSION/content is restricted via ACL. Read access is granted to all members of the project, and any additional projects (must be in the HPC Project Portal) or specific project-specific usernames that a PI specifies. Initially, changing who else is granted access requires creating a support request. Eventually, this will be incorporated directly into the HPC Project Portal.
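
If you want to check which ACL entries are currently in effect on a published pool, one way (assuming you already have read access and substituting the placeholders) is to inspect the content directory with getfacl:

# Show the POSIX ACL entries on the restricted content directory
getfacl /pools/data/PROJECT/POOL/VERSION/content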

Data Documentation and Reuse License/s

It is recommended to follow domain specific best practices for data management, such as metadata files, file formats, etc. While helpful, this is not enough by itself to make a dataset usable to other researchers. To ensure a basic level of reusability, each data pool has README.md and METADATA.json files in their top-level directory containing a basic description of the dataset and how to use it.

These files are also critical for informing others which license/s apply to the data. All data must have a license, which should conform to international standards to facilitate re-use and ensure credit to the data creators2. Different files can have different licenses, but it must be made clear to users of the pool which license each file uses. Common licenses are:

  • The various Creative Commons licenses for text and images
  • The various licenses approved by OSI for source code
  • CC0 for raw numerical data (not actually copyrightable in many legal jurisdictions, but this makes it so everyone has the same rights everywhere)

In addition, a CITATION.bib file is required for correct citation of the dataset when used by other HPC users. This is a good place for pool authors to place the bibliographic information for the associated paper/s, thesis, or data set citation, as some journals like Nature provide. This is for all things that would have to be cited if the whole data pool is used. If some data requires only a subset of the citations, that would be a good thing to mention in the documentation (either the README.md or some other way under the content/ directory).

The design of these files is strongly inspired by DKRZ1.

Warning

All information in the README.md and METADATA.json files is publicly available, including the names and email addresses of the PI/s and creator/s of the data pool.

General Recommendations

  • Follow good data organization, naming, and metadata practices in your field; taking inspiration from other fields if there is none or if they don’t cover your kind of data.
  • Include minimal code examples to use the data. Jupyter notebooks, org-mode files, etc. are encouraged. Please place them in the content/code directory if possible.
  • For source code, indicate its dependencies and which environment you have successfully run it in.
  • Add metadata inside your data files if they support it (e.g. using Attributes in NetCDF and HDF5 files).
  • Provide specifications and documentation for any custom formats you are using.
  • Use established data file formats when possible, ideally ones that have multiple implementations and/or are well documented.
  • Avoid patent encumbered formats and codecs when possible.
  • Bundle up large numbers of small files into a smaller number of larger files.
  • Compress the data when possible if it makes sense (e.g. use PNG or JPEG instead of BMP).
  • Avoid spaces in filenames as much as possible (they cause havoc in people’s shell scripts).
  • Use UTF-8 encoding and Unix newlines when possible (note, some formats may dictate other ones and some languages require other encodings).

Files and Templates

Submitted: public

This file in a draft pool, if it exists, indicates that the pool is public. If it does not exist, the pool is restricted. The file must have a size of zero. The easiest way to create it is via

touch /pools/data-pool-staging/PROJECT/POOL/VERSION/public
Note

Note that the file is not copied to snapshots or the final published pool. In snapshots and published pools, the information on whether it is public or not is instead in GENERATED_METADATA.json.

Submitted: README.md

The README.md should document the data and its use, so that any domain expert can use the data without contacting the project members.

It must be a Markdown document following the conventions of CommonMark plus GitHub Flavored Markdown (GFM) tables. It must be UTF-8 encoded with Unix line endings. The data pool TITLE on the first line must be entirely composed of printable ASCII characters.

The template README.md structure is

# TITLE

## owner / producer of the dataset

## data usage license

## content of the dataset

## data usage scenarios

## methods used for data creation

## issues

## volume of the dataset (and possible changes thereof)

## time horizon of the data set on /pool/data

Submitted: METADATA.json

The metadata written by the pool submitters.

It must be a JSON file. It must be UTF-8 encoded with Unix line endings. Dates and times must be in UTC. Dates must be in the "YYYY-MM-DD" format and times must be in "YYYY-MM-DDThh:mm:ssZ" format (where Z means UTC). The file should be human readable (please use newlines and indentation).

It is a JSON dictionary of dictionaries. At the top level are keys of the form "v_NUM" indicating a metadata version, under which the metadata for that version are placed. This versioning allows the format to evolve while making it extremely clear how each field should be interpreted (using the version of the key) and allowing the file to contain more than one version at once for wider compatibility. At any given time, the submission rules will dictate which version/s the submitters are required/allowed to use. Many fields are based on the CF Conventions. Most string fields must be composed entirely of printable characters, with the only allowed whitespace characters being the space and, for some fields, the Unix newline \n.

The template METADATA.json is

{
    "v_1": {
        "title": "TITLE",
        "pi": ["Jane Doe"],
        "pi_email": ["jane.doe@example.com"],
        "creator": ["Jane Doe", "John Doe"],
        "creator_email": ["jane.doe@example.com", "john.doe@example.com"],
        "institution": ["Example Institute"],
        "institution_address": ["Example Institute\nExample Straße 001\n00000 Example\nGermany"],
        "source": "generated from experimental data",
        "history": "2024-11-28  Created.\n2024-12-02  Fixed error in institution_address.",
        "summary": "Data gathered from pulling numbers out of thin air.",
        "comment": "Example data with no meaning. Never use.",
        "keywords": ["forest-science", "geophysics"],
        "licenses": ["CC0-1.0", "CC-BY-4.0"]
    }
}
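
Before submission, it can be worth confirming that the file is syntactically valid JSON; a minimal check using standard tools (the path is a placeholder, and note that the full validator described below checks much more) is:

# Parse METADATA.json; prints the message only if parsing succeeds
python3 -m json.tool /pools/data-pool-staging/PROJECT/POOL/VERSION/METADATA.json > /dev/null \
    && echo "METADATA.json is valid JSON"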

Version 1

REQUIRED

The key is "v_1". The fields are

| Key | Value type | Description |
| --- | --- | --- |
| title | string | Title of the pool (must match TITLE in the README.md) |
| pi | list of string | Name/s of the principal investigator/s |
| pi_email | list of string | Email address/es of the PI/s in the same order as "pi" (used for communication and requests) |
| creator | list of string | Name/s of the people who made the pool |
| creator_email | list of string | Email address/es of the people who made the pool in the same order as "creator" |
| institution | list of string | Names of the responsible institutions (mostly the institutions of the PI/s) |
| institution_address | list of string | Postal/street address of each institution, properly formatted with newlines (last line must be the country), in the same order as "institution". Must be sufficient for mail sent via Deutsche Post to arrive there. |
| source | string | Method by which the data was produced (field from CF Conventions) |
| history | string | Changelog-style history of the data (will have newlines) (field from CF Conventions) |
| summary | string | Summary/abstract for the data |
| comment | string | Miscellaneous comments about the data |
| keywords | list of string | List of keywords relevant to the data |
| licenses | list of string | List of all licenses that apply to some part of the contents. Licenses on the SPDX License List must use the SPDX identifier. Other licenses must take the form "Other -- NAME" given some suitable name (should be explained in the documentation). |

Submitted: CITATION.bib

What paper(s), thesis(es), report(s), etc. should someone cite when using the full dataset? Written by the pool submitter. If it’s empty, the dataset cannot be cited in publications without contacting the author(s) (possibly because a publication using it hadn’t been published at the time of the pool submission).

Tip

If the citations required change for some reason (say, a manuscript is published and the citation should be changed from the preprint to the paper), you can submit a non-destructive revision for the pool version

It is a BibTeX file. It must be ASCII encoded with Unix line endings. The encoding is restricted to ASCII so it is compatible with normal BibTeX. Otherwise, a user would have to use bibtex8, bibtexu, or BibLaTeX. See https://www.bibtex.org/SpecialSymbols for how to put various non-ASCII characters into the file. An example CITATION.bib would be

@Misc{your-key,
  author =  {Musterfrau, Erika
         and Mustermann, Max},
  title =   {Your paper title},
  year =    {2024},
  edition =     {Version 1.0},
  publisher =   {Your Publisher},
  address =     {G{\"o}ttingen},
  keywords =    {ai; llm; mlops; hpc},
  abstract =    {This is the abstract of your paper.},
  doi =     {10.48550/arXiv.2407.00110},
  howpublished= {\url{https://doi.org/10.48550/arXiv.2407.00110}},
}
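
To double-check that your CITATION.bib is ASCII-only with Unix line endings, a quick sketch using GNU grep (an assumption about the available tooling) is:

# Report any non-ASCII bytes (no output means the file is ASCII only)
LC_ALL=C grep -nP '[^\x00-\x7F]' CITATION.bib || echo "ASCII only"
# Report whether carriage returns (CRLF line endings) are present
grep -q $'\r' CITATION.bib && echo "CRLF line endings found" || echo "Unix line endings"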

Generated: Top-Level Git Repo

Git repo generated during submission of the draft pool and not controlled by the submitters. It tracks the following files across revisions of the pool version:

  • README.md
  • METADATA.json
  • CITATION.bib
  • CHECKSUMS.EXT (generated)

and the various Git support files (e.g. .gitignore).

Each revision of the pool version gets a new tag of the form rX, where X is an incrementing base-10 number starting from 0 (the first revision is r0). The latest revision is always checked out. If you want to access an earlier revision, it is best to clone the repo and then checkout using the tag of the revision you want. The list of revisions, their tags, and commit hashes can be found in the GENERATED_METADATA.json.
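
A minimal sketch of accessing an earlier revision, assuming you have read access to the published pool and substituting the placeholders, is:

# Clone the pool's top-level git repo and check out the first revision tag
git clone /pools/data/PROJECT/POOL/VERSION my-pool-toplevel
cd my-pool-toplevel
git checkout r0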

Generated: Content Git Repo

Git repo generated during submission of the draft pool and not controlled by the submitters. It tracks all the content of the pool except for content/data. Specifically, it tracks the following content across revisions of the pool version:

  • content/CHECKSUMS_code.EXT (generated)
  • content/CHECKSUMS_data.EXT (generated)
  • content/code/*

and the various Git support files (e.g. .gitignore). This means that the list of files under content/data and their checksums is stored and tracked across revisions, but not the contents themselves.

Each revision of the pool version gets a new tag of the form rX, where X is an incrementing base-10 number starting from 0 (the first revision is r0). The latest revision is always checked out. If you want to access an earlier revision, it is best to clone the repo and then checkout using the tag of the revision you want. The list of revisions, their tags, and commit hashes can be found in the GENERATED_METADATA.json.

Generated: GENERATED_METADATA.json

Metadata generated during submission of the draft pool and not controlled by the submitters.

It is a JSON file. It is UTF-8 encoded with Unix line endings. Dates and times are in UTC. Dates are in the "YYYY-MM-DD" format and times are in "YYYY-MM-DDThh:mm:ssZ" format (where Z means UTC).

It is a JSON dictionary of dictionaries. At the top level are keys of the form "v_NUM" indicating a metadata version, under which the metadata for that version are placed. This versioning allows the format to evolve while making it extremely clear how each field should be interpreted (using the version of the key) and allowing the file to contain more than one version at once for wider compatibility.

An example GENERATED_METADATA.json is

{
    "v_1": {
        "public": true,
        "project_id": "project123",
        "pool_id": "cooldata",
        "version" "0.39",
        "submitter": "Mno Pqr",
        "submitter_email": "pqr@uni.com",
        "commit_date": "2024-12-03",
        "commit_history": [
            [
                "r0",
                "2024-12-03",
                "5aaf25abb31252e846260ccf97cac5b412c1b1919376624dd9b1085e3bc0a385",
                "756bfb08970e7c8c6137429e2a7cd6d44726be105a7472d75aff66780f706621"
            ]
        ]
    }
}

Version 1

The key is "v_1". The fields are

| Key | Value type | Description |
| --- | --- | --- |
| public | boolean | Whether the pool is public or not |
| project_id | string | The HPC Project ID of the project |
| pool_id | string | Pool name, which is the name of its subdirectory |
| version | string | Pool version string |
| submitter | string | Name of the user who submitted the pool |
| submitter_email | string | Email address of the user who submitted the pool |
| commit_date | string | Date (UTC) the pool was submitted/finalized |
| commit_history | list of list | All previous commits as lists in order (most recent last) of tag/revision name, "commit_date", commit hash of the top-level git repo, and commit hash of the content git repo |

Generated: content/CHECKSUMS_code.EXT

Checksum file generated during submission containing the tagged checksums of all files under content/code. The extension .EXT is based on the checksum algorithm (e.g. .sha256 for SHA-2-256). Tagged checksums include the algorithm on each line and are created by passing the --tag option to Linux checksum programs like sha256sum. If a data pool only has the data file content/code/foo, the content/CHECKSUMS_code.sha256 file would be something like

SHA256 (content/code/foo) = 7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730

Generated: content/CHECKSUMS_data.EXT

Checksum file generated during submission containing the tagged checksums of all files under content/data. The extension .EXT is based on the checksum algorithm (e.g. .sha256 for SHA-2-256). Tagged checksums include the algorithm on each line and are created by passing the --tag option to Linux checksum programs like sha256sum. If a data pool only has the data file content/data/foo, the content/CHECKSUMS_data.sha256 file would be something like

SHA256 (content/data/foo) = 7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730

Generated: CHECKSUMS.EXT

Checksum file generated during submission containing the tagged checksums of the following files:

  • README.md
  • METADATA.json
  • CITATION.bib
  • content/CHECKSUMS_code.EXT
  • content/CHECKSUMS_data.EXT

The extension .EXT is based on the checksum algorithm (e.g. .sha256 for SHA-2-256). Tagged checksums include the algorithm on each line and are created by passing the --tag option to Linux checksum programs like sha256sum. An example CHECKSUMS.sha256 file would be something like

SHA256 (README.md) = 7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730
SHA256 (METADATA.json) = d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded977307
SHA256 (CITATION.bib) = 865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded977307d
SHA256 (content/CHECKSUMS_code.sha256) = 65e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded977307d8
SHA256 (content/CHECKSUMS_data.sha256) = df2940bf16f5ada77b41f665e59e4433cff2c8ebc42e23f4b76e0187c187b73e
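
Since the checksums are in the tagged (BSD-style) format, they can be verified with the GNU coreutils checksum tools; a minimal sketch (paths are placeholders) run from the pool’s top-level directory is:

# Verify the top-level files listed in CHECKSUMS.sha256
cd /pools/data/PROJECT/POOL/VERSION
sha256sum --check CHECKSUMS.sha256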

Get in Contact With Us

If you have any questions left that we couldn’t answer in this documentation, we are happy to be contacted via a ticket (e-mail to our support addresses). Please indicate “HPC-Data Pools” in the subject, so your request reaches us quickly and without any detours.


  1. https://docs.dkrz.de/doc/dataservices/finding_and_accessing_data/pool-data/index.html ↩︎ ↩︎

  2. The PI of the data project is responsible to make sure, that this is in line with the respective data licence. ↩︎

Subsections of Data Pool

Validator

A CLI validator for pool objects and for entire draft, snapshot, and published pools is provided at /pools/data-pool-tools/bin/data-pool-tools-validate. It is used like

/pools/data-pool-tools/bin/data-pool-tools-validate [OPTIONS] PATH

where PATH is the part of your draft pool you want to validate or even the whole draft pool if you give the path to its directory. The validator autodetects what is being validated based on the specific PATH. Use the -h or --help options to see all options. The most useful ones are -o OUTPUT to make it write the results of the validation to a file rather than stdout and -f FORMAT to change the output format. The output formats are human-color for human readable with color (default on the CLI), human for human readable (default for any output that isn’t a JSON file or the CLI), json for JSON (default for any file ending in .json), and auto for choosing based on the output (default).
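
For example, to write a machine-readable report for a whole draft pool (the output filename and path placeholders are only examples):

# Validate the whole draft pool and write the results as JSON
/pools/data-pool-tools/bin/data-pool-tools-validate -f json -o report.json \
    /pools/data-pool-staging/PROJECT/POOL/VERSION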

An example with a valid pool made from the template and one data file is

[gzadmfnord@glogin8 ~]$ cp -r /pools/data-pool-template mypool
[gzadmfnord@glogin8 ~]$ echo "this is data" > mypool/content/data/data.txt
[gzadmfnord@glogin8 ~]$ /pools/data-pool-tools/bin/data-pool-tools-validate mypool
Validated "draft" at mypool

Valid: valid

Public: yes

Flagged:
  Green: 6
    Citation: 1
      * CITATION.bib file is OK.
    Draft: 1
      * Top-level of pool is OK.
    Metadata: 1
      * METADATA.json file is OK.
    Public: 1
      * public file is OK.
    Readme: 1
      * README.md file is OK.
    Content: 1
      * content directory is OK.

Info:
  Metadata:
    title: 'TITLE'
  Readme:
    title: 'TITLE'
  Content:
    number_directories: 0
    number_files: 1
    number_symlinks: 0
    size_inodes: 1
    size_inodes_human: '1'
    size_space: 13
    size_space_human: '13 B'
    files_by_extension:
      .txt: 1 files, 13 B

The validator’s human readable output shows

  1. What is being validated
  2. Whether it is valid or not (valid, possibly invalid, probably invalid, or invalid)
  3. What has been flagged
  4. Additional information

The validator flags various things which are organized by the kind of flag followed by the kind of pool object. The different flags and their meanings are

| Flag | Meaning |
| --- | --- |
| forbidden | Critical problem with the pool object. |
| red | Potentially serious problem with the pool object. Will require discussion if submitted. |
| yellow | Potential problem with the pool object. May require discussion if submitted. |
| green | The pool object is OK. |
| awesome | Something good above and beyond that should be kept for sure. |

Fix anything flagged as forbidden. The yellow and red flags are meant to denote things which might be wrong but might not be, so check them. Pools with yellow and red flags may be fine, but the flagged items will have to be discussed after submission. For example, a very large pool that is past the red threshold in space will lead to a discussion about whether the data is suitably compressed among other space saving strategies.

Here is an example of a pool that has a forbidden file and what the validator returns

[gzadmfnord@glogin8 ~]$ cp -r /pools/data-pool-template mypool2
[gzadmfnord@glogin8 ~]$ echo "my data" > mypool2/data.txt
[gzadmfnord@glogin8 ~]$ /pools/data-pool-tools/bin/data-pool-tools-validate mypool2
Validated "draft" at mypool2

Valid: invalid

Public: yes

Flagged:
  Forbidden: 1
    Draft: 1
      * Top-level directory contains a forbidden file: data.txt
  Green: 5
    Citation: 1
      * CITATION.bib file is OK.
    Metadata: 1
      * METADATA.json file is OK.
    Public: 1
      * public file is OK.
    Readme: 1
      * README.md file is OK.
    Content: 1
      * content directory is OK.

Info:
  Metadata:
    title: 'TITLE'
  Readme:
    title: 'TITLE'
  Content:
    number_directories: 0
    number_files: 0
    number_symlinks: 0
    size_inodes: 0
    size_inodes_human: '0'
    size_space: 0
    size_space_human: '0 B'
    files_by_extension:

The information section at the end shows useful information gathered at each step of the validation. Particularly useful are checking that the pool titles from the METADATA.json and README.md files match, as well as the information on the size of the pool’s content. Information on the pool size is given both in aggregate (size_inodes and size_space) and by file extension. This is useful for seeing how many files there are, of what kind, and how big they are in the pool. Be on the lookout for large numbers of files (inodes) or a large fraction of the pool being taken up by uncompressed files (e.g. .tar files rather than compressed .tar.zst files).

Gaudi2

Introduction

Gaudi2 is Intel’s second-generation deep learning accelerator, developed by Habana Labs (now part of Intel). Unlike traditional GPUs, Gaudi2 has been designed from the ground up for large-scale AI training. Each device is powered by Habana Processing Units (HPUs), its purpose-built AI training cores. The memory-centric architecture and Ethernet-based scale-out enable efficient training of today’s large and complex models, while offering a favorable power-to-performance ratio. The platform provides 96 GB of on-chip high-bandwidth memory per device, together with 24×100 Gbps standard Ethernet interfaces. This combination eliminates the need for proprietary interconnects and allows flexible integration into existing cluster infrastructures. On the FTP, we currently host a single Gaudi2 node equipped with 8 HL-225 HPUs, available for researchers and developers to evaluate distributed AI training.

Key Features

Memory-Centric Design

Gaudi2 features 96 GB of HBM2E (High Bandwidth Memory 2E) with 2.45 TB/s bandwidth, providing the fast memory access essential for large model training. Unlike external DRAM, HBM2E is physically stacked on the chip close to the compute cores, which reduces latency and power consumption. Tom’s Hardware has a nice article explaining HBM.

Ethernet-Based Scaling

Instead of using proprietary interconnects, Gaudi2 integrates 24×100 Gbps RoCE v2 (RDMA over Converged Ethernet) network interfaces directly on-chip. RoCE v2 enables remote direct memory access between nodes across standard Ethernet, allowing data to move directly between device memories without involving the CPU. This reduces latency, lowers CPU overhead, and provides a combined networking capacity of 2.4 Tbps per accelerator. Because it relies on standard Ethernet, distributed AI training becomes more flexible, cost-effective, and easier to deploy in existing cluster environments.

Warning

Please note that only one Gaudi2 node with 8 HPUs is currently available on FTP.

Framework Compatibility

Gaudi2 supports popular AI frameworks like PyTorch through the SynapseAI software stack, while TensorFlow support was deprecated after version 1.15. The hardware integrates seamlessly with scheduling systems such as SLURM. Check our Getting Started page for detailed instructions on how to access and use the Gaudi2 node on our FTP cluster.

Application Areas

Gaudi2 is particularly suited for:

  • Natural Language Processing (NLP)
  • Computer Vision (CV)
  • Large Language Model (LLM) training
  • Generative AI models (e.g., diffusion-based image synthesis)

Practical tutorials and model examples are available in the Gaudi Tutorials and Examples section.

Software Stack: SynapseAI

SynapseAI is Habana Labs’ comprehensive software ecosystem for Gaudi processors, providing everything needed to program, optimize, and run machine learning workloads efficiently.

Framework Integration

  • PyTorch support: Full compatibility with PyTorch through optimized plugins
  • TensorFlow: Support deprecated after SynapseAI version 1.15

Optimized Libraries

  • Pre-optimized computation kernels for matrix operations and convolutions
  • Runtime software for scheduling, memory management, and multi-processor communication
  • Development tools, including profilers, debuggers, and performance analyzers

GitHub Resources

How to Access

Access to Gaudi2 is currently possible through our Future Technology Platform (FTP). You need to contact support to get access; please ensure you use FTP in the subject and mention that you need access to Gaudi2. For this, an account is required (usually a GWDG account, or an AcademicCloudID for external users), which then needs to be explicitly enabled by the admins to be able to access the FTP nodes. For more information, check our documentation on getting an account.

Access requests currently run through KISSKI (researchers and companies) and NHR (researchers at universities only). Please consult their documentation and, if applicable, request a project to test and utilize Gaudi2. If you have related questions, you can also reach out through one of our support channels.

After gaining access to FTP, log into FTP and check if you have access to Gaudi2.

scontrol show res Gaudi2

Once you have confirmed access, follow our Gaudi2 tutorial to learn how to use the Gaudi2 node with Apptainer. In case you don’t have access to Gaudi2, please reach out to one of our support channels.

Gaudi2 FTP Node Configuration

Gaudi2 Chip Architecture
  • Public IP: 10.238.3.35
  • Server Chassis: Supermicro model SYS-820GH-TNR2
  • Motherboard: Supermicro X12DPG-OA6-GD2
  • High Speed Ethernet: 2x BCM57416 NetXtreme-E Dual-Media 10G RDMA Ethernet Controller
  • InfiniBand cards: 2x InfiniBand MT27800 family
  • RAM: 16 × 64 GiB = 1024 GiB = 1 TiB
  • CPU: Xeon(R) Platinum 8380, 2 sockets, 40 cores per socket, 2 threads per core = 160 CPUs
  • Storage: 2x NVMe 3.5TB Micron_7450_MTFDKCB3T8TFR
  • OS: Ubuntu 22.04.4 LTS
  • Gaudi accelerators (HPUs): 8x Gaudi2 HL-225 accelerators

Subsections of Gaudi2

Gaudi2 Getting Started

This section provides step-by-step instructions for first-time users to run machine learning workloads on Gaudi2 HPUs on the Future Technology Platform.

Initial Setup (One-Time Configuration)

1. Container Environment Setup

# Allocate resources for building the container
salloc -p gaudi --reservation=Gaudi2 --time=01:00:00 --mem=128G --job-name=apptainer-build 
# Gaudi2 can only be accessed in exclusive mode on FTP

# Load Apptainer and build the PyTorch-Habana container
module load apptainer
apptainer build ~/pytorch-habana.sif docker://vault.habana.ai/gaudi-docker/1.21.2/ubuntu22.04/habanalabs/pytorch-installer-2.6.0:latest
mkdir -p "$HOME/datasets" "$HOME/tmp" "$HOME/habana_logs"

Gaudi uses its own fork of PyTorch, and it is best to extract the latest version from the Gaudi Docker image and build a custom .sif file to work with Apptainer.

You can check the latest PyTorch Docker image files from here: Gaudi Docker Images

2. System Verification

# Enter the container to check system specifications
apptainer shell --cleanenv --contain \
  --bind "$HOME/habana_logs:/var/log/habana_logs" \
  --bind /dev:/dev \
  --env HABANA_LOGS=/var/log/habana_logs \
  "$HOME/pytorch-habana.sif"
# Verify CPU configuration
lscpu | grep -E '^CPU\(s\):|^Socket|^Core'
nproc
# Confirm HPU devices are accessible
ls /dev/accel*
# Test PyTorch HPU integration
python -c "import habana_frameworks.torch.core as ht; print(f'HPU devices: {ht.hpu.device_count()}')"
# Test HPU management system
hl-smi
exit  # Exit container shell

3. Model Repository and Directory Setup

There are several official examples available in the Habana AI GitHub. You can also look at our direct links to the examples here.

# Clone the official model reference repository
git clone https://github.com/HabanaAI/Model-References
Warning

Make sure you exit the reservation before continuing with the code below or make adjustments.

Single HPU Example: MNIST Classification

This example demonstrates basic single-HPU usage with the classic MNIST handwritten digit classification task.

sbatch -p gaudi --reservation=Gaudi2 --time=02:00:00 --exclusive \
  -J mnist-single-hpu -o mnist-single-hpu_%j.out -e mnist-single-hpu_%j.err \
  --wrap "/bin/bash -lc 'module load apptainer; \
  apptainer exec --cleanenv --contain \
    --bind \$HOME:\$HOME \
    --bind \$HOME/habana_logs:/var/log/habana_logs \
    --bind \$HOME/datasets:/datasets \
    --bind \$HOME/tmp:/worktmp \
    --bind /dev:/dev --bind /sys/class/accel:/sys/class/accel --bind /sys/kernel/debug:/sys/kernel/debug \
    --env HABANA_LOGS=/var/log/habana_logs,PT_HPU_LAZY_MODE=1,HABANA_INITIAL_WORKSPACE_SIZE_MB=8192,TMPDIR=/worktmp,TORCH_HOME=\$HOME/.cache/torch,PYTHONNOUSERSITE=1 \
    --pwd \$HOME/Model-References/PyTorch/examples/computer_vision/hello_world \
    \$HOME/pytorch-habana.sif python3 mnist.py --epochs 5 --batch-size 128 --data-path /datasets/mnist'"

Multi-HPU Example: YOLOX Object Detection

This comprehensive example demonstrates distributed training across multiple HPUs using the YOLOX object detection model with the COCO 2017 dataset.

1. YOLOX Dependencies Installation

# Set up environment variables
export SIF="$HOME/pytorch-habana.sif"
export YOLOX_DIR="$HOME/Model-References/PyTorch/computer_vision/detection/yolox"
export TMPDIR_HOST="$HOME/tmp"

mkdir -p "$TMPDIR_HOST" "$HOME/habana_logs" 
module load apptainer

# Install YOLOX requirements in the container
apptainer exec --cleanenv --contain \
  --bind "$HOME:$HOME" \
  --bind "$HOME/habana_logs:/var/log/habana_logs:rw" \
  --bind "$TMPDIR_HOST:/worktmp" \
  --env HOME=$HOME,TMPDIR=/worktmp,PIP_CACHE_DIR=/worktmp/pip,PIP_TMPDIR=/worktmp,XDG_CACHE_HOME=/worktmp \
  --pwd "$YOLOX_DIR" \
  "$SIF" bash -lc '
    python3 -m pip install --user --no-cache-dir --prefer-binary -r requirements.txt
    python3 -m pip install -v -e .
    python3 - <<PY
import site,loguru
print("USER_SITE:", site.getusersitepackages())
print("loguru:", loguru.__version__)
PY'

2. COCO 2017 Dataset Download

export DATA_COCO="$HOME/datasets/COCO"
mkdir -p "$DATA_COCO"

sbatch -p gaudi --reservation=Gaudi2 --time=02:00:00 --exclusive \
  -J coco-download -o coco-download_%j.out -e coco-download_%j.err \
  --wrap "/bin/bash -lc '
    set -euo pipefail
    apptainer exec --cleanenv --contain \
      --bind $HOME:$HOME \
      --bind $DATA_COCO:/data/COCO \
      --bind $TMPDIR_HOST:/worktmp \
      --env TMPDIR=/worktmp,YOLOX_DATADIR=/data/COCO \
      --pwd $YOLOX_DIR \
      \"$SIF\" bash -lc \"set -euo pipefail
        echo Using YOLOX_DATADIR=\\\$YOLOX_DATADIR
        source download_dataset.sh

        # Sanity: ensure annotation files exist
        test -f /data/COCO/annotations/instances_train2017.json
        test -f /data/COCO/annotations/instances_val2017.json

        # Patch annotations to guarantee an 'info' key (prevents pycocotools KeyError)
        python3 - <<'PY'
import json, os, sys
paths = [
  '/data/COCO/annotations/instances_train2017.json',
  '/data/COCO/annotations/instances_val2017.json'
]
for p in paths:
    with open(p, 'r', encoding='utf-8') as f:
        d = json.load(f)
    if 'info' not in d or not isinstance(d['info'], dict):
        d['info'] = {'description':'COCO 2017','version':'1.0'}
        with open(p, 'w', encoding='utf-8') as f:
            json.dump(d, f)
        print(f\"Patched {os.path.basename(p)}: added 'info'\")
    else:
        print(f\"{os.path.basename(p)} already has 'info'\")
PY

        # Quick listing
        ls -l /data/COCO
        ls -l /data/COCO/annotations
        ls -l /data/COCO/train2017 | head -n 5
        ls -l /data/COCO/val2017   | head -n 5
      \"
  '"

3. Multi-HPU Training Configurations

Single HPU Training

sbatch -p gaudi --reservation=Gaudi2 --time=01:00:00 --exclusive \
  -J yolox-1hpu-training -o yolox-1hpu-training_%j.out -e yolox-1hpu-training_%j.err \
  --wrap "/bin/bash -lc '
    SIF=\$HOME/pytorch-habana.sif
    YOLOX_DIR=\$HOME/Model-References/PyTorch/computer_vision/detection/yolox
    DATA_COCO=\$HOME/datasets/COCO
    TMPDIR_HOST=\$HOME/tmp
    module load apptainer || true
    apptainer exec --cleanenv --contain \
      --bind \$HOME:\$HOME \
      --bind \$DATA_COCO:/data/COCO \
      --bind \$HOME/habana_logs:/var/log/habana_logs:rw \
      --bind \$TMPDIR_HOST:/worktmp \
      --bind /dev:/dev --bind /sys/class/accel:/sys/class/accel --bind /sys/kernel/debug:/sys/kernel/debug \
      --env PT_HPU_LAZY_MODE=1,TMPDIR=/worktmp,YOLOX_DATADIR=/data/COCO,MASTER_ADDR=localhost,MASTER_PORT=12355,PYTHONPATH=\$YOLOX_DIR:\$HOME/.local/lib/python3.10/site-packages:\$PYTHONPATH \
      --pwd \$YOLOX_DIR \
      \"\$SIF\" bash -lc \"python3 -u tools/train.py --name yolox-s --devices 1 --batch-size 64 --data_dir /data/COCO --hpu \
        steps 100 output_dir ./yolox_output\"
  '"

4-HPU Distributed Training (MPIrun)

sbatch -p gaudi --reservation=Gaudi2 --time=02:00:00 --exclusive \
  -J yolox-4hpu-training -o yolox-4hpu-training_%j.out -e yolox-4hpu-training_%j.err \
  --wrap "/bin/bash -lc '
    SIF=\$HOME/pytorch-habana.sif
    YOLOX_DIR=\$HOME/Model-References/PyTorch/computer_vision/detection/yolox
    DATA_COCO=\$HOME/datasets/COCO
    TMPDIR_HOST=\$HOME/tmp
    module load apptainer || true
    apptainer exec --cleanenv --contain \
      --bind \$HOME:\$HOME \
      --bind \$DATA_COCO:/data/COCO \
      --bind \$HOME/habana_logs:/var/log/habana_logs:rw \
      --bind \$TMPDIR_HOST:/worktmp \
      --bind /dev:/dev --bind /sys/class/accel:/sys/class/accel --bind /sys/kernel/debug:/sys/kernel/debug \
      --env HOME=\$HOME,PT_HPU_LAZY_MODE=1,TMPDIR=/worktmp,YOLOX_DATADIR=/data/COCO,MASTER_ADDR=localhost,MASTER_PORT=12355,PYTHONPATH=\$YOLOX_DIR:\$HOME/.local/lib/python3.10/site-packages:\$PYTHONPATH \
      --pwd \$YOLOX_DIR \
      \"\$SIF\" bash -lc \"mpirun -n 4 --bind-to core --rank-by core --report-bindings --allow-run-as-root \
        python3 -u tools/train.py --name yolox-s --devices 4 --batch-size 64 --data_dir /data/COCO --hpu \
        steps 100 output_dir ./yolox_output\"
  '"

4-HPU Distributed Training (Torchrun)

sbatch -p gaudi --reservation=Gaudi2 --time=02:00:00 --exclusive \
  -J yolox-4hpu-training -o yolox-4hpu-training_%j.out -e yolox-4hpu-training_%j.err \
  --wrap "/bin/bash -lc '
    SIF=\$HOME/pytorch-habana.sif
    YOLOX_DIR=\$HOME/Model-References/PyTorch/computer_vision/detection/yolox
    DATA_COCO=\$HOME/datasets/COCO
    TMPDIR_HOST=\$HOME/tmp
    module load apptainer || true
    apptainer exec --cleanenv --contain \
      --bind \$HOME:\$HOME \
      --bind \$DATA_COCO:/data/COCO \
      --bind \$HOME/habana_logs:/var/log/habana_logs:rw \
      --bind \$TMPDIR_HOST:/worktmp \
      --bind /dev:/dev --bind /sys/class/accel:/sys/class/accel --bind /sys/kernel/debug:/sys/kernel/debug \
      --env PT_HPU_LAZY_MODE=1,TMPDIR=/worktmp,YOLOX_DATADIR=/data/COCO,MASTER_ADDR=localhost,MASTER_PORT=12355,PYTHONPATH=\$YOLOX_DIR:\$HOME/.local/lib/python3.10/site-packages:\$PYTHONPATH \
      --pwd \$YOLOX_DIR \
      \"\$SIF\" bash -lc \"mpirun -n 4 --bind-to core --rank-by core --report-bindings \
        python3 -u tools/train.py --name yolox-s --devices 4 --batch-size 64 --data_dir /data/COCO --hpu \
        steps 100 output_dir ./yolox_output\"
  '"

8-HPU Maximum Scale Training (MPIrun)

sbatch -p gaudi --reservation=Gaudi2 --time=05:00:00 --exclusive \
  -J yolox-8hpu-training -o yolox-8hpu-training_%j.out -e yolox-8hpu-training_%j.err \
  --wrap "/bin/bash -lc '
    SIF=\$HOME/pytorch-habana.sif
    YOLOX_DIR=\$HOME/Model-References/PyTorch/computer_vision/detection/yolox
    DATA_COCO=\$HOME/datasets/COCO
    TMPDIR_HOST=\$HOME/tmp
    module load apptainer || true
    apptainer exec --cleanenv --contain \
      --bind \$HOME:\$HOME \
      --bind \$DATA_COCO:/data/COCO \
      --bind \$HOME/habana_logs:/var/log/habana_logs:rw \
      --bind \$TMPDIR_HOST:/worktmp \
      --bind /dev:/dev --bind /sys/class/accel:/sys/class/accel --bind /sys/kernel/debug:/sys/kernel/debug \
      --env PT_HPU_LAZY_MODE=1,TMPDIR=/worktmp,YOLOX_DATADIR=/data/COCO,MASTER_ADDR=localhost,MASTER_PORT=12355,PYTHONPATH=\$YOLOX_DIR:\$HOME/.local/lib/python3.10/site-packages:\$PYTHONPATH \
      --pwd \$YOLOX_DIR \
      \"\$SIF\" bash -lc \"mpirun -n 8 --bind-to core --rank-by core \
        python3 -u tools/train.py --name yolox-s --devices 8 --batch-size 64 --data_dir /data/COCO --hpu \
        steps 100 output_dir ./yolox_output eval_interval 1000000 data_num_workers 2\"

  '"

8-HPU Maximum Scale Training (Torchrun)

sbatch -p gaudi --reservation=Gaudi2 --time=05:00:00 --exclusive \
  -J yolox-8hpu-torchrun -o yolox-8hpu-torchrun_%j.out -e yolox-8hpu-torchrun_%j.err \
  --wrap "/bin/bash -lc '
    SIF=\$HOME/pytorch-habana.sif
    YOLOX_DIR=\$HOME/Model-References/PyTorch/computer_vision/detection/yolox
    DATA_COCO=\$HOME/datasets/COCO
    TMPDIR_HOST=\$HOME/tmp
    module load apptainer || true
    apptainer exec --cleanenv --contain \
      --bind \$HOME:\$HOME \
      --bind \$DATA_COCO:/data/COCO \
      --bind \$HOME/habana_logs:/var/log/habana_logs:rw \
      --bind \$TMPDIR_HOST:/worktmp \
      --bind /dev:/dev --bind /sys/class/accel:/sys/class/accel --bind /sys/kernel/debug:/sys/kernel/debug \
      --env PT_HPU_LAZY_MODE=1,TMPDIR=/worktmp,YOLOX_DATADIR=/data/COCO,MASTER_ADDR=localhost,MASTER_PORT=12355,PYTHONPATH=\$YOLOX_DIR:\$HOME/.local/lib/python3.10/site-packages:\$PYTHONPATH \
      --pwd \$YOLOX_DIR \
      \"\$SIF\" bash -lc \"torchrun --nproc_per_node=8 tools/train.py --name yolox-s --devices 8 --batch-size 64 --data_dir /data/COCO --hpu \
        steps 100 output_dir ./yolox_output eval_interval 1000000 data_num_workers 2\"
  '"

Gaudi2 Tutorials and Examples

Here are some official tutorials and examples. You will need to have access to FTP and finish the one-time setup before trying any of these.

GöDL - Data Catalog

The GöDL Data Catalog is a standalone service that can index all files stored on our HPC systems. It makes files findable and accessible via semantic and domain specific metadata. Therefore, users do not have to remember explicit paths or create complicated directory trees to encode metadata within paths and filenames.

If you want access to your own namespace, you can request access via mail to support@gwdg.de using the subject “Access to GöDL HPC”.

Current Situation

Currently, users rely on hierarchical folder structures and well-defined filenames to organize their data with respect to domain-specific metadata. However, this can lead to confusion since our HPC systems provide multiple storage systems, so data can be distributed. Users can understandably struggle to get a global view of their data across all provided filesystems. In addition, the access pattern is often very inefficient. An example is shown in the image below, where users may have to remember the exact storage location or traverse the tree to find the data they are looking for.

Hierarchical file tree encoding semantic information in file paths.

What Are the Challenges with a Nested Folder Structure?

  • Difficult to Find Data: Users must remember storage locations, which is not always intuitive
  • No Efficient Search Methods: Finding specific data requires manually searching through directories
  • Does Not Scale Well: As data grows, managing it manually becomes impractical
  • Complex File Structures: Can lead to confusion and inefficient access patterns.

Why Use a Data Catalog?

Data catalogs help to index data based on user-provided metadata, which enables efficient and user-friendly searching for data. In addition, the data catalog helps to manage data during its lifecycle by providing commands to move, stage, and delete files.

What Are the Benefits of Using a Data Catalog?

  • Improved Search-ability: Quickly search and find relevant data using metadata or tags.
  • Better Organization: Organize data using metadata, making it easier to categorize and access.
  • Scalability: A Data Catalog grows with your data, making it easier to manage large datasets over time.

Usage Example With and Without a Data Catalog

Goal: A literature researcher is searching for datasets of books published between 2005 and 2007 in the Horror genre.

Scenario without a Data Catalog: The researcher must rely on a manually structured folder hierarchy; consider the folder structure shown above in the Current Situation section.

To find the relevant datasets, the researcher must:

  • Search for Horror books within each year’s folder.
  • Cross check multiple directories to ensure all relevant books are found.
  • Repeat the process if new books are added later

Scenario with a Data Catalog: A simple metadata search allows the researcher to find the relevant books instantly, instead of browsing through directories.

Search Query Example: "year=2005-2007,genre=Horror"

Conclusion: A Data Catalog is more efficient because it eliminates this manual workload, allowing researchers to focus on their core analysis.
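
With the metadata indexed, the researcher’s search from this example could be expressed as a single query (assuming year and genre were used as metadata keys during ingestion):

goedl --list "year=2005-2007,genre=Horror"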


Preparation

Generate JSON Files

To ingest data into the Data Catalog, users can either create a JSON file for each data file they want to register or annotate each file manually. The first option provides convenient capabilities for a bulk upload. The JSON file must contain the metadata and must be placed in the same folder as the corresponding data file. It is also crucial to save the JSON file with the same name as the corresponding data file, i.e., for a data file called example.txt the corresponding metadata must be located in a file example.json. The metadata users save for their data is not predefined, allowing users to define domain-specific metadata based on their needs. This flexibility ensures that different research fields can include relevant metadata attributes.

For more clarity an example format of file is given:

{
  "researchfield": "Cardiology",
  "age": "45",
  "gender": "Male"
}

The descriptive metadata in this JSON file serves as searchable keywords in the Data Catalog. Users can query datasets based on these attributes.

Ideally, these JSON files can be created automatically!
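
As a starting point, a small shell loop can generate a minimal sidecar file for every data file in a folder; this is only a sketch with made-up metadata keys and a hypothetical ./test-data folder, so adapt it to your domain:

# Create a minimal JSON sidecar next to every .txt file in ./test-data
for f in ./test-data/*.txt; do
    cat > "${f%.txt}.json" <<EOF
{
  "researchfield": "Cardiology",
  "source_file": "$(basename "$f")"
}
EOF
done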

Usage Operation

All commands have the same structure when using the goedl cli tool.

goedl: This is an alias or shortcut for executing the Python CLI script (cli.py) and loading your config data (goedl.json).

operation: A placeholder for the operation you want to execute (e.g., --delete, --ingest-folder, --stage, --migrate, --annotate).

parameter: A placeholder for any additional arguments or parameters required by the chosen operation.

Syntax in Commands

Please use the following syntax rules when specifying commands:

  • = (Equals): Used to specify exact matches. For example, “Year=2002”
    • searches for all data where the year is exactly 2002.
  • =< (Less Than or Equal to): Specifies that a value is less than or equal to the given value. For example, “Year=<2005”
    • returns data from 2005 and earlier.
  • => (Greater Than or Equal to): Specifies that a value is greater than or equal to the given value. For example, “Year=>2002”
    • returns data from 2002 and later years.
  • from - to: Used to define a range between two values. For example, “Year=2002-2005”
    • searches for data within this specific range of years.

Additional Rules

  • Multiple conditions in a query must be separated by a comma without space
    • Correct: “Year=2002,Year=2005”
    • Incorrect: “Year=2002, Year=2005”
  • Everything within the query must be in quotes “…”
    • Correct: “Year=2002”
    • Incorrect: Year=2002

Ingest Data

The operation ingest-folder is used to ingest JSON metadata files created in the previous step. The command expects a folder directory where each data file has a corresponding JSON sidecar file with the necessary metadata. If the target folder is nested, this tool will recursively traverse all paths.

Example Command: goedl --ingest-folder ./test-data/

This command ingests all data from the folder test-data.

Annotate Data

Similar to the ingest command, you can also manually annotate data by adding metadata to a specific data object. This allows you to enrich existing datasets with additional descriptive attributes or to ingest new files if no annotations are available yet.

Example Command: goedl --annotate "Season=Winter,Vegetation=Subtropical" --file ~/test/no_trees/Industrial_4.jpg

This command adds the metadata “Season=Winter” and “Vegetation=Subtropical” to the file Industrial_4.jpg.

Listing Available Data

The operation list lists all data that match a given descriptive metadata query. The match is determined by the search parameters provided by the user.

Example Command: goedl --list "PatientAge=25-30"

This command lists all datasets where the metadata contains the attribute PatientAge and its value is in the range of 25 to 30.

Optional

  1. Limiting the output size: You can limit the number of returned results using the --size argument.

    • Example Command: goedl --list "PatientAge=25-30" --size 3 limits the output to 3 results.
  2. Display full technical metadata: By default, only descriptive metadata is shown. If you need the full technical metadata, use the --full argument.

    • Example Command: goedl --list "PatientAge=25-30" --full

    This will display all stored metadata for each matching data object, including technical attributes stored in the database.

Data Staging

Before processing data in a job, it is highly recommended to stage the data into a hot storage. This improves accessibility for compute nodes and enhances performance. You can learn more about the staging process here.

The operation stage copies all data matching the query given after the stage statement to the specified target directory.

Example Command: goedl --stage "PatientAge=25-30" --target ./test-stage/

This command stages all datasets where the metadata attribute PatientAge is in the range 25 to 30 into the directory ./test-stage/.

Data Migration

The operation migrate moves data matching the defined query after the migrate statement to the specified target directory. Here, the handle is updated, meaning that the specified target directory will become the new source storage for the specified data.

Example Command: goedl --migrate "PatientAge=25-30" --target ./test-migrate/

This command moves all datasets where the metadata attribute PatientAge is in the range 25 to 30 to ./test-migrate/ and updates the reference to reflect the new storage location.

Important Notice:

  • After migration, the data will no longer reside in its previous location.
  • The metadata in the Data Catalog is automatically updated.

Delete Data

The delete operation removes all data that matches the specified key-value query.

Example Command: goedl --delete "Region=Africa"

This command permanently removes all datasets where the metadata contains “Region=Africa”.

Deleting data will also remove its associated metadata from the catalog. Ensure that you no longer need the data before executing this command.


All user operations and their options can be listed through the command: goedl --help

Addendum - Config file

This JSON file, named godl.json, is stored in the home directory because the script only searches for this file in that location. Below is an example of the config file:

{
    "config1": {
        "username": "{username}",
        "password": "{password}",
        "index": "{MyIndex}",
        "url": "{https://es.gwdg.de}"
    }
    // ... other config entries
}

:warning: Warning: Do not delete or modify the config file, as it is crucial for using the Data Catalog.

Graphcore

The Graphcore Intelligence Processing Unit (IPU) is a highly parallel processor specifically designed to accelerate Machine Learning and Artificial Intelligence applications. The IPU has a unique memory architecture that allows it to hold much more data on the processor itself than other processors. An IPU-Machine is a compute platform consisting of a 1U chassis that includes 4 IPUs and up to 260 GB of memory. IPU-Machines can also be combined to build larger compute systems. Multiple IPUs can work together on a single task, communicating through the IPU-Fabric as shown in the image below.

Example of an IPU-Machine

Source: Official Documentation

More information about Graphcore can be found on their official webpage and in this programmer’s guide.

How to Access

Access to Graphcore is currently possible through our Future Technology Platform (FTP). You can contact support and use "FTP" in the subject line. An account is required (a regular GWDG account, or an AcademicCloudID for external users), which then needs to be explicitly enabled to access the FTP. For more information, check our documentation on getting an account.

Access requests currently run through KISSKI (researchers and companies) and NHR (researchers at universities only). Please consult their documentation and, if necessary, request a project to test and utilize Graphcore. If you have questions, you can also reach out through one of our support channels.

How to Use

Graphcore provides the Poplar SDK, which helps in creating graph software for machine intelligence applications. Poplar integrates with TensorFlow and PyTorch, which allows developers to use their existing development tools and models.

In the following, we will see how to train a model using MNIST data.

  • Once you have access to the FTP nodes, you can use the following command to log in:
ssh <username>@login.ftp.hpc.gwdg.de
Note

Make sure you are connected to the VPN.

  • Download the Poplar SDK
wget -O 'poplar_sdk-rhel_8-3.4.0-69d9d03fd8.tar.gz' 'https://downloads.graphcore.ai/direct?package=poplar-poplar_sdk_rhel_8_3.4.0_69d9d03fd8-3.4.0&file=poplar_sdk-rhel_8-3.4.0-69d9d03fd8.tar.gz'
tar -xvzf poplar_sdk-rhel_8-3.4.0-69d9d03fd8.tar.gz
  • Enable Poplar SDK
cd poplar_sdk-rhel_8-3.4.0+1507-69d9d03fd8
source enable
Note

The Poplar SDK must be enabled every time you log in. To avoid doing this manually, you can add the source command to your .bashrc (see the sketch below).
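For example, assuming the SDK was extracted into your home directory (the path below is the one used in this guide and may differ on your system), the following appends the enable step to your .bashrc:

echo "source ~/poplar_sdk-rhel_8-3.4.0+1507-69d9d03fd8/enable" >> ~/.bashrc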

  • Test whether it is enabled or not by running the following command:
popc --version  # POPLAR version 3.4.0 (0a785c2cb5)
  • Running example code on Graphcore. Here, we will show how to train an MNIST model on Graphcore using PyTorch:
git clone https://github.com/graphcore/examples.git
cd examples/tutorials/simple_applications/pytorch/mnist/
salloc --partition=graphcore  # learn about salloc in the Slurm documentation (/how_to_use/slurm)
conda activate torch_env      # the Setup for PyTorch section below explains how to create this environment
python mnist_poptorch.py

Setup for TensorFlow

  • Create conda environment:
conda create -n tensor_env python=3.9
conda activate tensor_env
pip install pip==23.1
cd poplar_sdk-rhel_8-3.4.0+1507-69d9d03fd8
pip install tensorflow-2.6.3+gc3.4.0+253429+b2127bbabc0+amd_znver1-cp39-cp39-linux_x86_64.whl
pip install keras-2.6.0+gc3.4.0+253427+164c0e60-py2.py3-none-any.whl
pip install ipu_tensorflow_addons-2.6.3+gc3.4.0+253427+6dcfc49-py3-none-any.whl
Note

It must be Python version 3.9 and pip version 23.1.

  • Test the installation with the following code:
python3 -c "from tensorflow.python import ipu"
  • More information about running ML/AI pipelines using TensorFlow can be found here and here.
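Beyond the import test, a minimal sketch based on Graphcore's TensorFlow 2 API (not taken from the official examples) that configures a single IPU and builds a small Keras model inside the IPU strategy scope could look like this:

from tensorflow.python import ipu
import tensorflow as tf

# Request one IPU and configure the IPU system
cfg = ipu.config.IPUConfig()
cfg.auto_select_ipus = 1
cfg.configure_ipu_system()

# Build a small Keras model inside the IPU strategy scope
strategy = ipu.ipu_strategy.IPUStrategy()
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(4)])
    model.build(input_shape=(None, 16))
    model.summary()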

Setup for PyTorch

  • Create conda environment:
conda create -n torch_env python=3.9
conda activate torch_env
pip install pip==23.1
cd poplar_sdk-rhel_8-3.4.0+1507-69d9d03fd8
pip install poptorch-3.4.0+114286_3d9956d403_rhel_8-cp39-cp39-linux_x86_64.whl
Note

It must be Python version 3.9 and pip version 23.1.

  • Test the installation with the following code:
python3 -c "import poptorch; print(poptorch.__version__)"
  • More information about running ML/AI pipelines using PyTorch can be found here and here.
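Beyond the version check, a minimal sketch (not part of the official examples) that wraps a plain PyTorch module with PopTorch and runs a single inference step on the IPU could look like this:

import torch
import poptorch

# Wrap a plain PyTorch module for execution on the IPU;
# the model is compiled for the IPU on the first call.
model = torch.nn.Linear(16, 4)
ipu_model = poptorch.inferenceModel(model)

x = torch.randn(8, 16)
print(ipu_model(x).shape)  # expected: torch.Size([8, 4])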

Image AI

ImageAI is one of GWDG’s upcoming offerings: an AI-based image generation service. It runs on the High Performance Computing (HPC) infrastructure and uses the FLUX.1 [schnell] model to generate images from user prompts. The user data remains secure. The user-friendly web interface allows users to work with the service intuitively and get their desired images generated quickly.

The main component of this service is image generation, which can be accessed via its web interface.

Tip

You need an Academic Cloud account to access the AI Services. Use the federated login or create a new account. Details are on this page.

Current Model

Currently, the FLUX.1 [schnell] model is used for image generation. FLUX.1 [schnell] offers cutting-edge quality, matching closed-source performance. Trained with latent adversarial diffusion, it generates images in 1 to 4 steps. Released under Apache-2.0, it supports personal, scientific, and commercial use.

Web Interface and usage


The web interface can be reached here.

  • Prompt: The user can input the prompt for image generation either in English or in German.
  • Width: The desired width of the image generated can be chosen.
  • Height: The desired height of the image generated can be chosen.
  • Model: The AI model for image generation can be chosen. As of now, the FLUX.1 [schnell] model is being used.
  • Number of images: The number of images to be generated can be chosen (up to a maximum of 16).
  • Brightness: The desired brightness of the image can be selected.
  • Contrast: The desired contrast of the image can be selected.
  • Saturation: The desired saturation of the image can be selected.
  • Footer: The footer comprises “Data privacy”, “Terms of service” and “FAQ”.

Ensuring privacy and flexibility

User authentication is done via SSO. Data privacy and security are key aspects of this internal service, as it uses the GWDG clusters for inference.

Benefits of this service:

  • Private and secure alternative to other image generators as it is hosted by GWDG.
  • Faster inference using GWDG’s HPC systems.

Contact

Please reach out to our support channel here in case of any issues that you face while using the service.

Subsections of Image AI

Image-AI FAQ

Data Privacy

Are my prompts or usage data used for AI training or similar purposes?

No, your prompts and data are not used to train any AI models.

Are my prompts and generated images stored on your servers at any stage?

The user prompt is only stored on the GWDG server during the inference process itself. After the end of a session in the browser, the user’s entries are no longer available. The generated images are not stored.

What data does Image AI keep when I use the service?

A log is kept which contains the number of requests per user and the respective time stamps. The logs are stored for one year in accordance with GWDG guidelines. The collection of data for the provision of the website and the storage of the data in log files is absolutely necessary for the operation of the website. Consequently, there is no possibility for the user to object.

Availability

My institution is interested in using Image AI. Can we advertise it to our users? Would you be able to handle an additional load for XXX users?

For large institutions, please contact us directly at info@kisski.de.

Are the Image AI services free of charge?

Image AI services that are accessible to a user with an AcademicCloud account are free of charge.

Data Privacy Notice

The following English translation of the “Datenschutzerklärung” is for information purposes only. Only the German version is legally binding.

I. Responsible for data processing

The responsible party within the meaning of the General Data Protection Regulation and other national data protection laws of the member states as well as other legal data protection provisions is:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Germany
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de
Website: www.gwdg.de

II. Contact person / Data protection officer

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Germany
E-Mail: support@gwdg.de

III. Description and scope of data processing

Scope of application in the case of individual agreements

In the event of a conflict between these data protection provisions and the terms of one or more agreement(s), e.g. an order processing agreement concluded with GWDG, the terms of such agreement(s) shall always prevail. Cardinal obligations always take precedence over these general provisions. In case of doubt, you can find out from your institute which data protection guidelines apply to you.

Service Overview

ImageAI is an AI-based image generation service. The HPC (High Performance Computing) architecture is utilised by the ImageAI service, which uses the FLUX.1 [schnell] model to generate images from user data. The user data remains secure. The user-friendly web interface allows users to use the service very intuitively and create the desired image very quickly.

The main component of this service is image generation, which can be accessed via the web interface.

User authentication takes place via SSO. Data protection and security aspects of this internal service are critical as they utilise the GWDG clusters for inference.

Usage of the Image AI website

Each time our website is accessed, our system automatically collects data and information from the computer system of the accessing computer.

In order to use the Image AI services hosted by the GWDG, user input/requests are collected from the website and processed on the HPC resources. Protecting the privacy of user requests is of fundamental importance to us. For this reason, our service does not store your prompt or generated images, nor are requests or responses stored on a permanent memory at any time. The number of requests per user and the respective time stamps are recorded so that we can monitor the use of the system and perform billing. The following data is stored to fulfill the service:

  • Date of access
  • Name of the operating system installed on the accessing device
  • Name of the browser used
  • Source system via which the access was made
  • The IP address of the accessing device
  • The data is also stored in the log files of our system. This data is not stored together with other personal data of the user.

All Image-AI data displayed in the browser is only stored in the user’s browser on the client side and is only transmitted to the server for the necessary processing when the user requests it, i.e. while the data is being processed by the backend models. After the end of a session in the browser, the user input is not stored any more.

Data processing when creating accounts

When creating an account, the so-called “double opt-in” procedure is used. This means that after your registration, we send you an e-mail to the e-mail address you provided, which contains a link that you must call to confirm the creation of this account.

The following data, in addition to the above, is stored when an account is created:

  • E-mail address
  • Name and first name
  • Mobile phone number (if provided)
  • Date and time of the times of registration and confirmation

The following data can optionally be provided by you after the account has been created:

  • Additional e-mail address(es)
  • Salutation and title
  • Date of birth
  • Additional telephone number(s)
  • Postal address(es)
  • Security-specific settings (security questions and answers; two-factor authentication)

Each time you log in with an existing account on our website, our system automatically collects further data on the basis of previously mentioned information. The following data is collected during actions in the logged-in state:

  • Date of access
  • Purpose or action on the website (e.g. changing/re-setting passwords; failed log-on attempts etc.)
  • Name of the operating system installed on the accessing device
  • Name of the used browser
  • Source system via which the access was made
  • The IP address of the accessing device, with the last two bytes masked before the first storage (example: 192.168.xxx.xxx). The abbreviated IP address cannot be associated with the accessing computer.
  • An estimate of the location of the accessing client based on the IP address

IV. Purpose of the data processing

We only process our users’ personal data to the extent necessary to provide a functional website and our content and services.

The recording of user input via our website and the processing of user input on our HPC system is necessary in order to be able to generate a response using the selected Image AI service.

The data is stored in log files to ensure the functionality of the website. The data also helps us to optimise the website and ensure the security of our IT systems. The data is not used for marketing purposes in this context.

The processing of our users’ personal data only takes place regularly with the user’s consent. An exception applies in cases where prior consent cannot be obtained for factual reasons and the processing of the data is permitted by law.

As we obtain the consent of the data subject for the processing of personal data, Art. 6 para. 1 lit. a EU General Data Protection Regulation (GDPR) serves as the legal basis.

When processing personal data that is necessary for the fulfilment of a contract to which the data subject is a party, Art. 6 para. 1 lit. b GDPR serves as the legal basis. This also applies to processing operations that are necessary for the performance of pre-contractual measures.

Insofar as the processing of personal data is necessary to fulfil a legal obligation to which our company is subject, Art. 6 para. 1 lit. c GDPR serves as the legal basis.

In the event that vital interests of the data subject or another natural person require the processing of personal data, Art. 6 para. 1 lit. d GDPR serves as the legal basis.

If the processing is necessary to safeguard a legitimate interest of our company or a third party and if the interests, fundamental rights and freedoms of the data subject do not outweigh the first-mentioned interest, Art. 6 para. 1 lit. f GDPR serves as the legal basis for the processing.

VI. Retention period and mandatory data

The input is only stored on the GWDG server during the inference process itself. After the end of a session in the browser, the user’s entries are no longer available. In addition, a log is kept which contains the number of requests per user and the respective time stamps. The logs are stored for one year in accordance with GWDG guidelines. The collection of data for the provision of the website and the storage of the data in log files is absolutely necessary for the operation of the website. Consequently, there is no possibility for the user to object.

VII. Rights of data subjects

You have various rights with regard to the processing of your personal data. We list them in the following, but there are also references to the articles (GDPR) and/or paragraphs (BDSG (2018)) which provide even more detailed information.

Right of access by the data subject (Article 15 GDPR; § 34 BDSG)

You may request confirmation from the controller whether we process personal data related to you. This includes the right to obtain access to information as to whether the personal data concerning you is transferred to a third country or to an international organization.

Right to rectification (Article 16 GDPR)

You have a right of rectification and / or completion vis-à-vis the controller if the personal data processed related to you is inaccurate or incomplete. The controller must perform rectification immediately.

Right to erasure / “Right to be forgotten” / Right to restriction of processing (Article 17/18 GDPR; § 35 BDSG)

You have the right to request the immediate erasure of your personal data from the controller. As an alternative, you may request that the controller restrict the processing; the applicable restrictions are referred to in the GDPR/BDSG under the articles and/or sections mentioned.

Notification obligation regarding rectification or erasure of personal data or restriction of processing (“Right to be informed”) (Article 19 GDPR)

If you have asserted the right to rectification, erasure or restriction of processing vis-à-vis the controller, the controller is obligated to communicate such rectification or erasure of the data or restriction of processing to all recipients to whom the personal data concerning you has been disclosed, unless this proves impossible or involves disproportionate effort. You have the right vis-à-vis the controller to be informed about these recipients.

Right to data portability (Article 20 GDPR)

You have the right to receive the personal data concerning you, which you have provided to the controller, in a structured, commonly used and machine-readable format. In addition to the scenarios presented in and provisions of the GDPR, it must be noted that portability of mass data / user data is limited to technical readability. The right to data portability does not include that the data created by the user in a proprietary format is converted by the controller into a commonly used, i.e. standardized format.

Right of objection (Article 21 GDPR; § 36 BDSG)

You have the right to object to the processing if it is based solely on the controller weighing its interests (see Article 6 (1) lit. f GDPR).

Right to withdraw consent for data processing (Article 7 (3) GDPR)

You have the right to withdraw your declaration of consent under data protection law at any time. The withdrawal of consent shall not affect the lawfulness of processing based on consent before its withdrawal.

Right to complain to a supervisory authority (Article 77 GDPR)

Without prejudice to any other administrative or judicial remedy, you have the right to lodge a complaint with a supervisory authority, in particular in the Member State of your habitual residence, place of work or place of the alleged infringement if you consider that the processing of personal data relating to you infringes the GDPR.

The supervisory authority for the processing of personal data conducted by GWDG is the following:

Landesbeauftragte für den Datenschutz Niedersachsen
Postfach 221, 30002 Hannover
E-Mail: poststelle@lfd.niedersachsen

Datenschutzhinweis

Datenschutzerklärung für Image-AI | Privacy Notice

Our binding Privacy Notice is in German. A non-binding English translation might be published soon for your convenience.

I. Verantwortlich für die Datenverarbeitung

Der Verantwortliche im Sinne der Datenschutz-Grundverordnung und anderer nationaler Datenschutzgesetze der Mitgliedsstaaten sowie sonstiger datenschutzrechtlicher Bestimmungen ist die:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de
Website: www.gwdg.de

II. Ansprechpartner / Datenschutzbeauftragter

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de

III. Beschreibung und Umfang der Datenverarbeitung

Geltungsbereich im Falle individueller Vereinbarungen

Im Falle eines Konflikts zwischen diesen Datenschutzbestimmungen und den Bedingungen einer bzw. mehrerer Vereinbarung(en), z.B. einem mit der GWDG geschlossenem Auftragsverarbeitungsvertrags, zwischen sind stets die Bedingungen dieser Vereinbarung(en) ausschlaggebend. Kardinalpflichten genießen stets Vorrang vor diesen allgemeinen Bestimmungen. Sie können im Zweifelsfall bei Ihrem Institut in Erfahrung bringen, welche Datenschutzrichtlinien für Sie gelten.

Übersicht über den Service

ImageAI ist ein KI-basierter Bild Generierungsdienst. Die HPC-Architektur (High Performance Computing) wird durch den ImageAI-Dienst genutzt, der das FLUX.1 [schnell] Modell, um Bilder aus Benutzerdaten zu generieren. Die Benutzerdaten bleiben sicher. Die benutzerfreundliche Weboberfläche ermöglicht es den Nutzern, den Dienst sehr intuitiv zu nutzen und das gewünschte Bild sehr schnell zu erstellen.

Die Hauptkomponente dieses Dienstes ist die Bilderzeugung, auf die über die Weboberfläche zugegriffen werden kann.

Benutzerauthentifizierung erfolgt über SSO. Der Datenschutz und die Sicherheit Aspekte dieses internen Dienstes sind entscheidend, da sie die GWDG Clusters für Inferenz nutzen.

Nutzung der Image-AI Webseite

Bei jedem Aufruf von https://image-ai.academiccloud.de/ erfasst das System automatisiert Daten und Informationen vom Computersystem des aufrufenden Rechners.

Folgende Daten werden hierbei in jedem Fall erhoben:

  • Datum des Zugriffs
  • Name des auf dem zugreifenden Gerät installierten Betriebssystems
  • Name des verwendeten Browsers
  • Quellsystem, über welches der Zugriff erfolgt ist
  • Die IP-Adresse des zugreifenden Geräts

Die Daten werden ebenfalls in den Logfiles unseres Systems gespeichert. Eine Speicherung dieser Daten zusammen mit anderen personenbezogenen Daten des Nutzers findet nicht statt.

Sämtliche im Browser angezeigten Daten von Image-AI werden nur Clientseitig im Browser der Nutzenden vorgehalten und nur bei der Benutzer-gewünschten Anfrage für die notwendige Verarbeitung an die Server übermittelt, d.h. während die Daten von den Backend-Modellen verarbeitet werden. Nach dem Ende einer Session im Browser sind keine Eingaben des Nutzers mehr vorhanden.

Einrichten von Accounts

Beim Erstellen eines Kontos wird das so genannte „Double-Opt-In“-Verfahren verwendet. Das bedeutet, dass wir Ihnen nach Ihrer Anmeldung eine E-Mail an die von Ihnen angegebene E-Mail-Adresse senden, die einen Link enthält, den Sie aufrufen müssen, um die Einrichtung dieses Kontos zu bestätigen.

Bei der Einrichtung eines Kontos werden neben den oben genannten Daten auch folgende Daten gespeichert:

  • E-Mail Adresse
  • Name und Vorname
  • Mobiltelefonnummer (falls angegeben)
  • Datum und Uhrzeit der Anmeldung und Bestätigung

Die folgenden Daten können optional von Ihnen nach Erstellung des Kontos angegeben werden:

  • Zusätzliche E-Mail Adresse(n)
  • Anrede und Titel
  • Geburtsdatum
  • Zusätzliche Telefonnummer(n)
  • Postanschrift(en)
  • Sicherheitsspezifische Einstellungen (Sicherheitsfragen und -antworten; Zwei-Faktor-Authentifizierung)

Bei jedem Einloggen mit einem bestehenden Konto auf unserer Website erhebt unser System automatisch weitere Daten auf der Grundlage der zuvor genannten Informationen. Die folgenden Daten werden bei Aktionen im eingeloggten Zustand erhoben:

  • Datum des Zugriffs
  • Zweck oder Aktion auf der Website (z.B. Ändern/Neusetzen von Passwörtern; fehlgeschlagene Anmeldeversuche usw.)
  • Name des auf dem zugreifenden Gerät installierten Betriebssystems
  • Name des verwendeten Browsers
  • Quellsystem, über das der Zugriff erfolgt ist
  • Die IP-Adresse des zugreifenden Geräts, wobei die letzten beiden Bytes vor dem ersten Speicherplatz maskiert werden (Beispiel: 192.168.xxx.xxx). Die abgekürzte IP-Adresse kann nicht mit dem zugreifenden Computer in Verbindung gebracht werden.
  • Eine Schätzung des Standorts des zugreifenden Clients auf der Grundlage der IP-Adresse

IV. Zweck der Datenverarbeitung

Wir verarbeiten personenbezogene Daten unserer Nutzer grundsätzlich nur soweit dies zur Bereitstellung einer funktionsfähigen Website sowie unserer Inhalte und Leistungen erforderlich ist.

Die Aufnahme von Nutzereingaben über unsere Website und die Verarbeitung von User-Inputs auf unserem HPC- System ist notwendig, um mit dem gewählten Image AI Service eine Antwort generieren zu können.

Die Speicherung in Logfiles erfolgt, um die Funktionalität der Website zu gewährleisten. Darüber hinaus helfen uns die Daten, die Website zu optimieren und die Sicherheit unserer IT-Systeme zu gewährleisten. Eine Nutzung der Daten zu Marketingzwecken findet in diesem Zusammenhang nicht statt.

Die Verarbeitung personenbezogener Daten unserer Nutzer erfolgt regelmäßig nur nach Einwilligung des Nutzers. Eine Ausnahme gilt in solchen Fällen, in denen eine vorherige Einholung einer Einwilligung aus tatsächlichen Gründen nicht möglich ist und die Verarbeitung der Daten durch gesetzliche Vorschriften gestattet ist.

V. Rechtsgrundlage für die Verarbeitung personenbezogener Daten

Soweit wir für Verarbeitungsvorgänge personenbezogener Daten eine Einwilligung der betroffenen Person einholen, dient Art. 6 Abs. 1 lit. a EU-Datenschutzgrundverordnung (DSGVO) als Rechtsgrundlage.

Bei der Verarbeitung von personenbezogenen Daten, die zur Erfüllung eines Vertrages, dessen Vertragspartei die betroffene Person ist, erforderlich ist, dient Art. 6 Abs. 1 lit. b DSGVO als Rechtsgrundlage. Dies gilt auch für Verarbeitungsvorgänge, die zur Durchführung vorvertraglicher Maßnahmen erforderlich sind.

Soweit eine Verarbeitung personenbezogener Daten zur Erfüllung einer rechtlichen Verpflichtung erforderlich ist, der unser Unternehmen unterliegt, dient Art. 6 Abs. 1 lit. c DSGVO als Rechtsgrundlage.

Für den Fall, dass lebenswichtige Interessen der betroffenen Person oder einer anderen natürlichen Person eine Verarbeitung personenbezogener Daten erforderlich machen, dient Art. 6 Abs. 1 lit. d DSGVO als Rechtsgrundlage.

Ist die Verarbeitung zur Wahrung eines berechtigten Interesses unseres Unternehmens oder eines Dritten erforderlich und überwiegen die Interessen, Grundrechte und Grundfreiheiten des Betroffenen das erstgenannte Interesse nicht, so dient Art. 6 Abs. 1 lit. f DSGVO als Rechtsgrundlage für die Verarbeitung.

VI. Aufbewahrungzeitraum und Unbedingt erforderliche Daten

Die Eingaben werden nur während des eigentlichen Inferenzprozesses auf dem GWDG-Server gespeichert. Nach Beendigung einer Sitzung im Browser sind die Eingaben des Nutzers nicht mehr verfügbar. Darüber hinaus wird ein Log angelegt, das die Anzahl der Anfragen pro Nutzer und die jeweiligen Zeitstempel enthält. Die Logs werden nach den Richtlinien der GWDG für ein Jahr gespeichert.

Die Sammlung von Daten für die Bereitstellung der Website und die Speicherung der Daten in Log-Files ist für den Betrieb der Website unbedingt erforderlich. Folglich gibt es keine Möglichkeit für den Nutzer zu widersprechen.

VII. Rechte der betroffenen Personen

Ihnen stehen verschiedene Rechte in Bezug auf die Verarbeitung ihrer personenbezogenen Daten zu. Nachfolgend sind diese aufgeführt, zusätzlich sind Verweise auf die Artikel (DSGVO) bzw. Paragraphen (BDSG (2018)) mit detaillierteren Informationen angegeben.

Auskunftsrecht (DSGVO Art. 15, BDSG §34)

Sie können von dem Verantwortlichen eine Bestätigung darüber verlangen, ob personenbezogene Daten, die Sie betreffen, von uns verarbeitet werden. Dies schließt das Recht ein, Auskunft darüber zu verlangen, ob die Sie betreffenden personenbezogenen Daten in ein Drittland oder an eine internationale Organisation übermittelt werden.

Recht auf Berichtigung (DSGVO Art. 16)

Sie haben ein Recht auf Berichtigung und/oder Vervollständigung gegenüber dem Verantwortlichen, sofern die verarbeiteten personenbezogenen Daten, die Sie betreffen, unrichtig oder unvollständig sind. Der Verantwortliche hat die Berichtigung unverzüglich vorzunehmen.

Recht auf Löschung / „Recht auf Vergessen werden“ / Recht auf Einschränkung der Verarbeitung (DSGVO Art. 17, 18, BDSG §35)

Sie haben das Recht, die unverzügliche Löschung ihrer personenbezogenen Daten vom Verantwortlichen zu verlangen. Alternativ können Sie die Einschränkung der Verarbeitung vom Verantwortlichen verlangen. Einschränkungen sind in der DSGVO und dem BDSG unter den genannten Artikeln bzw. Paragraphen genannt.

Recht auf Unterrichtung (DSGVO Art. 19)

Haben Sie das Recht auf Berichtigung, Löschung oder Einschränkung der Verarbeitung gegenüber dem Verantwortlichen geltend gemacht, ist dieser verpflichtet, allen Empfängern, denen die Sie betreffenden personenbezogenen Daten offengelegt wurden, diese Berichtigung oder Löschung der Daten oder Einschränkung der Verarbeitung mitzuteilen, es sei denn, dies erweist sich als unmöglich oder ist mit einem unverhältnismäßigen Aufwand verbunden. Ihnen steht gegenüber dem Verantwortlichen das Recht zu, über diese Empfänger unterrichtet zu werden.

Recht auf Datenübertragbarkeit (DSGVO Art. 20)

Sie haben das Recht, die Sie betreffenden personenbezogenen Daten, die Sie dem Verantwortlichen bereitgestellt haben, in einem strukturierten, gängigen und maschinenlesbaren Format zu erhalten. Ergänzend zur DSGVO ist festzustellen, dass sich die Datenübertragbarkeit bei Massendaten / Nutzerdaten ausschließlich auf die technische Lesbarkeit beschränkt. Das Recht auf Datenübertragbarkeit umfasst nicht, dass die vom Nutzer in einem proprietären Format erstellen Daten vom Verantwortlichen in ein “gängiges”, d.h. standardisiertes Format konvertiert werden.

Widerspruchsrecht (DSGVO Art. 21, BDSG §36)

Sie haben das Recht, Widerspruch gegen die Verarbeitung einzulegen, wenn diese ausschließlich auf Basis einer Abwägung des Verantwortlichen geschieht (vgl. DSGVO Art. 6 Abs. 1 lit f).

Recht auf Widerruf der datenschutzrechtlichen Einwilligungserklärung (DSGVO Art. 7 Abs. 3)

Sie haben das Recht, Ihre datenschutzrechtliche Einwilligungserklärung jederzeit zu widerrufen. Durch den Widerruf der Einwilligung wird die Rechtmäßigkeit der aufgrund der Einwilligung bis zum Widerruf erfolgten Verarbeitung nicht berührt.

Recht auf Beschwerde bei einer Aufsichtsbehörde (DSGVO Art. 77)

Unbeschadet eines anderweitigen verwaltungsrechtlichen oder gerichtlichen Rechtsbehelfs steht Ihnen das Recht auf Beschwerde bei einer Aufsichtsbehörde, insbesondere in dem Mitgliedstaat ihres Aufenthaltsorts, ihres Arbeitsplatzes oder des Orts des mutmaßlichen Verstoßes, zu, wenn Sie der Ansicht sind, dass die Verarbeitung der Sie betreffenden personenbezogenen Daten gegen die DSGVO verstößt.

Landesbeauftragte für den Datenschutz Niedersachsen
Postfach 221, 30002 Hannover
E-Mail: poststelle@lfd.niedersachsen

Terms of Use (English)

T&C’s - Terms of Use | Allgemeine Geschäftsbedingungen

This English version of the Terms of Use is provided here for your convenience. Please keep in mind that binding are only the Terms of Use in the German language.

Terms of Use for Image-AI

§ 1. Introduction

§ 1.1 Welcome to our Image-AI Service (the “Service”). By using the Service, you agree to comply with and be bound by the following terms and conditions ("Terms"). The Terms govern the business relationship between the platform operator (hereinafter referred to as the ‘Platform Operator’) and the users (hereinafter referred to as the ‘Users’) of ImageAI.

§ 1.2 Platform Operator is the GWDG; Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen.

§ 1.3 The Terms apply in the version valid at the time of the User’s registration. The Platform Operator may update these Terms at any time in the future.

§ 1.4 Please read these Terms carefully before using the Service. By using the Service you acknowledge that you have read and agreed to these Terms.


§ 2. Service Description

§ 2.1 The Service refers to the Application Programming Interface (ImageAI API) and WebUI provided by the Platform Operator (GWDG) to signed-in Users of AcademicCloud, as set out below.

§ 2.2 The Service provides Users with the ability to:

  • Generate images from prompts: Utilize AI technology to generate images from user input prompts.

§ 2.3 Service Specification: The Service is composed of the following parts:

  • Processing User Prompts and Parameters: This part processes prompts input by users along with their desired parameter settings for image generation.
  • Function: We provide a web server via the GWDG ESX service that hosts a WebUI. Users can log in via SSO and interact with the system through a user interface.

§ 3. Rights and obligations of users

§ 3.1 Users are entitled to use the platform within the scope of the usage permission and in accordance with these Terms of Use. Any use deviating from this requires a special permission.

§ 3.2 The Users are obliged to:

  • provide accurate and lawful user prompts for processing by the Service.
  • ensure that their use of Service does not violate any applicable laws, regulations, or rights of others.
  • ensure that their use of Service is pursuant to these Terms, in conjunction to the Terms of Use of Academic Cloud, where User’s specific rights and obligations are stipulated. The latter can be found here: AcademicCloud: Terms of Use.
  • not misuse the Service in any way that could harm, disable, overburden, or impair the Service.

§ 4. Prohibitions

§ 4.1 Users are prohibited from using this service for transmitting, generating, and distributing content (input and output) that:

  • depicts child pornography or sexual abuse, including the falsification, deception or imitation thereof,
  • is sexually explicit and used for non-educational or non-scientific purposes,
  • is discriminatory, promotes violence, hate speech or illegal activities,
  • violates data protection laws, including the collection or dissemination of personal data without consent,
  • is fraudulent, misleading, harmful or deceptive,
  • promotes self-harm, harassment, bullying, violence and terrorism,
  • promotes illegal activities or violate intellectual property rights and other legal and ethical boundaries in online behavior,
  • attempts to circumvent our security measures or take actions that willfully violate established policies,
  • unfairly or adversely affects individuals, particularly with respect to sensitive or proprietary characteristics,
  • processes sensitive data or confidential information where the legal requirements are not met.

§ 4.2 Users are prohibited from engaging in activities including:

  • Reverse engineering, decompiling, or disassembling the technology,
  • Unauthorized activities such as spamming, malware distribution, or disruptive behaviors that compromise service quality,
  • Modifying, copying, renting, selling, or distributing our service,
  • Engaging in tracking or monitoring individuals without their explicit consent.

§ 4.3 In case that any uncertainties occur regarding security or data protection within the service, please contact the data protection officer at the mailbox support@gwdg.de with the title “Datenschutz ImageAI”.

§ 4.4 For research purposes, usage scenarios under § 4.1 may be permitted in certain cases. Written agreements must be concluded between Users and the GWDG for such cases limited to specific purposes of use.


§ 5. AI Output

§ 5.1 By using the Service, Users are aware that the output is AI-generated. The Service uses advanced Artificial Intelligence (AI) technology to generate images from user prompts and process user instructions and parameters.

§ 5.2 However, the AI-generated outputs may not always be accurate, complete, or reliable. The Users acknowledge and agree that:

  • The AI function of image generation is intended to support users, but shall not be used on its own to accomplish critical tasks.
  • The accuracy of the AI-generated content may vary depending on factors such as the quality of the input, the complexity of the language, selected parameters and of instructions, and the context.
  • The risk of ‘Hallucinations’ is present in the Service, as in most AI systems that perform generalized and varied tasks. In this sense, the images generated by the AI may contain false or misleading information presented as facts.
  • The risk of bias also exists with the service. The outputs generated by the AI could be biased due to the training data.
  • Human oversight and control measures by Users are deemed necessary to ensure that the output is reliable and that it corresponds to your input prompt.

§ 6. Privacy and Data Security

§ 6.1 We are committed to protecting Users’ personal data. By using the service, Users consent to the collection, use and storage of data in accordance with our Privacy Notices.

§ 6.2 The GWDG Data Protection Notice as well as the Academic Cloud Privacy Notice can be found here:


§ 7. Intellectual Property

§ 7.1 The Platform Operator owns all intellectual property rights in the Service, including but not limited to software, algorithms, trade secrets and AI-generated content. Users are granted a limited, non-exclusive and non-transferable license to use the Service for the intended purposes.

§ 7.2 Users are required to adhere to copyright and proprietary notices and licenses, preventing the unauthorized distribution or reproduction of copyrighted content. The platform operator reserves the right to remove or block any content believed to infringe copyrights and to deactivate the accounts of alleged infringers.


§ 8. Liability of Users of Service

§ 8.1 The Users are liable for all damages and losses suffered by the Platform Operator as a result of punishable or unlawful use of the Service or of the authorization to use the Service, or through a culpable breach of the User’s obligations arising from these Terms.

§ 8.2 Users shall also be liable for damage caused by use by third parties within the scope of the access and usage options granted to them, insofar as they are accountable for this third-party use, in particular if they have passed on their login credentials to third parties.

§ 8.3 If the Platform Operator is held liable by third parties for damages, default or other claims arising from unlawful or criminal acts of the Users, the Users shall indemnify the Platform Operator against all resulting claims. The Platform Operator shall sue the Users if the third party takes legal action against the Platform Operator on the basis of these claims.

§ 8.4 Users shall be solely liable for the content they upload and generate themselves (User-generated-content) through the use of ImageAI. In this sense, Platform Operator does not bear any liability for violations of law occurring by such content.


§ 9. Liability of Platform Operator

§ 9.1 The Platform Operator does not guarantee that the Service will function uninterrupted and error-free at all times, neither expressly nor implicitly, and hereby rejects this. The loss of data due to technical faults or the disclosure of confidential data through unauthorized access by third parties cannot be excluded.

§ 9.2 The Platform Operator is not liable for the contents, in particular for the accuracy, completeness or up-to-date validity of information and data or of the output; it merely provides access to the use of this information and data.

§ 9.3 The Platform Operator maintains a mere technical, automatic and passive stance towards the content contributed by Users and does not play any active role in controlling, initiating or modifying that content and therefore cannot be held liable for cases where such content is unlawful.

§ 9.4 The Platform Operator shall only be liable in the event of intent or gross negligence by its employees, unless material duties are culpably breached, compliance with which is of particular importance for achieving the purpose of the contract (cardinal obligations). In this case, the liability of the Platform Operator is limited to the typical damage foreseeable at the time of conclusion of the mutual contract of use, unless there is intent or gross negligence.

§ 9.5 User claims for damages are excluded. Exempted from this are claims for damages by users arising from injury to life, physical integrity, health or from the breach of essential contractual obligations (cardinal obligations), as well as liability for other damages based on an intentional or grossly negligent breach of duty by the platform operator, its legal representatives or vicarious agents. Essential contractual obligations are those whose fulfillment is necessary to achieve the objective of the contract.


§ 10. Termination of Access

§ 10.1 The Platform Operator reserves the right to suspend or terminate the access of Users to the Service at any time, without notice, for any reason, including but not limited to the breach of these Terms.


§ 11. Final Provisions

§ 11.1 The Terms shall remain binding and effective in their remaining parts even if individual parts are legally invalid. The invalid parts shall be replaced by the statutory provisions, where applicable. However, if this would constitute an unreasonable burden for one of the contracting parties, the contract as a whole shall become invalid.

§ 11.2 Inquiries regarding these Terms can be directed to:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel.: +49 551 39-30001
E-Mail: support@gwdg.de

  • By using the Service, you acknowledge that you have read, understood, and agree to be bound by these Terms.

Terms of Use (German)

AGBs - Allgemeine Geschäftsbedingungen | Terms of Use

Our binding Terms of Use are in German. A non-binding English translation might be published soon for your convenience.

Allgemeine Geschäftsbedingungen für Image-AI

§ 1. Einführung

§ 1.1 Willkommen bei Image-AI-Dienst (der „Dienst“). Die folgenden Allgemeine Geschäftsbedingungen („Bedingungen“) regeln die Geschäftsbeziehung zwischen dem Plattformbetreiber (im Folgenden „Plattformbetreiber“) und den Nutzern (im Folgenden „Nutzer“) von ImageAI.

§ 1.2 Plattformbetreiber ist die GWDG; Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen.

§ 1.3 Die AGB gelten in der zum Zeitpunkt der Registrierung der Nutzer gültigen Fassung. Der Betreiber der Plattform kann diese Bedingungen jederzeit in der Zukunft aktualisieren.

§ 1.4 Bitte lesen Sie diese Bedingungen vor der Nutzung des Dienstes sorgfältig durch. Durch die Nutzung des Dienstes bestätigen Sie, dass Sie diese Bedingungen gelesen haben und mit ihnen einverstanden sind.


§ 2. Beschreibung des Dienstes

§ 2.1 Der Dienst bezieht sich auf die Anwendungsprogrammierschnittstelle (ImageAI API) und WebUI, die der Plattformbetreiber (GWDG) den angemeldeten Nutzern von AcademicCloud zur Verfügung stellt, wie im Folgenden dargelegt.

§ 2.2 Der Dienst bietet den Nutzern die Möglichkeit:

  • Bilder aus Prompts zu generieren: Nutzung von KI-Technologie zur Erstellung von Bildern auf Basis von Benutzeranweisungen.

§ 2.3 Spezifikation des Dienstes: Der Dienst besteht aus den folgenden Teilen:

  • Verarbeitung von Benutzeranweisungen und Parametern: Dieser Teil verarbeitet die vom Benutzer eingegebenen Prompts zusammen mit den gewünschten Parametereinstellungen für die Bilderzeugung.
  • Funktion: Wir stellen einen Webserver über den GWDG ESX-Dienst bereit, der eine WebUI hostet. Die Nutzer können sich über SSO anmelden und über eine Benutzeroberfläche mit dem System interagieren.

§ 3. Rechte und Pflichten der Nutzer

§ 3.1 Die Nutzer sind berechtigt, die Plattform im Rahmen der Nutzungserlaubnis und nach Maßgabe dieser Nutzungsbedingungen zu nutzen. Jede hiervon abweichende Nutzung bedarf einer besonderen Nutzungserlaubnis.

§ 3.2 Die Nutzer sind verpflichtet:

  • genaue und rechtmäßige Texteingaben für die Verarbeitung durch den Dienst bereitzustellen.
  • sicherzustellen, dass ihre Nutzung des Dienstes nicht gegen geltende Gesetze, Vorschriften oder Rechte Dritter verstößt.
  • sicherzustellen, dass die Nutzung gemäß den vorliegenden Bedingungen in Verbindung mit den Nutzungsbedingungen von Academic Cloud erfolgt, in denen ihre spezifischen Rechte und Pflichten festgelegt sind. Diese sind hier zu finden: AcademicCloud: Nutzungsbedingungen.
  • den Dienst nicht in einer Weise zu verwenden, die den Dienst schädigen, deaktivieren, überlasten oder beeinträchtigen könnte.

§ 4. Verbote

§ 4.1 Den Nutzern ist verboten, diesen Dienst zur Übertragung, Erzeugung und Verbreitung von Inhalten (Eingabe und Ausgabe) zu verwenden, die:

  • Kinderpornographie oder sexuellen Missbrauch darstellen, auch die Fälschung, Täuschung oder Nachahmung desselben,
  • sexuell explizit sind und für nicht-bildende oder nicht-wissenschaftliche Zwecke eingesetzt werden,
  • diskriminierend sind, Gewalt, Hassreden oder illegale Aktivitäten fördern,
  • Datenschutzgesetze verletzen, einschließlich der Sammlung oder Verbreitung personenbezogener Daten ohne Zustimmung,
  • betrügerisch, irreführend, schädlich oder täuschend sind,
  • Selbstverletzung, Belästigung, Mobbing, Gewalt und Terrorismus fördern,
  • illegale Aktivitäten fördern oder geistige Eigentumsrechte und andere rechtliche und ethische Grenzen im Online-Verhalten verletzen,
  • versuchen, unsere Sicherheitsmaßnahmen zu umgehen oder Handlungen zu veranlassen, die etablierte Richtlinien vorsätzlich verletzen,
  • Einzelpersonen, insbesondere in Bezug auf sensible oder geschützte Merkmale, ungerechtfertigt oder nachteilig beeinflussen könnten,
  • sensible Daten oder vertrauliche Information verarbeiten, insofern die rechtliche Rahmenbedingungen nicht erfüllt sind.

§ 4.2 Den Nutzern ist verboten, folgende Aktivitäten durchzuführen:

  • Reverse-Engineering, Dekompilierung oder Disassemblierung der Technologie;
  • Unautorisierte Aktivitäten wie Spamming, Malware-Verbreitung oder störende Verhaltensweisen, die die Dienstqualität beeinträchtigen;
  • Modifizierung, Kopie, Vermietung, Verkauf oder Verbreitung unseres Dienstes;
  • Nachverfolgung oder Überwachung von Einzelpersonen ohne deren ausdrückliche Zustimmung.

§ 4.3 Sollten Unsicherheiten bzgl. Sicherheit oder Datenschutz des Dienstes vorhanden sein, so bitten wir um Kontakt des Datenschutzbeauftragen unter dem Mailpostfach support@gwdg.de mit dem Titel “Datenschutz ImageAI”.

§ 4.4 Für Forschungszwecke könnten in bestimmten Fällen die Nutzungsszenarien in § 4.1 gestattet sein. Hierbei müssen schriftliche Absprachen zwischen den Nutzern und der GWDG für den Einsatszweck getroffen werden.


§ 5. KI Output

§ 5.1 Durch die Nutzung des Dienstes sind die Nutzer sich bewusst, dass die Ausgabe des Dienstes KI-generiert ist. Der Dienst nutzt fortschrittliche Technologie der Künstlichen Intelligenz (KI), um Bilder aus Prompts zu generieren und Benutzeranweisungen und Parametern zu verarbeiten.

§ 5.2 Die von der KI erzeugten Ergebnisse sind jedoch möglicherweise nicht immer akkurat, vollständig oder zuverlässig. Die Nutzer erkennen dies an und stimmen zu:

  • Die KI-Funktionen für Bilderzeugung dienen der Unterstützung der Nutzer, sollten aber nicht ausschließlich für kritische Aufgaben verwendet werden.
  • Die Genauigkeit der von der KI generierten Inhalte kann in Abhängigkeit von Faktoren wie der Qualität des Inputs, der Komplexität der Sprache und der gewählten Parameter, Anweisungen und dem Kontext variieren.
  • Wie bei den meisten KI-Systemen, die allgemeine und unterschiedliche Aufgaben erfüllen, besteht auch bei diesem Dienst die Gefahr von „Halluzinationen“. In diesem Sinne können die von der KI generierten Bilder falsche oder irreführende Informationen enthalten, die als Fakten dargestellt werden.
  • Das Risiko von Vorurteilen (Bias) besteht auch beim Dienst. Die von der KI generierten Outputs könnten aufgrund der Trainingsdaten voreingenommen sein.
  • Menschliche Überwachung und Kontrollen seitens der Nutzer werden als erforderlich gesehen, um sicherzustellen, dass die Outputs zuverlässig sind und den Nutzer-Inputs entsprechen.

§ 6. Datenschutz und Datensicherheit

§ 6.1 Wir verpflichten uns, die Daten der Nutzer zu schützen. Durch die Nutzung des Dienstes stimmen die Nutzer der Sammlung, Verwendung und Speicherung von Daten in Übereinstimmung mit unseren Datenschutzerklärungen zu.

§ 6.2 Die GWDG Datenschutzerklärung sowie Privacy Notice des AcademicCloud sind hier zu finden:


§ 7. Urheberrecht

§ 7.1 Der Plattformbetreiber ist Eigentümer aller geistigen Eigentumsrechte an dem Dienst, einschließlich, aber nicht beschränkt auf Software, Algorithmen, Geschäftsgeheimnisse und KI generierte Inhalte. Den Nutzern wird eine begrenzte, nicht exklusive und nicht übertragbare Lizenz zur Nutzung des Dienstes für die vorgesehenen Zwecke gewährt.

§ 7.2 Die Nutzer sind verpflichtet, die Urheberrechts- und Eigentumshinweise und Lizenzen zu beachten und die unerlaubte Verbreitung oder Vervielfältigung von urheberrechtlich geschützten Inhalten zu verhindern. Der Plattformbetreiber behaltet sich das Recht vor, Inhalte, von denen er annimmt, dass sie Urheberrechte verletzen, zu entfernen oder zu sperren und die Konten von mutmaßlichen Verstößen zu deaktivieren.


§ 8. Haftung der Nutzer

§ 8.1 Die Nutzer haften für alle Schäden und Nachteile, die dem Plattformbetreiber durch eine strafbare oder rechtswidrige Nutzung der Dienste oder der Nutzungsberechtigung oder durch eine schuldhafte Verletzung der Pflichten der Nutzer aus diesen Nutzungsbedingungen entstehen.

§ 8.2 Die Nutzer haften auch für Schäden, die durch die Nutzung durch Dritte im Rahmen der ihnen eingeräumten Zugangs- und Nutzungsmöglichkeiten entstehen, sofern sie diese Nutzung durch Dritte zu vertreten haben, insbesondere wenn sie ihre Benutzerkennung an Dritte weitergegeben haben.

§ 8.3 Wird der Plattformbetreiber von Dritten auf Schadensersatz, Verzug oder sonstige Ansprüche aus rechtswidrigen oder strafbaren Handlungen der Nutzerinnen in Anspruch genommen, so haben die Nutzerinnen den Plattformbetreiber von allen daraus resultierenden Ansprüchen freizustellen. Der Plattformbetreiber wird die Nutzer verklagen, wenn der Dritte den Plattformbetreiber aufgrund dieser Ansprüche gerichtlich in Anspruch nimmt.

§ 8.4 Die Nutzer haften allein für die Inhalte, die sie durch die Nutzung des Dienstes selbst erstellen und hochladen (User-generated-content).


§ 9. Haftung des Plattformbetreibers

§ 9.1 Der Plattformbetreiber übernimmt keine Gewähr dafür, dass die Plattform jederzeit unterbrechungs- und fehlerfrei funktioniert, weder ausdrücklich noch stillschweigend, und lehnt dies hiermit ab. Der Verlust von Daten infolge technischer Störungen oder die Offenlegung vertraulicher Daten durch unbefugten Zugriff Dritter kann nicht ausgeschlossen werden.

§ 9.2 Der Plattformbetreiber haftet nicht für Inhalte, insbesondere nicht für die Richtigkeit, Vollständigkeit oder Aktualität der Informationen und Daten; er vermittelt lediglich den Zugang zur Nutzung dieser Informationen und Daten.

§ 9.3 Der Plattformbetreiber nimmt gegenüber den von den Nutzern eingestellten Inhalten eine Tätigkeit rein technischer, automatischer und passiver Art ein und besitzt weder Kenntnis noch Kontrolle über die übermittelten oder gespeicherten Informationen, noch verändert er diese Informationen. Der Plattformbetreiber kann daher nicht haftbar sein für Fälle, in denen die Inhalte der Nutzer rechtswidrig sind.

§ 9.4 Im Übrigen haftet der Plattformbetreiber nur bei Vorsatz oder grober Fahrlässigkeit seiner Mitarbeitenden, es sei denn, es werden schuldhaft wesentliche Pflichten verletzt, deren Einhaltung für die Erreichung des Vertragszwecks von besonderer Bedeutung ist (Kardinalpflichten). In diesem Fall ist die Haftung des Plattformbetreibers auf den bei Abschluss des gegenseitigen Nutzungsvertrages vorhersehbaren typischen Schaden begrenzt, soweit nicht Vorsatz oder grobe Fahrlässigkeit vorliegt.

§ 9.5 Ansprüche der Nutzerinnen auf Schadensersatz sind ausgeschlossen. Hiervon ausgenommen sind Schadensersatzansprüche der Nutzerinnen aus der Verletzung des Lebens, des Körpers, der Gesundheit oder aus der Verletzung wesentlicher Vertragspflichten (Kardinalpflichten) sowie die Haftung für sonstige Schäden, die auf einer vorsätzlichen oder grob fahrlässigen Pflichtverletzung des Plattformbetreibers, seiner gesetzlichen Vertreter oder Erfüllungsgehilfen beruhen. Wesentliche Vertragspflichten sind solche, deren Erfüllung zur Erreichung des Ziels des Vertrags notwendig ist.


§ 10. Auflösung des Zugangs

§ 10.1 Der Plattformbetreiber behält sich das Recht vor, den Zugang von Nutzern zu dem Dienst jederzeit und ohne Vorankündigung aus irgendeinem Grund zu sperren oder zu löschen, unter anderem bei einem Verstoß gegen die vorliegenden Geschäftsbedingungen.


§ 11. Schlussbestimmungen

§ 11.1 Die Allgemeinen Geschäftsbedingungen bleiben auch bei rechtlicher Unwirksamkeit einzelner Punkte in ihren übrigen Teilen verbindlich und wirksam. Anstelle der unwirksamen Punkte treten, soweit vorhanden, die gesetzlichen Vorschriften. Soweit dies für eine Vertragspartei eine unzumutbare Härte darstellen würde, wird der Vertrag jedoch im Ganzen unwirksam.

§ 11.2 Nachfragen zu diesen Bedingungen können unter folgender Anschrift übermittelt werden:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel.: +49 551 39-30001
E-Mail: support@gwdg.de

  • Durch die Nutzung des Dienstes bestätigen Sie, diese Bedingungen gelesen und verstanden zu haben, und erklären sich damit einverstanden, an sie gebunden zu sein.

JupyterHub

The JupyterHPC service offers a simple, interactive access to the HPC Cluster’s resources.

JupyterHPC is available to all HPC users who have an HPC Project Portal account. If you do not have an HPC Project Portal account, please consult Getting an Account and User Account Types.

For each computing project you get an extra username (which looks like u12345). You will be able to log in to JupyterHPC with your AcademicCloud ID as soon as you have been added to a computing project. When logged in, you can select the project and corresponding username you want to work with in the top-left corner of the JupyterHPC interface. The options follow the pattern “HPC Project (Username)”.

Troubleshooting

Info

If you contact GWDG support about JupyterHPC, please use one of the HPC-specific support addresses or make it clear that you are talking about an HPC service, so that your request can be handled quickly and efficiently without a detour to the GWDG Jupyter Cloud team.

If you get an error message and JupyterHub does not start, check your home storage: a log of the container execution is created in your HPC home directory in the file current.jupyterhub.notebook.log. Please refer to this file if you encounter problems with JupyterHPC and attach it to your email if you contact customer support. Please note that the settings you use to start your JupyterHPC session will be saved in your browser.
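For example, you can inspect the most recent entries of that log from a login shell:

tail -n 50 ~/current.jupyterhub.notebook.log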

Also check that you are below the quota for your home directory.

JupyterHub is not suitable for extensive calculations. For calculations that need compute resources over a long duration, please submit a batch job to the appropriate Slurm partition instead of using JupyterHPC (see documentation).

In some rare cases, for example if the underlying container freezes up or the node fails, it could be necessary to manually cancel the JupyterHPC Job. To do so, connect to the cluster via SSH, use squeue --me to find the job id of the JupyterHPC job, and then use scancel <jobid> to cancel the job.
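Put together, cancelling a stuck session could look like this (username, login node and job ID are placeholders):

ssh <username>@<login-node>   # log in to the cluster
squeue --me                   # find the job ID of the JupyterHPC job
scancel <jobid>               # cancel that job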

Spawner Interface


HPC Type


JupyterHPC offers two categories of apps: Jupyter and Desktop.
Jupyter-type apps allow you to use the resources of the cluster as a web service, e.g. by directly running Jupyter notebooks.
Desktop-type apps use VNC (Virtual Network Computing) to stream a full desktop environment over the web to your browser, allowing you to use the HPC resources as if they were your personal computer.

HPC Application


The HPC Application tray is visible if you have the HPC Type Jupyter selected. Currently, JupyterHPC offers three applications:

  • JupyterLab: The well-known Jupyter notebook interface. Allows you to run Jupyter notebooks based on multiple kernels and programming languages. Currently, we support Python, R, Julia, bash and SageMath based notebooks.

  • RStudio: A web version of the popular RStudio IDE developed by Posit PBC. The application comes with a lot of R packages preinstalled.

  • Code IDE: The popular CodeServer IDE, now available directly on the cluster. Supports a large variety of programming languages and extensions for customization.

Desktop Environment

Desktop Environment Selector

The Desktop Environment tray is visible if you have the HPC Type Desktop selected. Currently, JupyterHPC offers three desktop environments: Xfce, GNOME and KDE. All three start a desktop environment session on the HPC Cluster, which is forwarded to your browser via VNC (Virtual Network Computing). The sessions closely emulate a normal Linux desktop computer and are intended for applications that require a graphical user interface (such as MATLAB). The desktop environment will use GPU hardware acceleration for all applications that support it, if GPU is selected in the HPC Device tray.

HPC Device

HPC Device Selector

With the HPC Device selector tray, you can select which node type your JupyterHPC session will use.

  • CPU: You will get a CPU-compute focused node, with a large amount of RAM and a large local NVMe disk available. This should be the default choice in most cases. For HPC Desktops, the CPU-compute focused node will use software rendering, which is fast enough for normal desktop usage in most cases.
  • GPU: You will get a GPU-compute focused node, with an NVIDIA Quadro RTX 5000 GPU available. Please select this only if your job requires a GPU. Because fewer GPUs are available, selecting this unnecessarily could cause a large number of sessions to wait.

Advanced settings

Advanced Settings

Under advanced settings, you can make more precise configurations to your JupyterHPC session. We recommend using the default settings.

  • Job Duration: The time limit of your session. After the time limit is up, your session will be terminated and you will have to start a new one. Please remember that shorter time limits have a higher priority for the scheduler and may start sooner.
  • Number of CPU Cores / Memory: Select the amount of CPU Cores and main memory (RAM) for your session.
  • Reservation: If you have a Slurm reservation on the cluster that is authorized for JupyterHPC, you can enter it here.
  • Home directory: The directory your JupyterHPC session starts in.
  • Custom Container location: With this setting, you can use your own container within the JupyterHPC infrastructure, provided the required packages are installed in it. Please refer to Create your own container for more information.

Subsections of JupyterHub

Creating your own JupyterHPC container

Users with more complex requirements can build their own Apptainer containers and load them in JupyterHPC to be able to access them on the web.

The following basic container recipe can be used as a starting point. It is important to keep the version of the jupyterhub package close to the one that is currently used on the hub (4.1.5).

Bootstrap: docker
From: condaforge/miniforge3

%post
    export DEBIAN_FRONTEND=noninteractive
    apt update
    apt upgrade -y

    conda install --quiet --yes \
        'ipyparallel' \
        'notebook=7.2.1' \
        'jupyterhub=4.1.5' \
        'jupyterlab=4.2.3'

    # Here you can add your own python packages
    conda install --quiet --yes \
        pandas \
        scipy \
        seaborn

A jupyterhub-singleuser binary that is compatible with the hub must be in the $PATH of the container for a successful startup. Otherwise you can extend the container as much as you want.
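As a quick sanity check, the following hedged Python sketch (not part of the official recipe) can be run inside the container to confirm that a jupyterhub-singleuser binary is on the PATH and to see its version:

import shutil
import subprocess

# Look up jupyterhub-singleuser on the PATH inside the container
path = shutil.which("jupyterhub-singleuser")
print("jupyterhub-singleuser found at:", path)

# If present, print its version so it can be compared with the hub's version (4.1.5)
if path:
    print(subprocess.run([path, "--version"], capture_output=True, text=True).stdout)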

A more complex example that includes RStudio integration as well as PyTorch is shown below:

Bootstrap: docker
From: condaforge/miniforge3

%post
    export DEBIAN_FRONTEND=noninteractive 
    apt update 
    apt upgrade -y

    apt install -y --no-install-recommends software-properties-common dirmngr
    wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
    add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
    add-apt-repository ppa:c2d4u.team/c2d4u4.0+
    apt update

    apt install -y \
        r-base \
        r-cran-caret \
        r-cran-crayon \
        r-cran-devtools \
        r-cran-forecast \
        r-cran-hexbin \
        r-cran-htmltools \
        r-cran-htmlwidgets \
        r-cran-plyr \
        r-cran-randomforest \
        r-cran-rcurl \
        r-cran-reshape2 \
        r-cran-rmarkdown \
        r-cran-rodbc \
        r-cran-rsqlite \
        r-cran-shiny \
        r-cran-tidyverse 

    apt install -y libclang-dev lsb-release psmisc sudo
    ubuntu_release=$(lsb_release --codename --short) 
    wget https://download2.rstudio.org/server/${ubuntu_release}/amd64/rstudio-server-2023.12.1-402-amd64.deb
    dpkg --install rstudio-server-2023.12.1-402-amd64.deb
    rm rstudio-server-2023.12.1-402-amd64.deb

    echo 'ftp_proxy=http://www-cache.gwdg.de:3128' >> /usr/lib/R/etc/Renviron.site
    echo 'https_proxy=http://www-cache.gwdg.de:3128' >> /usr/lib/R/etc/Renviron.site
    echo 'http_proxy=http://www-cache.gwdg.de:3128' >> /usr/lib/R/etc/Renviron.site

    echo '' >> /usr/lib/R/etc/Renviron.site
    conda install --quiet --yes \
        'ipyparallel=8.8.0' \
        'jupyter-rsession-proxy=2.2.1' \
        'notebook=7.2.1' \
        'jupyterhub=4.1.5' \
        'jupyterlab=4.2.3'
   
    conda install --quiet --yes \
        dgl \
        igraph \
        keras \
        pandas \
        pydot \
        scikit-learn \
        scipy \
        seaborn 

For HPC Desktops, even the smallest example would consist of over 100 lines, so it is not reproduced here. Please take your inspiration from the original recipes under /sw/viz/jupyterhub-nhr/jupyter-containers/.

If you want to access Slurm from within your JupyterHPC container, please follow the build instructions here: Apptainer. The required bindings will be automatically added by the JupyterHPC system.

If you need more resources than provided by the Jupyter frontend, you will have to start an interactive job, start a container within it, and connect to it with an SSH tunnel. See our advanced Apptainer instructions for more information on this.

Neuromorphic Computing

Neuromorphic computing is an alternative way of computing, centered around the concept of the spiking neuron, inspired by the way biological neurons work. It can be used not only to perform simulations of nervous tissue, but also to solve constraint and graph optimization problems, run network simulations, process signals in real time, and perform various AI/ML tasks. Additionally, it is known to require lower energy consumption when compared to more traditional algorithms and computing architectures. For more information, please read the article in the January/February 2024 issue of GWDG News.

We host the neuromorphic computing platform SpiNNaker-2 as part of the Future Technology Platform. We also offer a suite of neuromorphic computing tools and libraries that can be run on standard CPU architectures. These can be used to learn about neuromorphic computing without the need for dedicated hardware, test and develop new neuromorphic algorithms, and even directly control neuromorphic hardware.

Subsections of Neuromorphic Computing

Neuromorphic Computing Tools and Libraries

To facilitate learning about neuromorphic computing and the development of new neuromorphic algorithms, we host a series of standard neuromorphic libraries and tools as a container. This container is located at /sw/container/neuromorphic/nmc-tools-and-libs.sif, and can be accessed through our JupyterHub as a custom container, or directly as a container on the cluster through Apptainer.

Here is a list and short description of the available tools. If you need new or updated tools, please contact us through our support channels.

| Tool/Library | Description | Python based? |
|---|---|---|
| PyNN | PyNN is an API. It works as a frontend to set up and configure neuron networks, while the actual simulation is performed by a simulator in the backend. Bindings for many software simulators are available, such as Neuron, NEST, and Brian2. Bindings for SpiNNaker-2 are currently a WIP. PyNN is the best starting point for learning about neuromorphic simulations, due to its simplicity and universality. In principle, networks designed with PyNN can be executed by any of the available simulators. | Yes |
| Neuron | Neuron is a simulator for spiking neuron networks. Compatible with PyNN. | No (but can be used from Python through PyNN) |
| NEST | NEST is a simulator for spiking neuron networks. Compatible with PyNN. | No (but can be used from Python through PyNN) |
| Brian2 | Brian2 is a simulator for spiking neuron networks. Compatible with PyNN. (It is indeed called Brian, and not Brain!) | Yes |
| SNNTorch | A package for performing AI/ML tasks with spiking neurons, extending the well-known PyTorch library. | Yes |
| NIR | An intermediate representation library that allows for transferring models between different neuromorphic software libraries and hardware architectures. | Yes |
| Norse | Neuromorphic AI/ML library. | Yes |
| Lava-DL | Neuromorphic AI/ML library. Associated with Intel’s Loihi neuromorphic hardware. | Yes |
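As a starting point with PyNN, here is a minimal, hedged sketch (it assumes the NEST backend from the container; the set of installed backends may differ). It illustrates the frontend/backend split described above: the same script can target another simulator simply by changing the import.

import pyNN.nest as sim  # e.g. pyNN.brian2 or pyNN.neuron, depending on what is installed

sim.setup(timestep=0.1)  # simulation resolution in ms

# A spike source driving a single integrate-and-fire neuron
stim = sim.Population(1, sim.SpikeSourceArray(spike_times=[5.0, 15.0, 25.0]))
neuron = sim.Population(1, sim.IF_curr_exp(), label="lif")
neuron.record(["spikes", "v"])

# One-to-one excitatory connection from the stimulus to the neuron
proj = sim.Projection(stim, neuron, sim.OneToOneConnector(),
                      synapse_type=sim.StaticSynapse(weight=0.5, delay=1.0))

sim.run(50.0)  # run for 50 ms

# Retrieve the recorded spikes and membrane voltage (Neo data structures)
data = neuron.get_data().segments[0]
print(data.spiketrains)
print(data.filter(name="v")[0])

sim.end()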

SpiNNaker

Introduction

SpiNNaker is the neuromorphic computing architecture from SpiNNcloud. The SpiNNaker architecture has been in development for a couple of decades, with different iterations. We currently host the recently developed SpiNNaker-2 version, and will eventually provide access to 4 boards.

SpiNNaker can be used to run typical spiking neuron neuromorphic simulations. This includes simulations of nervous tissue, AI/ML algorithms, graph, constraint, network and optimization problems, signal processing, and control loops. Additionally, due to its flexible ARM-based architecture, it can be made to run other non-neuromorphic simulations (for example, solving distributed systems of partial differential equations).

Finally, due to its sparse and efficient message-passing architecture and low-power ARM-based cores, the hardware exhibits low power consumption compared with similar, more traditional configurations and algorithms.

For more information and reference material, please consult this article in the January/February 2024 issue of GWDG News.

Image of a pair of SpiNNaker-2 boards together with their control sister boards.

SpiNNaker boards

Architecture

The hardware architecture of SpiNNaker-2 is naturally quite complex, but it can be reduced to two main components relevant to the average end user:

  • An efficient message passing network with various network-on-a-chip level components. This is vital for efficient communication and routing between neurons that might reside in different chips or even different boards. Messages can even account for delay incurred during message transmission.
  • A large number of low-powered but efficient ARM cores, organized in a hierarchical, tiered configuration. It is important to understand this setup, because it can affect the structure and efficiency of your network. Each core has a limited number of neurons and synapses that it can fit (this number changes as models become more complex and require more memory and compute cycles). Naturally, communication is more efficient within a core than across cores in different boards. The following image shows the distribution of cores up to the multi-board level. Pay particular attention to the PE and QPE terminology ([Quad] Processing Element).
Schematic of the organization of SpiNNaker's core/processing element architecture.

SpiNNaker core architecture

How to Access

Access to SpiNNaker is currently possible through our Future Technology Platform (FTP). For this, an account is required (a regular GWDG account, or an AcademicCloud ID for external users), which then needs to be explicitly enabled to access the FTP. For more information, check our documentation on getting an account.

Access requests currently run through KISSKI (researchers and companies) and NHR (researchers at universities only). Please consult their documentation and, where applicable, request a project to test and utilize SpiNNaker. We also highly recommend that you try to run some neuromorphic simulations with PyNN or other neuromorphic libraries first. If you have questions, you can also reach out through one of our support channels.

How to Use

The easiest way of approaching SpiNNaker simulations at the moment is through its Python library. This library is available through an open repository from SpiNNcloud. It is also provided on the FTP as an Apptainer container, which also includes the necessary device compilers targeting the hardware. The py-spinnaker2 library syntax is very similar to PyNN’s (and a PyNN integration is planned). For this reason, we recommend learning how to set up simulations with PyNN first, and testing your algorithms on PyNN plus a regular CPU simulator such as Neuron or NEST. See our page on neuromorphic tools for more information.

Container

The container is available at /sw/containers/spinnaker, from the FTP system only. You can start it up and use it like any other container. See the Apptainer page for more information. The relevant libraries are found in /usr/spinnaker inside the container. The container also includes some neuromorphic libraries such as PyNN, for convenience. Others can be included upon request. See also the neuromorphic tools container if you want to test some non-SpiNNaker libraries.

How a SpiNNaker simulation works

SpiNNaker functions as a heterogeneous accelerator. Working with it is conceptually similar to working with a GPU: the device has its own memory address space, and work proceeds in a batched approach. A simulation looks roughly like this:

  • Set up your neurons, networks, models, and any variables to be recorded.
  • Issue a Python command that submits your code to the device.
  • Your code gets compiled and uploaded to the board.
  • Board runs the simulation.
  • Once the simulation ends, you can retrieve the data stored in the board's memory.
  • You can continue working with the results returned by the board.

Notice that you don’t have direct control of the board while it is running the simulation (although some callback functions may be available).

Communicating with the board

Currently, the hardware can be accessed from the login node of the FTP. This is subject to change as our installation of the SpiNNaker hardware and the FTP itself evolve; the boards will then only be accessible from dedicated compute nodes that require going through a scheduler.

Tip

At present, you need to know the address of the board to communicate with it. This will be provided once you gain access to the FTP system.

py-spinnaker2: Code and Examples

You can find the py-spinnaker2 folder in /usr/spinnaker/py-spinnaker2 in the container. Examples are provided in the examples folder. Particularly useful are the folders snn and snn_profiling. Python code for the higher-level routines is available in /src.

Example 1: A minimal network

This example, taken from the basic network examples in the py-spinnaker2 repository, sets up a network with just 3 neurons: 2 input “neurons” that are simple spike sources (they emit spikes at fixed, given times), one excitatory and one inhibitory, and a single LIF (leaky integrate-and-fire) neuron whose voltage we track.

from spinnaker2 import hardware, snn

# Input Population
## create stimulus population with 2 spike sources
input_spikes = {0: [1, 4, 9, 11], 1: [20, 30] } # will spike at the given times
stim = snn.Population(size=2, neuron_model="spike_list", params=input_spikes, name="stim")

# Core Population
## create LIF population with 1 neuron
neuron_params = {
    "threshold": 10.0,
    "alpha_decay": 0.9,
    "i_offset": 0.0,
    "v_reset": 0.0,
    "reset": "reset_by_subtraction",
}

pop1 = snn.Population(size=1, neuron_model="lif",
                    params=neuron_params,
                    name="pop1", record=["spikes", "v"])

# Projection: Connect both populations
## each connection has 4 entries: [pre_index, post_index, weight, delay]
## for connections to a `lif` population:
##  - weight: integer in range [-15, 15]
##  - delay: integer in range [0, 7]. Actual delay on the hardware is: delay+1
conns = []
conns.append([0, 0, 4, 1])  # excitatory synapse with weight 4 and delay 1
conns.append([1, 0, -3, 2])  # inhibitory synapse with weight -3 and delay 2

proj = snn.Projection(pre=stim, post=pop1, connections=conns)

# Network
## create a network and add population and projections
net = snn.Network("my network")
net.add(stim, pop1, proj)

# Hardware
## select hardware and run network
hw = hardware.SpiNNaker2Chip(eth_ip="boardIPaddress")
timesteps = 50
hw.run(net, timesteps)

# Results
## get_spikes() returns a dictionary with:
##  - keys: neuron indices
##  - values: lists of spike times per neurons
spike_times = pop1.get_spikes()

## get_voltages() returns a dictionary with:
##  - keys: neuron indices
##  - values: numpy arrays with 1 float value per timestep per neuron
voltages = pop1.get_voltages()
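For a quick look at the recorded data, a hedged plotting sketch (assuming matplotlib is available in your environment) could be appended to the script:

import matplotlib.pyplot as plt

# Plot the membrane voltage of the single LIF neuron (index 0) over the 50 timesteps
plt.plot(voltages[0])
plt.xlabel("timestep")
plt.ylabel("voltage")
plt.title("LIF neuron membrane voltage")
plt.savefig("lif_voltage.png")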

The resulting LIF neuron voltage dynamic looks like:

2 input spike trains plus a LIF neuron simulation.

Neuron voltage

Running this code results in quite a bit of SpiNNaker-specific output, which can tell you some details about your simulation. This is the output in this case, with some comments:

Used 2 out of 152 cores # 152 PEs per board
Sim duration: 0.05 s # As requested, 50 timesteps at 1 ms each
Actual duration (incl. offset): 0.060000000000000005
INFO:     open and bind UDP socket
INFO:     Running test ...
configure hardware
There are 1QPEs
Enabled QPEs.
INFO:     start random bus...
Enabled random bus
# Loading program into (Q)PEs
INFO:     QPE (1,1) PE 0): loading memory from file /s2-sim2lab-app/chip/app-pe/s2app/input_spikes_with_routing/binaries/s2app_arm.mem...
INFO:     QPE (1,1) PE 1): loading memory from file /s2-sim2lab-app/chip/app-pe/s2app/lif_neuron/binaries/s2app_arm.mem...
Loaded mem-files
Data written to SRAM
PEs started
Sending regfile interrupts for synchronous start
Run experiment for 0.06 seconds
Experiment done
# Reading back results from board's memory
Going to read 4000 from address 0xf049b00c (key: PE_36_log)
Going to read 4000 from address 0xf04bb00c (key: PE_37_log)
Going to read 500 from address 0xf04b00d0 (key: PE_37_spike_record)
Going to read 100 from address 0xf04b08a0 (key: PE_37_voltage_record)
Results read
interface free again
Debug size: 48
n_entries: 12
Log  PE(1, 1, 0)
magic = ad130ad6, version = 1.0
sr addr:36976

Debug size: 604
n_entries: 151
Log  PE(1, 1, 1)
magic = ad130ad6, version = 1.0
pop_table_info addr: 0x8838
pop_table_info value: 0x10080
pop_table addr: 0x10080
pop_table_info.address: 0x10090
pop_table_info.length: 2
pop table address 0: 
pop table address 1: 
pop table address 2: 
== Population table (2 entries) at address 0x10090 ==
Entry 0: key=0, mask=0xffffffff, address=0x100b0, row_length=4
Entry 1: key=1, mask=0xffffffff, address=0x100c0, row_length=4
global params addr: 0x10060
n_used_neurons: 1
record_spikes: 1
record_v: 1
record_time_done: 0
profiling: 0
reset_by_subtraction: 1
spike_record_addr: 0x8854
Neuron 0 spiked at time 13

Read spike record
read_spikes(n_neurons=1, time_steps=50, max_atoms=250)
Read voltage record
read_voltages(n_neurons=1, time_steps=50)
Duration recording: 0.0030603408813476562
{0: [13]}
{0: array([ 0.        ,  0.        ,  0.        ,  4.        ,  3.6   ,
        3.2399998 ,  6.916     ,  6.2243996 ,  5.601959  ,  5.0417633 ,
        (...)
       -1.1239064 , -1.0115157 , -0.91036415, -0.8193277 , -0.7373949 ],
      dtype=float32)}
Tip

Notice that the simulation also generates a pair of human-readable files in JSON format (spec.json and results.json). These contain the data retrieved from the SpiNNaker board, including for example neuron voltages, and can be used for post-processing, without needing to rerun the whole calculation.
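A minimal post-processing sketch could look like the following (the exact structure of results.json depends on what was recorded in your simulation, so inspect the keys first):

import json

# Load the results written next to the simulation run
with open("results.json") as f:
    results = json.load(f)

# Inspect which recorded quantities are available before processing them further
print(list(results.keys()))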

Example 2: Testing limits

The SpiNNaker library raises a number of exceptions that you can catch. The following example shows how to test the maximum number of synapses/connections between 2 populations that can fit on SpiNNaker:

from spinnaker2 import hardware, snn


def run_model(pops=2, nneurons=50, timesteps=0):
    # Create population
    neuron_params = {
        "threshold": 100.0,
        "alpha_decay": 0.9,
    }

    pops_and_projs=[]
    for ipop in range(pops):
        pop=snn.Population(size=nneurons, neuron_model="lif", params=neuron_params, name="pop{}".format(ipop))
        # Limit the number of neurons of this population in a given PE/core
        # pop.set_max_atoms_per_core(50) #tradeoff between neurons per core and synapses per population
        pops_and_projs.append(pop)

    # Set up synapses
    w = 2.0  # weight
    d = 1  # delay
    ## All to all connections
    conns = []
    for i in range(nneurons):
        for j in range(nneurons):
            conns.append([i, j, w, d])

    for ipop in range(pops-1):
        proj=snn.Projection(pre=pops_and_projs[ipop], post=pops_and_projs[ipop+1], connections=conns)
        pops_and_projs.append(proj) #adding the projections to the pop list too for convenience

    # Put everything into network
    net = snn.Network("my network")
    net.add(*pops_and_projs) #unpack list

    # Dry run with mapping_only set to True, we don't even need board IP
    hw = hardware.SpiNNaker2Chip()
    hw.run(net, timesteps, mapping_only=True)


def sweep(pops=2, nneurons=[1, 10, 50, 100, 140, 145, 150, 200, 250]):
    # max number of pre neurons with 250 post neurons
    timesteps = 0
    best_nneurons = 0
    for nn in nneurons:
        try:
            run_model(pops=pops, nneurons=nn, timesteps=timesteps)
            best_nneurons = nn
        except MemoryError:
            print(f"Could not map network with {nn} neurons and {pops} populations")
            break
    max_synapses = best_nneurons**2
    max_total_synapses = max_synapses*(pops-1) # 1 projection for 2 pops, 2 projections for 3 pops, etc
    return [best_nneurons, max_synapses, max_total_synapses]

pops=10
max_synapses = sweep(pops=pops)
print("Testing max size of {} populations of N LIF neurons, all to all connections (pop0 to pop1, pop1 to pop2, etc.)".format(pops))
print("Max pre neurons:             {}".format(max_synapses[0]))
print("Max synapses between 2 pops: {}".format(max_synapses[1]))
print("Max synapses total network : {}".format(max_synapses[2]))

When you run this, notice that:

  • The number of PEs used depends on the number of populations created.
  • These populations are connected with all-to-all synapses, whose number scales very rapidly with population size. The maximum number of synapses that will fit if a population is restricted to 1 core is around 10-20k.
  • You can change the number of neurons of a population that are inside a core by uncommenting the pop.set_max_atoms_per_core(50) line. There is a tradeoff between the number of cores/PEs used for the simulation, and the maximum number of synapses that can be mapped.
  • From these results, it is better to run simulations with sparse connections, or islands of densely connected populations that are sparsely connected to one another. Large dense networks are also possible, but with tradeoffs regarding the number of cores utilized (and correspondingly increased communication times, power consumption, etc.).

Dry Run and other interesting options

You can do a dry run without submitting a simulation to the board by using the mapping_only option:

hw = hardware.SpiNNaker2Chip() # Notice: No board IP required.
hw.run(net, timesteps, mapping_only=True)

This still lets you test the correctness of your code and even some py-spinnaker2 routines. See, for example, the code in the snn_profiling folder in the py-spinnaker2 examples folder. Other interesting options of the hardware object:

- mapping_only (bool): run only mapping and no experiment on hardware. Default: False
- debug (bool): read and print debug information of PEs. Default: True
- iterative_mapping (bool): use a greedy iterative mapping to automatically solve MemoryError (maximizing the number of neurons per core). Default: False

Currently Known Problems

If you can’t seem to import the py-spinnaker2 library in Python, it is possible that your environment is using the wrong Python interpreter. Make sure that your PATH variable is not pointing to some paths outside of the container (for example, some conda installation in your home folder).

External Documentation

Documentation directly from SpiNNcloud:

Current limitations

At the moment, only the Python library py-spinnaker2 is available. In the future, users should also gain access to the lower-level C code defining the various neurons, synapses, and other simulation elements, so they can create custom code to be run on SpiNNaker.

There are presently some software-based limitations on the number of SpiNNaker cores that can be used in a simulation, on setting up dynamic connections in a network, and possibly others. These capabilities will become available as the SpiNNaker software is further developed. Keep an eye on updates to the code repositories.

Protein-AI

Info

This service will be available soon.

Experience cutting-edge protein structure prediction with our AI-powered service. Utilizing High-Performance Computing (HPC) and AlphaFold2/Boltz models, our platform offers swift, accurate, and reliable automatic structure predictions for diverse proteins. Enhanced with MMseqs2, the service rivals traditional methods. Accessible for free via the KISSKI platform, it caters to various prediction tasks, making advanced protein structure prediction available to all researchers and scientists.

Key Features of Protein AI

  • Speed: Uses MMseqs2 for rapid MSA generation
  • Accuracy: Accurate structure prediction for single protein sequences
  • User-Friendly: Simple web-based interface

To use the service:

  • Input the protein sequence in FASTA format.
  • The predicted structure will be displayed in the result box and is available for download.

This service will transform structural biology research, making protein structure prediction more accessible and efficient for researchers across various fields.

Ensuring Privacy and Flexibility

Security is essential when dealing with potentially sensitive biological data, as it provides reliability, demonstrates compliance during audits or regulatory inspections, and ensures research integrity. User privacy is a cornerstone of this service. We record only the number of requests per user and associated timestamps, thereby ensuring that all protein sequences and predicted structures remain private.

Web interface and usage

If you have an AcademicCloud account, the Protein AI web interface can be easily accessed here.

Web Interface Example

The web interface provides the following built-in controls:

Single Protein Structure Prediction:

  • Input sequence: Paste your protein sequence in FASTA format.
  • Run submit: Click to start the prediction process.
  • Download results: Download the output files.
  • Show results: View the predicted structure.
  • Light/Dark mode (sun/moon button): Toggle between light and dark mode.
  • Footer: Includes “Privacy Policy”, “Terms of use”, “FAQ”, Contact, and the option to switch between English and German.

Subsections of Protein-AI

Protein-AI FAQ

Data Privacy

Are my protein sequences or usage data used for AI training or similar purposes?

No, your protein sequences and prediction data are not used to train any AI models.

Are my sequences and predicted structures stored on your servers at any stage?

User protein sequences and predicted structures are stored temporarily (for 30 days) on the GWDG server. During this period, only the respective user has access to their data. At no point do we access it on our servers without user permission.

What data does Protein AI keep when I use the service?

We do keep protein sequences and predicted structures on our GWDG server for 30 days. We record some usage statistics to monitor the load on our service and improve the user experience. This includes usernames, timestamps, and the services that were requested.

Availability

My institution is interested in using Protein AI. Can we advertise it to our users? Would you be able to handle an additional load for XXX users?

For large institutions, please contact us directly at info@kisski.de.

Are Protein AI services free?

Protein AI services that are accessible to a user with an AcademicCloud account are free of charge.

Data Privacy Notice

Note that this document is provided to support English-speaking users; the legally binding document is the German version.

Data Processor

The responsible party for data processing within the meaning of Art. 4 No. 7 GDPR and other national data protection laws of the member states as well as other data protection regulations is the:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Germany
Tel: +49 (0) 551 39-30001
E-mail: support@gwdg.de
Website: www.gwdg.de

Represented by the managing director. The controller is the natural or legal person who alone or jointly with others determines the purposes and means of the processing of personal data.

Contact person / Data protection officer

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Germany
Phone: +49 (0) 551 39-30001
E-mail: support@gwdg.de

General information on data processing

Overview of the service

The Protein AI service consists of several components, particularly a web frontend and alphafold2/boltz models in the backend. The frontend provides users with a web interface to directly enter query sequences via a browser. Additionally, users can select their desired model and adjust certain settings. The frontend forwards all requests to the selected model backend. The backend is hosted via the GWDG platform, which receives all requests and forwards them to the appropriate model.

Scope of the processing of personal data

We only process our users’ personal data to the extent necessary to provide a functional website and our content and services. The processing of personal data of our users takes place regularly only with the consent of the user (Art. 6 para. 1 lit. a GDPR). An exception applies in cases where prior consent cannot be obtained for factual reasons and the processing of the data is permitted by law.

Insofar as we obtain consent from the data subject for processing operations involving personal data, Article 6 (1) lit. (a) of the EU General Data Protection Regulation (GDPR) is the legal basis for personal data processing.

When processing personal data that is necessary for the performance of a contract to which the data subject is party, Article 6 (1) lit. (b) GDPR is the legal basis. This also applies to processing operations that are necessary for the performance of pre-contractual measures.

Insofar as the processing of personal data is necessary for compliance with a legal obligation to which our company is subject, Article 6 (1) lit. (c) GDPR is the legal basis.

Where the processing of personal data is necessary in order to protect the vital interests of the data subject or another natural person, the legal basis is Article 6 (1) lit. (d) GDPR.

If the processing is necessary to protect a legitimate interest of our company or a third party and the interests, fundamental rights and freedoms of the data subject do not outweigh the first-mentioned interest, Article 6 (1) lit. (f) GDPR is the legal basis for the processing.

Use of the Protein-AI website (frontend)

Description and scope of data processing

Each time https://protein-ai.academiccloud.de/ is accessed, the system automatically collects data and information from the computer system of the accessing computer. The following data is collected in each case:

  • Date of access
  • Name of the operating system installed on the accessing device
  • Name of the browser used
  • Source system via which the access was made
  • The IP address of the accessing device

The data is also stored in the log files of our system. This data is not stored together with other personal data of the user. After a user’s query sequences have been processed by the backend model, they are deleted, and only the output results are kept on GWDG’s secure data mover node for 30 days.

General use of models

Description and scope of data processing

For billing purposes, the following data is stored and logged on the GWDG server for each request:

  • Date of the request
  • user ID
  • Time of the GPU use

This data is also stored in the log files of our system. This data is not stored together with other personal data of the user. No liability can be accepted for the automatically generated results. Results may be completely incorrect or contain incorrect partial information.

Duration of storage

The billing data is stored for one year.

Use of self-hosted models

Description, Duration of storage, and scope of data processing

In order to use the models hosted by the GWDG, the user’s inputs/sequences are processed on GWDG’s systems. Protecting the privacy of user requests is of fundamental importance to us. For this reason, our service in combination with the self-hosted models does not store the inputs/sequences of the requests, and the output is kept on the GWDG data mover node for 30 days (so users have ample time to download the results). Note on Output Data: While the output (e.g., predicted protein structure) is not considered personal data under GDPR, it may contain sensitive biological information. Users are responsible for ensuring compliance with applicable laws when using or sharing results.

Rights of data subjects

You have various rights with regard to the processing of your personal data. We list them in the following, but there are also references to the articles (GDPR) and/or paragraphs (BDSG (2018)) which provide even more detailed information.

Right to immediate erasure / “Right to be forgotten” / Right to restriction of processing (Article 17/18 GDPR; § 35 BDSG)

You have the right to request the immediate erasure of your personal data from the controller. As an alternative, you may request that the controller restrict processing, whereby restrictions are referred to in the GDPR/BDSG under the articles and/or sections mentioned.

Right to data portability (Article 20 GDPR)

You have the right to receive the personal data concerning you, which you have provided to the controller, in a structured, commonly used and machine-readable format. In addition to the scenarios presented in and provisions of the GDPR, it must be noted that portability of mass data / user data is limited to technical readability. The right to data portability does not require the controller to convert data created by the user in a proprietary format into a commonly used, i.e. standardised, format.

Right to complain to a supervisory authority (Article 77 GDPR)

Without prejudice to any other administrative or judicial remedy, you have the right to lodge a complaint with a supervisory authority, in particular in the Member State of your habitual residence, place of work or place of the alleged infringement if you consider that the processing of personal data relating to you infringes the GDPR.

Datenschutzhinweis

Hinweis: Dieses Dokument wird für englischsprachige Nutzer zur Verfügung gestellt. Das rechtlich verbindliche Dokument ist die deutsche Fassung.

Verantwortlicher für die Datenverarbeitung

Der Verantwortliche im Sinne von Art. 4 Nr. 7 DSGVO und weiterer nationaler Datenschutzgesetze der Mitgliedstaaten sowie sonstiger datenschutzrechtlicher Bestimmungen ist:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de
Website: www.gwdg.de

Vertreten durch den Geschäftsführer.
Der Verantwortliche ist die natürliche oder juristische Person, die allein oder gemeinsam mit anderen über die Zwecke und Mittel der Verarbeitung personenbezogener Daten entscheidet.

Ansprechpartner / Datenschutzbeauftragter

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de

Allgemeine Informationen zur Datenverarbeitung

Übersicht über den Dienst

Der Protein-AI-Dienst besteht aus mehreren Komponenten, insbesondere einem Web-Frontend sowie den Backend-Modellen alphafold2/boltz. Das Frontend bietet den Nutzern eine Weboberfläche, um direkt Abfragesequenzen über einen Browser einzugeben. Zudem können Nutzer das gewünschte Modell auswählen und bestimmte Einstellungen anpassen.
Das Frontend leitet alle Anfragen an das ausgewählte Backend-Modell weiter.
Das Backend wird über die GWDG-Plattform betrieben, die sämtliche Anfragen empfängt und an das entsprechende Modell weiterleitet.

Umfang der Verarbeitung personenbezogener Daten

Wir verarbeiten personenbezogene Daten unserer Nutzer nur insoweit, wie dies zur Bereitstellung einer funktionsfähigen Website sowie unserer Inhalte und Dienste erforderlich ist.
Die Verarbeitung personenbezogener Daten unserer Nutzer erfolgt regelmäßig nur mit deren Einwilligung (Art. 6 Abs. 1 lit. a DSGVO). Eine Ausnahme gilt in Fällen, in denen eine vorherige Einholung der Einwilligung aus tatsächlichen Gründen nicht möglich ist und die Verarbeitung der Daten gesetzlich erlaubt ist.

Rechtsgrundlage für die Verarbeitung personenbezogener Daten

Erteilen betroffene Personen ihre Einwilligung zu Verarbeitungsvorgängen personenbezogener Daten, so dient Art. 6 Abs. 1 lit. a DSGVO als Rechtsgrundlage.

Ist die Verarbeitung personenbezogener Daten zur Erfüllung eines Vertrags erforderlich, dessen Vertragspartei die betroffene Person ist, so dient Art. 6 Abs. 1 lit. b DSGVO als Rechtsgrundlage. Dies gilt auch für Verarbeitungsvorgänge, die zur Durchführung vorvertraglicher Maßnahmen erforderlich sind.

Ist die Verarbeitung zur Erfüllung einer rechtlichen Verpflichtung erforderlich, der unser Unternehmen unterliegt, so dient Art. 6 Abs. 1 lit. c DSGVO als Rechtsgrundlage.

Ist die Verarbeitung erforderlich, um lebenswichtige Interessen der betroffenen Person oder einer anderen natürlichen Person zu schützen, so ist Art. 6 Abs. 1 lit. d DSGVO die Rechtsgrundlage.

Ist die Verarbeitung zur Wahrung eines berechtigten Interesses unseres Unternehmens oder eines Dritten erforderlich und überwiegen nicht die Interessen oder Grundrechte und Grundfreiheiten der betroffenen Person, so dient Art. 6 Abs. 1 lit. f DSGVO als Rechtsgrundlage.

Nutzung der Protein-AI-Website (Frontend)

Beschreibung und Umfang der Datenverarbeitung

Bei jedem Zugriff auf https://protein-ai.academiccloud.de/ erhebt das System automatisch Daten und Informationen vom Computersystem des zugreifenden Geräts.
Folgende Daten werden jeweils erhoben:

  • Datum des Zugriffs
  • Name des auf dem zugreifenden Gerät installierten Betriebssystems
  • Name des verwendeten Browsers
  • Quellsystem, über das der Zugriff erfolgt ist
  • IP-Adresse des zugreifenden Geräts

Die Daten werden zudem in den Logdateien unseres Systems gespeichert. Eine Speicherung zusammen mit anderen personenbezogenen Daten des Nutzers erfolgt nicht.
Nach der Verarbeitung der vom Nutzer eingegebenen Sequenzen durch das Backend-Modell werden diese gelöscht; es wird lediglich das Ausgaberesultat auf dem gesicherten „Data Mover“-Knoten der GWDG für 30 Tage aufbewahrt.

Allgemeine Nutzung der Modelle

Beschreibung und Umfang der Datenverarbeitung

Zu Abrechnungszwecken werden auf dem GWDG-Server für jede Anfrage folgende Daten gespeichert und protokolliert:

  • Datum der Anfrage
  • Nutzer-ID
  • Dauer der GPU-Nutzung

Diese Daten werden auch in den Logdateien unseres Systems gespeichert. Eine Speicherung zusammen mit anderen personenbezogenen Daten des Nutzers erfolgt nicht.
Für automatisch erstellte Ergebnisse kann keine Haftung übernommen werden. Ergebnisse können vollständig oder teilweise fehlerhaft sein.

Dauer der Speicherung

Die Abrechnungsdaten werden für ein Jahr gespeichert.

Nutzung selbst gehosteter Modelle

Beschreibung, Dauer der Speicherung und Umfang der Datenverarbeitung

Zur Nutzung der von der GWDG gehosteten Modelle werden die vom Nutzer übermittelten Eingaben/Sequenzen auf den Systemen der GWDG verarbeitet. Der Schutz der Privatsphäre der Nutzerdaten ist für uns von grundlegender Bedeutung.
Daher speichert unser Dienst in Kombination mit den selbst gehosteten Modellen die Eingaben/Sequenzen der Anfragen nicht. Die Ausgabe wird auf dem GWDG „Data Mover“-Knoten für 30 Tage gespeichert (damit der Nutzer ausreichend Zeit hat, die Ergebnisse herunterzuladen).

Hinweis zu Ausgabedaten: Während die Ausgaben (z. B. vorhergesagte Proteinstruktur) nach DSGVO nicht als personenbezogene Daten gelten, können sie sensible biologische Informationen enthalten. Nutzer sind selbst dafür verantwortlich, bei Verwendung oder Weitergabe der Ergebnisse die geltenden Gesetze einzuhalten.

Rechte der betroffenen Personen

Sie haben verschiedene Rechte in Bezug auf die Verarbeitung Ihrer personenbezogenen Daten. Diese werden im Folgenden aufgeführt. Weiterführende Informationen finden Sie in den entsprechenden Artikeln der DSGVO und/oder Paragraphen des BDSG (2018).

Recht auf Löschung / „Recht auf Vergessenwerden“ / Recht auf Einschränkung der Verarbeitung (Art. 17/18 DSGVO; § 35 BDSG)

Sie haben das Recht, vom Verantwortlichen die unverzügliche Löschung Ihrer personenbezogenen Daten zu verlangen. Alternativ können Sie verlangen, dass die Verarbeitung eingeschränkt wird, wie in den genannten Artikeln und Paragraphen ausgeführt.

Recht auf Datenübertragbarkeit (Art. 20 DSGVO)

Sie haben das Recht, die personenbezogenen Daten, die Sie dem Verantwortlichen bereitgestellt haben, in einem strukturierten, gängigen und maschinenlesbaren Format zu erhalten.
Die Übertragbarkeit von Massendaten/Nutzerdaten ist jedoch auf die technische Lesbarkeit beschränkt. Der Verantwortliche ist nicht verpflichtet, vom Nutzer erstellte Daten in ein standardisiertes Format zu konvertieren.

Recht auf Beschwerde bei einer Aufsichtsbehörde (Art. 77 DSGVO)

Unbeschadet eines anderweitigen verwaltungsrechtlichen oder gerichtlichen Rechtsbehelfs haben Sie das Recht, Beschwerde bei einer Aufsichtsbehörde einzulegen, insbesondere im Mitgliedstaat Ihres gewöhnlichen Aufenthaltsorts, Ihres Arbeitsplatzes oder des Ortes der mutmaßlichen Verletzung, wenn Sie der Ansicht sind, dass die Verarbeitung Ihrer personenbezogenen Daten gegen die DSGVO verstößt.

Terms of use

Registration and Access

Access to this service requires an Academic Cloud ID. Having an AcademicCloud ID is subject to acceptance of the “Academic Cloud Terms of Use.” Furthermore, when signing up for an account to use our service, you must provide accurate and thorough information. Sharing your account details or allowing others to use your account is not permitted, and you are responsible for any activities conducted under your account. If you create an account or use this service on behalf of another person or entity, you must have the authority to accept these Terms on their behalf.

Authorized usage

Users are obligated to employ the technology or services solely for authorized and lawful purposes, ensuring compliance with all applicable laws, regulations, and the rights of others, encompassing national, federal, state, local, and international laws.

Development

You recognize that we may be developing or acquiring similar software, technology, or information from other sources. This acknowledgment does not impose limitations on our development or competitive endeavors.

Prohibitions

  1. Users are prohibited from using this service for transmitting, generating, or distributing content (input and output) that:

    • contains confidential or sensitive information;
    • violates privacy laws, including the collection or distribution of personal data without consent;
    • is fraudulent, deceptive, harmful, or misleading;
    • is discriminatory, or promotes violence, hate speech, or illegal activities;
    • encourages self-harm, harassment, bullying, violence, or terrorism;
    • is sexually explicit for non-educational or non-scientific purposes, or involves child sexual exploitation, misrepresentation, deception, or impersonation;
    • promotes illegal activities or infringes on intellectual property rights and other legal and ethical boundaries in online activities;
    • involves any sensitive or controlled data, such as protected health information, personal details, financial records, or research involving sensitive human subjects;
    • attempts to bypass our safety measures or intentionally prompts actions that violate established policies;
    • could unfairly or adversely impact individuals, particularly concerning sensitive or protected characteristics;
  2. Users are prohibited from engaging in activities including:

    • Reverse engineering, decompiling, or disassembling the technology;
    • Unauthorized activities such as spamming, malware distribution, or disruptive behaviors that compromise service quality;
    • Modifying, copying, renting, selling, or distributing our service;
    • Engaging in tracking or monitoring individuals without their explicit consent.

Termination and suspension

You can terminate your use of our Services and your legal relationship with us at any time by stopping the use of the service. If you are an EEA-based consumer, you have the right to withdraw from these Terms within 14 days of acceptance by contacting Support. We reserve the right to suspend or terminate your access to our service or deactivate your account if you violate these Terms or other terms and policies referred to here, if compliance with the law requires it, or if your use poses risks or harm to us, our users, or others. We will give you notice before deactivating your account, unless this is not feasible or not permitted by law. If you believe your account has been suspended or disabled mistakenly, you can appeal by contacting Support. We reserve the right to take legal action to safeguard intellectual property rights and user safety. Civil penalties, damages, administrative fines, criminal charges, or other legal options could be pursued if you violate these terms or engage in illegal activities through this service.

Accuracy

The results generated by our services may not always be unique, entirely correct, or precise. They could contain inaccuracies, even if they seem detailed. Users should not solely depend on these results without verifying their accuracy independently. Moreover, the protein structure prediction provided by our services might not always be complete or accurate. Therefore, users should exercise caution and avoid relying on the service alone for important decisions. It is crucial to understand that AI and machine learning are constantly evolving, and although we strive to improve the accuracy and reliability of our services, users should always assess the accuracy of the results and ensure they meet their specific requirements, verifying them with human input before use or distribution. Additionally, users should refrain from using results related to individuals for purposes that could significantly affect them, such as legal or financial decisions. Lastly, users should be aware that incomplete or inaccurate results may occur, which do not necessarily reflect the views of GWDG or its affiliated parties.

We own all rights, titles, and interests in and to the service. Users are required to uphold copyright and other proprietary notices, preventing the unauthorized distribution or reproduction of copyrighted content. We reserve the right to remove or block any content believed to infringe on copyright and to deactivate the accounts of repeat offenders.

Feedbacks

We appreciate your feedback on our services and products and encourage you to share your thoughts to help us improve them. By providing feedback, you understand that we may disclose, publish, exploit, or use it to enhance our offerings without any obligation to compensate you. We reserve the right to use feedback for any purpose without being restricted by confidentiality obligations, whether it is marked confidential or not.

Privacy

The privacy of user requests is fundamental to us. User protein sequences and predicted structures are stored temporarily (for 30 days) on the GWDG server. During this period, only the respective user has access to their data. At no point do we access it on our servers without user permission. The number of requests for our services per user and the respective timestamps are recorded so we can monitor the system’s usage and perform accounting.

For technical purposes, the following data is collected by the webserver:

  • Date of access
  • Name of the operating system installed on the accessing device
  • Name of the browser used
  • Source system via which the access was made
  • The IP address of the accessing device

The data is also stored in our system’s log files. This data is not stored together with the user’s other personal data. You can find more detailed information regarding data protection in the privacy policy that users shall accept when using this service.

Business Users and Organizations

If the users are businesses or organizations, including public and government agencies, they shall indemnify GWDG and its affiliates for all claims and legal actions arising from their misuse of this service or violation of these terms. This indemnification encompasses all associated costs and liabilities, such as claims, losses, damages, judgments, fines, litigation expenses, and legal fees. In this case, GWDG will inform the other party in writing upon becoming aware of any claim and leave the defense at businesses or organizations or will cooperate reasonably in the defense or investigation of the claim.

Quantum Computing Simulators

The capabilities of quantum computing are expanding day by day, making it possible to execute computational tasks such as:

  • Simulating quantum systems (e.g., protein folding, molecular dynamics, and so on).
  • Optimization problems (e.g., traveling salesman, maximum cut, Grover’s search, and so on).
  • Cryptography (e.g., network security).
  • Machine learning (e.g., classifiers).

Quantum simulators are a unique tool for understanding the capability and diversity of quantum computers. They help us understand the logic behind quantum computing and its various applications, and how to operate quantum systems and integrate them into our skill set.

Info

As we are in the Noisy Intermediate Scale Quantum (NISQ) era of quantum computing, many quantum computing algorithms and procedures exhibit exponential scaling and can quickly become computationally intensive. Even for a small number of qubits, simulations can therefore consume significant resources and time. The computational time can change drastically (e.g., from 1 second to 1 hour) with a small increase in the number of qubits (e.g., by 10).
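As a back-of-the-envelope illustration of this scaling (a rough sketch assuming a dense statevector simulation with one complex128 amplitude per basis state), every additional qubit doubles the memory footprint:

# Memory needed to store a dense statevector of n qubits:
# 2**n amplitudes at 16 bytes each (complex128)
for n in (10, 20, 30, 34):
    print(f"{n} qubits: {2**n * 16 / 2**30:.3f} GiB")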

List of Simulators

The quantum simulators can be used to test your own quantum circuits and algorithms on our HPC systems.

The following links provide introductory-level documentation for the various quantum simulators we provide:

Each simulator has advantages, making it possible to choose the best simulator for your task.

  • Qiskit is developed by IBM and is currently the most widely adopted Software Development Kit (SDK) with constant updates, many methods and tutorials, and a good simulator (Qiskit-aer) which can be executed on CPUs and GPUs.
  • Qulacs is a very fast simulator across the board, can be executed on GPUs, and can optimise your circuits for even faster execution.
  • Cirq is developed by Google and can simulate circuits executed on real-world qubit architectures. It also has multiple simulators included in its toolkit, e.g., Qsim, which is faster than the default Cirq simulator and provides multi-threading capabilities.
  • QuTip is designed with a focus on physics applications, and thus has a massive library of methods for that goal.
  • Qibo is a large open-source quantum ecosystem with many applications and tutorials to explore. It is also a reasonably fast simulator on both CPUs and GPUs.
| Quantum Simulator | Core Advantages | Applications | Simulator of Choice for | Package Dependencies |
|---|---|---|---|---|
| Qiskit Aer (CPU & GPU) | High-performance simulation for IBM Qiskit circuits, supports GPU acceleration with CUDA, integration with the Qiskit ecosystem, noise modeling for realistic simulations, suitable for variational algorithms and benchmarking | Statevector, unitary, noise simulation and error mitigation | General users, educators | qiskit, qiskit-aer, qiskit-algorithms, qiskit-machine-learning, qiskit-nature, qiskit-optimization, qiskit-finance, qiskit-dynamics, qiskit-ibm-runtime |
| Qulacs (CPU & GPU) | Highly optimized for CPU with multi-threading and very fast gate operations, GPU acceleration for large-scale circuits, designed for performance, flexible API for circuit optimization, supports Python and C++ | GPU acceleration, hybrid classical-quantum algorithms | Optimization researchers | qulacs, qulacs-gpu |
| Cirq & Qsim | User-friendly for Google’s quantum hardware simulations, Qsim optimized for speed with GPU support (less efficient than Qulacs), excellent for NISQ device simulations, supports tensor network simulations, ideal for Google-like hardware setups | Integration with Google’s quantum hardware, custom noise models | Neural network researchers with TensorFlow | cirq, qsimcirq |
| QuTip | Focused on open quantum systems simulation with a rich library for Lindblad and master equations, limited GPU support, best for quantum optics and decoherence studies, not ideal for large-scale gate-based simulations | Tools for modeling dissipative systems, time-dependent Hamiltonians | Physicists | qutip |
| Qibo | Easy-to-use Python interface with tensor network backends for CPU, native GPU acceleration using TensorFlow or CuPy, strong focus on high-level abstractions, supports hybrid quantum-classical workflows, scalable with different backends | Simple API, supports CPU/GPU and distributed computing, error mitigation | Beginners, educators | qibo, qibojit |

How to Access

All simulators are provided as containers. Any system account (applied for as described in Getting an account) has access to the simulator containers. Users can then choose to access the simulators either via SSH (also refer to Logging in) or on JupyterHub. Additionally, each simulator container contains some commonly used data science packages such as scipy, numpy, matplotlib, pandas, etc.

The following table lists all the quantum simulators with their respective container paths. Use these paths in the instructions below.

| Quantum Simulator | CONTAINER_PATH |
|---|---|
| Qibo-CPU | /sw/container/quantum-computing/qibo-cpu/qibo-cpu.sif |
| Qibo-GPU | /sw/container/quantum-computing/qibo-gpu/qibo-gpu.sif |
| Qiskit-CPU | /sw/container/quantum-computing/qiskit-cpu/qiskit-cpu-v1.4.3.sif |
| Qiskit-GPU | /sw/container/quantum-computing/qiskit-gpu/qiskit-gpu.sif |
| Qsim | /sw/container/quantum-computing/qsim/qsim.sif |
| Qulacs-CPU | /sw/container/quantum-computing/qulacs-cpu/qulacs-cpu.sif |
| Qulacs-GPU | /sw/container/quantum-computing/qulacs-gpu/qulacs-gpu-v0.6.11.sif |
| Qutip | /sw/container/quantum-computing/qutip/qutip.sif |
Info

We do not provide a separate container just for Cirq, as it can be imported in the Qsim container.

Info

For the Qiskit-CPU and Qulacs-GPU containers, there are also older versions available in the same directories as those mentioned here.

Terminal

The standard software to run containers on our clusters is Apptainer. Please refer to the Apptainer page in our documentation for instructions on how to use it. Use the container paths from the table above to find your desired simulator. The following command is sufficient to launch the container and execute your file:

module load apptainer
apptainer exec --bind $WORK,$TMPDIR <CONTAINER_PATH> python <YOUR_FILE_PATH>
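For example, a minimal file to pass as <YOUR_FILE_PATH> could look like the following hedged sketch (assuming the Qiskit-CPU container, whose package list includes qiskit and qiskit-aer):

from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

# Build a 2-qubit Bell-state circuit with measurements
qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# Run it on the Aer simulator and print the measurement counts
sim = AerSimulator()
result = sim.run(qc, shots=1000).result()
print(result.get_counts())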

Jupyter.HPC

  1. Login at jupyter.hpc.gwdg.de with your AcademicCloud account.
  2. Keep the default ‘HPC project’ and ‘job profile’, and tick the box ‘Set your own Apptainer container location’.
  3. In the new available entry field, put the CONTAINER_PATH as defined above.
  4. Set the rest of the parameters to your liking and press start to spawn your server.

Welcome to the GWDG Quantum Computing Channel

We are excited to introduce our new Quantum Computing Channel, where you can stay up-to-date on the latest developments and advancements in the field of quantum computing. Our channel is a platform for researchers, students, and professionals to share knowledge, ideas, and experiences related to quantum computing.

Join Our Community

We invite you to join our community and become a part of our vibrant discussion forum. Here, you can engage with our team and other members of the quantum computing community, ask questions, share your expertise, and learn from others.

Stay Informed

Our channel will feature regular updates on the latest research, breakthroughs, and innovations in quantum computing. You will have access to exclusive content, including research papers and publications, conference proceedings and presentations, and news and updates on new projects and initiatives.

Get Involved

We encourage you to participate in our discussions and share your thoughts, ideas, and experiences. Our community is a place where you can:

  • Ask questions and get answers from experts in the field
  • Share your research and get feedback from others
  • Collaborate with others on projects and initiatives
  • Stay informed about the latest developments in quantum computing

Join Our Matrix Channel

To join our Matrix channel, simply click on the link below. We look forward to welcoming you to our quantum computing community.

Further Information

For an overview of our approach to quantum computing, refer to our user group page Quantum Computing.

Please refer to the FAQ page, or for more specific questions, contact us at support@gwdg.de.

Subsections of Quantum Computing Simulators

QC Simulators FAQ

Who can get access to our quantum simulators?

To check eligibility, as well as to get an account, refer to Getting an account.

What are the possible future simulators?

Updates of the existing simulators will be integrated regularly, and users will be notified of changes. Additional simulators may be added to the available ensemble depending on quality and demand.

What are the possible application fields for quantum computing simulators?

Many uncertainties still exist regarding useful applications of quantum computing, but research in the following areas has already received significant attention:

  • Quantum Chemistry: Modeling molecular structures, reaction mechanisms, and properties.
  • Optimization: Solving complex optimization problems in logistics, finance, and supply chains.
  • Cryptography: Developing and testing new cryptographic methods, especially for post-quantum cryptography.
  • Machine Learning: Enhancing machine learning models and algorithms through quantum speedups or novel approaches.
  • Material Science: Studying new materials at the atomic scale and predicting properties for innovative materials design.
  • Fundamental Physics: Testing hypotheses and models that are computationally demanding using classical methods.

How can I set up my own simulator environment?

Although it is possible to run quantum simulators on personal laptops, it is recommended to use the GWDG infrastructure to simulate higher numbers of qubits. As an HPC provider, GWDG offers significantly more compute power and storage, and our experts can be consulted in case of any doubt. Below are suggested steps for setting up your own local simulator environment:

  1. We suggest creating an anaconda, miniconda or pyenv environment first.

  2. Install the required packages with pip. A full list of the packages available/necessary for each quantum SDK is in the ‘Package dependencies’ column in the table on the quantum simulators overview page.

  3. If you want to access the chosen SDK via Jupyter notebooks, a Jupyter kernel (e.g. via ipykernel) needs to be created for the environment first.

  4. Inside a notebook or Python file, import the necessary packages and start writing circuits.

  5. Limitations to consider: It is possible to simulate up to roughly 28 qubits on a local machine; however, the runtime is typically quite high (potentially multiple days). On a single GWDG compute node, it is possible to simulate typical 32-qubit circuits within 12 hours of runtime using just CPUs, and within seconds or a few minutes with GPUs. GWDG also provides the option to run quantum simulators on GPUs, provided the simulator has GPU support. It is also possible to go beyond one compute node on the GWDG infrastructure.

Do I need to know quantum mechanics to work with quantum simulators?

It depends on what you are working on and whether one is using Quantum Annealing-based quantum computers or gate-based quantum computers. At GWDG, we are focusing on gate-based quantum computers, which require less background in quantum mechanics. Additionally, due to the surge in the number of quantum simulators, most details have been abstracted and are hidden in the background. End users only need a high-level understanding of quantum mechanics and can focus on the quantum algorithmic aspects of their field. For example, in optimization, it is more important to have an understanding of algorithms like the Variational Quantum Algorithm (VQA) and the Quantum Approximate Optimization Algorithm (QAOA) than quantum mechanics itself.

How do I choose the most applicable simulator for my needs?

Consult the table on the quantum simulators overview page.

Why would I change my classical computing methods to quantum computing?

Currently, there are few quantum or hybrid applications that provide faster compute and/or more useful output than classical applications, but the set of examples is ever growing and the barrier to entry is ever thinning, making quantum computing an increasingly attractive option.

Qibo

Qibo is a large open-source quantum ecosystem with many applications and tutorials to explore. It provides a reasonably fast simulator for both CPUs and GPUs and a full-stack API for quantum simulation and quantum hardware control. Additional learning materials are available through the Qibo platforms.

The following example implements a basic quantum circuit that performs a Quantum Fourier Transform (QFT) on two qubits.

Getting Started

Below is a base example for creating and running the Qibo simulator:

import numpy as np
from qibo import models, gates

# Define a quantum circuit with 2 qubits
circuit = models.Circuit(2)

# Apply Hadamard gate on the first qubit
circuit.add(gates.H(0))

# Apply controlled phase rotation gates
circuit.add(gates.CU1(0, 1, theta=np.pi / 2))

# Apply Hadamard gate on the second qubit
circuit.add(gates.H(1))

# Swap qubits 0 and 1 for the final step of QFT
circuit.add(gates.SWAP(0, 1))

# Execute the circuit on the default simulator
result = circuit()

# Print the result (wavefunction)
print("Quantum Fourier Transform Result:", result.state())

Code Insights

  • Import the necessary dependencies.
  • Define the quantum circuit with two qubits.
  • Hints between the lines:
    • gates.H() creates a single-qubit Hadamard gate.
    • CU1 is the controlled rotation that creates the phase shift for the QFT; theta= sets the phase-shift angle.
    • gates.SWAP() swaps the qubits to reverse their order as part of the QFT.
  • Print out the resulting state.

Follow up on the GWDG Updates:

  • Qibo
    • The version of GPU-supported Qibo provided by GWDG: 0.2.8
    • The version of CPU-supported Qibo provided by GWDG: 0.2.13

Key Features

  • User-friendly: Qibo is a quantum computing framework that prioritizes user-friendliness and strong performance. Designed for both beginners and experts, Qibo offers a simple interface for building and simulating quantum circuits, as well as support for hybrid classical-quantum algorithms.

  • Qibo is created with simplicity as a priority, offering a straightforward and user-friendly interface for building and running quantum circuits, thus making it easy for beginners in quantum computing.

  • Multiple Backend Support: Qibo offers support for various backends such as CPU, GPU, and distributed architectures to enable users to adjust the size of their simulations according to the computational resources at hand.

  • Variational Algorithms: The framework includes pre-built components for variational quantum algorithms such as VQE and QAOA, which are essential for solving optimization problems on quantum hardware; a minimal sketch is shown after this list.

  • Quantum Error Mitigation: With built-in tools for error mitigation, Qibo helps users simulate realistic quantum environments and develop techniques to improve the accuracy of noisy quantum computations.
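
As an illustration of the variational-algorithm components mentioned above, below is a minimal VQE sketch. It is not a tuned implementation: the ansatz layout, the XXZ example Hamiltonian, and the random initial parameters are arbitrary choices for demonstration.

import numpy as np
from qibo import models, gates, hamiltonians

nqubits = 2

# Simple hardware-efficient ansatz: RY rotations and a CZ entangler
circuit = models.Circuit(nqubits)
circuit.add(gates.RY(q, theta=0) for q in range(nqubits))
circuit.add(gates.CZ(0, 1))
circuit.add(gates.RY(q, theta=0) for q in range(nqubits))

# Example Hamiltonian shipped with Qibo (arbitrary choice for this sketch)
hamiltonian = hamiltonians.XXZ(nqubits=nqubits)

# Variational Quantum Eigensolver
vqe = models.VQE(circuit, hamiltonian)

# One parameter per RY gate (4 in total); start from a random guess
initial_parameters = np.random.uniform(0, 2 * np.pi, 4)
best_energy, best_parameters, _ = vqe.minimize(initial_parameters)

print("Estimated ground-state energy:", best_energy)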

Supplementary Modules

Command | Description
on_qubits(*qubits) | Generator of gates contained in the circuit acting on specified qubits.
light_cone(*qubits) | Reduces the circuit to the qubits relevant for an observable.
copy(deep: bool = False) | Creates a copy of the current circuit as a new circuit model.
dagger() | Returns the dagger (conjugate transpose) of the gate.
decompose(*free) | Decomposes multi-control gates to gates.
matrix(backend=None) | Returns the matrix representation of the gate.
generator_eigenvalue() | Returns the eigenvalues of the gate's generator.
basis_rotation() | Transformation required to rotate the basis for measuring the gate.
add(error, gate) | Adds a quantum error for a specific gate and qubit to the noise model.
eigenvalues(k) | Computes the eigenvalues for the Hamiltonian (k = number of eigenvalues to calculate).
eigenvectors(k) | Computes a tensor with the eigenvectors for the Hamiltonian (k = number of eigenvalues to calculate).
ground_state() | Computes the ground state of the Hamiltonian.
qibo.quantum_info.shannon_entropy(prob_dist, base: float = 2, backend=None) | Calculates the Shannon entropy of a probability array.

The fundamental explanations behind these functions, operators, and states, and more, can be found on the official Qibo webpage.

Qiskit

The Qiskit Software Development Kit (SDK) is developed by IBM and is the most widely adopted SDK currently, with many methods and tutorials. It provides a good simulator (Qiskit-Aer) which can be executed on CPUs and GPUs. Qiskit is one of the highest-performing quantum SDKs for building and transpiling quantum circuits.

One noteworthy feature that Qiskit offers is Benchpress. This is an open-source tool for benchmarking quantum software, which helps evaluate and compare the performance of different quantum algorithms. Benchmarking circuits gives a sense of how well the hardware is actually working.

Getting Started

If you're new to Qiskit, there are many great introductory resources available. One of the most well-known is IBM's own Qiskit documentation. To get started with the quantum simulator that we provide, you can follow the Qiskit-Aer documentation.

Here, we’ll provide a basic introduction to help you get started and guide you through the initial steps for GWDG’s Qiskit simulator.

Below is a base example for creating and running the Qiskit-aer (CPU & GPU) simulator:

# Import necessary libraries from Qiskit
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator  # Import simulator from qiskit-aer
from qiskit.circuit.library import *  # Import circuit libraries (pre-built quantum gates and circuits)

# Initialize the simulator with the 'statevector' method and specify CPU as the device
simulator = AerSimulator(method='statevector', device='CPU')

# Create a quantum circuit (for example, a 2-qubit circuit)
circuit = QuantumCircuit(2)

# Apply a Hadamard gate on the first qubit (creates superposition)
circuit.h(0)

# Apply a CNOT gate on the second qubit, controlled by the first qubit (creates entanglement)
circuit.cx(0, 1)

# Add measurement to all qubits in the circuit (collapse the quantum states)
circuit.measure_all()

# Execute the circuit on the simulator with 100 shots and a fixed random seed
# No need for transpile, we use `run` method from the simulator directly
result = simulator.run(circuit, shots=100, seed_simulator=123).result()

# Print the simulation results
print(result)

# Draw the quantum circuit (matplotlib output)
circuit.draw(output='mpl')

Code Insights:

  • Import the necessary dependencies. Check whether you have the necessary dependencies from the supplementary modules; requirements can vary with your use case and with updates.

  • Define the simulator and testing method. Pay attention to the environment you're using, and switch between GPU and CPU if needed; a minimal sketch is shown after this list.

  • Build and execute the circuit. Once you’ve selected the appropriate simulator for your purpose, you can measure the circuit, which will collapse the quantum state into classical bits.

  • Hint between the lines:

    • Execute the circuit on the simulator with shots=100 and a fixed random seed. There is no need for transpilation; we use the simulator.run method directly. The execute function has been deprecated since version 0.46.0 (see the deprecation notice) and was removed in the 1.0 release.
  • Final step: Print and visualize the results to aid in understanding the output.
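
Switching between CPU and GPU, as mentioned in the insights above, only requires changing the device argument of AerSimulator. The following is a small sketch under the assumption that a GPU-enabled Qiskit-Aer build (as in the Qiskit-GPU container) is available; it falls back to the CPU otherwise.

from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

# Pick the GPU backend if this Aer build exposes one, otherwise stay on the CPU
device = 'GPU' if 'GPU' in AerSimulator().available_devices() else 'CPU'
simulator = AerSimulator(method='statevector', device=device)

# Small sanity-check circuit: Bell state with measurement
circuit = QuantumCircuit(2)
circuit.h(0)
circuit.cx(0, 1)
circuit.measure_all()

counts = simulator.run(circuit, shots=100).result().get_counts()
print(f"Device: {device}, counts: {counts}")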

Follow up on the GWDG Updates:

  • Qiskit
    • The version of GPU-supported Qiskit provided by GWDG: 1.1.1
    • The version of CPU-only Qiskit provided by GWDG: 1.0.2

Key Features

Qiskit Aer provides high-performance simulators for testing and experimenting with quantum algorithms, eliminating the need for real quantum hardware. It offers various tools for simulating quantum circuits, including statevector simulation, unitary matrix extraction, noise modeling, and error mitigation.

  • Fundamental Circuit Execution: Qiskit Aer allows for quick and precise modeling of quantum circuits. Using the qasm_simulator, users can replicate the behavior of quantum gates and obtain measurement outcomes, just as if the circuit had run on real quantum hardware.

  • Statevector Simulation: The statevector_simulator enables monitoring of the complete quantum state of your circuit at any time, offering insight into the qubits' superposition and entanglement prior to measurement. This is extremely valuable for understanding how quantum algorithms evolve at the state level.

  • Unitary Simulation: The unitary_simulator lets you obtain the matrix representation of the entire unitary transformation implemented by a quantum circuit. This is beneficial for verifying quantum gate operations and confirming that the circuit is constructed correctly.

  • Quantum Noise Simulation: Quantum noise is a crucial part of practical quantum computing. Using Qiskit Aer, you can simulate noisy quantum circuits by incorporating noise models tailored to custom or actual devices. This helps in understanding the impact of errors on quantum computations and prepares algorithms for real-world use; a minimal sketch is shown after this list.

  • Quantum Error Mitigation: Qiskit Aer also offers ways of reducing noise errors in quantum calculations. By utilizing methods like measurement error mitigation, you can enhance the precision of noisy simulations and achieve a closer approximation to results obtained from ideal quantum hardware.
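
To illustrate the noise-simulation features above, here is a minimal sketch of running a Bell-state circuit with a simple depolarizing noise model; the error rates are arbitrary example values, not calibrated device data.

from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, depolarizing_error

# Toy noise model: depolarizing errors on one- and two-qubit gates
noise_model = NoiseModel()
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.01, 1), ['h'])
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.05, 2), ['cx'])

noisy_simulator = AerSimulator(noise_model=noise_model)

# Bell-state circuit; with noise, some '01'/'10' outcomes appear
circuit = QuantumCircuit(2)
circuit.h(0)
circuit.cx(0, 1)
circuit.measure_all()

counts = noisy_simulator.run(circuit, shots=1000).result().get_counts()
print(counts)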

Supplementary Modules

The following functions and libraries are available in the latest version of Qiskit (v1.2.4) and can also be found in the IBM Qiskit documentation. The capabilities of Qiskit are expanding day by day; some of the important functions are as follows:

Command | Description
SparsePauliOp(["IZ", "XX"]) | Creates two-qubit Pauli operators ('XX' = X⊗X).
mcx_gate = MCXGate(QubitNumber) | Imports a multi-controlled-X gate.
two_local = TwoLocal(3, 'rx', 'cz') | Alternates layers of single-qubit rotation gates with layers of entangling gates.
feature_map = ZZFeatureMap(feature_dimension=len(features)) | Sets each number in the data as a rotation angle in a parametrized circuit.
evolution = PauliEvolutionGate(hamiltonian, time=1) | Simulates a quantum state evolving in time.
CCXGate(*args[, _force_mutable]) | CCX gate, also known as the Toffoli gate.
CHGate(*args[, _force_mutable]) | Controlled-Hadamard gate.
CPhaseGate(theta[, label, ctrl_state, ...]) | Controlled-Phase gate.
SwapGate(*args[, _force_mutable]) | The SWAP gate.
Diagonal(diag) | Diagonal circuit.
UnitaryGate(data[, label, check_input, ...]) | Class for quantum gates specified by a unitary matrix.
UCPauliRotGate(angle_list, rot_axis) | Uniformly controlled Pauli rotations.

The IBM Qiskit community is among the most active in terms of updates and content production; a more detailed module overview can be found in the Qiskit documentation.
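
As an example of how the operator classes from the table above can be combined with a simulation, the following sketch computes the expectation value of a two-qubit Pauli observable on a Bell state. It uses Statevector from qiskit.quantum_info; this is our choice for a compact example, and other approaches (e.g. the primitives) work as well.

from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp, Statevector

# Observable: ZZ + 0.5 * XX
observable = SparsePauliOp(["ZZ", "XX"], coeffs=[1.0, 0.5])

# Bell-state preparation circuit
circuit = QuantumCircuit(2)
circuit.h(0)
circuit.cx(0, 1)

# Exact statevector simulation and expectation value
state = Statevector(circuit)
expectation = state.expectation_value(observable)
print("<ZZ + 0.5*XX> =", expectation.real)  # 1.5 for the Bell state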

Qsim and Cirq

Cirq, developed by Google, is a Python library for creating, modifying, and optimizing quantum circuits, and can simulate circuits on a real-world qubit architecture. Multiple simulators are included in its toolkit, e.g. Qsim, which is faster than the default Cirq simulator. It is also possible to execute your code on real quantum devices or simulators.

The additional learning materials are available on Cirq and Qsim.

Getting Started

Below is a base example for creating and running the Cirq and Qsim simulator:

import cirq 

# Create a circuit to generate a Bell State (1/sqrt(2) * ( |00⟩ + |11⟩ ))
bell_circuit = cirq.Circuit()

# Create two qubits (q0 and q1)
q0, q1 = cirq.LineQubit.range(2)

# Apply a Hadamard gate to q0 (creates superposition)
bell_circuit.append(cirq.H(q0))

# Apply a CNOT gate with q0 as control and q1 as target (creates entanglement)
bell_circuit.append(cirq.CNOT(q0, q1))

# Initialize the simulator
s = cirq.Simulator()

# Simulate the circuit (without measurement, gives the quantum state)
print('Simulating the circuit:')
results = s.simulate(bell_circuit)
print(results)

# Add a measurement gate to both qubits to observe their values
bell_circuit.append(cirq.measure(q0, q1, key='result'))

# Sample the circuit 1000 times (to observe measurement statistics)
samples = s.run(bell_circuit, repetitions=1000)

# Plot the measurement results using a histogram
import matplotlib.pyplot as plt
cirq.plot_state_histogram(samples, plt.subplot())
plt.show()

Code Insights

  • Import the necessary dependencies.
  • Define a quantum state and build the quantum circuit.
  • Hints between the lines:
    • s.run samples the circuit for the given number of repetitions to obtain a distribution of measurements, which is then plotted by cirq.plot_state_histogram().
  • Switching to Qsim:
    • To run the code with Qsim, the following changes need to be made:
    import qsimcirq

    # ...

    # Initialize the simulator
    s = qsimcirq.QSimSimulator()

    # ...

Follow up on the GWDG Updates:

  • Cirq
    • The version of Cirq provided by GWDG: v1.4.1
  • Qsim
    • The version of Qsim provided by GWDG: v0.21.0

Key Features

  • Cirq is an open-source framework designed for building and running quantum circuits on near-term quantum devices. Together with Qsim, Google’s advanced quantum circuit simulator, Cirq provides a comprehensive platform for developing, testing, and simulating quantum algorithms.

  • Circuit Design for NISQ Devices: Cirq enables users to customize circuits with particular hardware limitations and noise models.

  • Qsim: Qsim is a high-performance simulation tool that effectively simulates quantum circuits on classical hardware, and it is an extension of Cirq. Its efficient enhancements enable the speedy processing of extensive circuits and intricate algorithms.

  • Customized Noise Models: Cirq allows for the creation of customized noise models, aiding users in accurately replicating the behavior of actual quantum devices; a minimal sketch is shown after this list.

  • Integration with Google's Quantum Hardware: Cirq smoothly integrates with Google's quantum processors, enabling users to run quantum circuits on both simulated and actual hardware using the same platform.
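
As a follow-up to the noise-model feature above, here is a minimal sketch of a noisy simulation of the Bell circuit using a uniform depolarizing channel; the error probability is an arbitrary example value.

import cirq

# Bell-state circuit with measurement
q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit([cirq.H(q0), cirq.CNOT(q0, q1),
                        cirq.measure(q0, q1, key='result')])

# Insert a depolarizing channel after every moment of the circuit
noisy_circuit = circuit.with_noise(cirq.depolarize(p=0.01))

# Density-matrix simulation is appropriate for mixed (noisy) states
simulator = cirq.DensityMatrixSimulator()
samples = simulator.run(noisy_circuit, repetitions=1000)
print(samples.histogram(key='result'))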

Supplementary Modules

Command | Description
cirq.StatevectorSimulator() | Simulator that computes the exact evolution of a quantum system using state vectors.
cirq.DensityMatrixSimulator | Simulates mixed states.
cirq.phase_flip | Flips the phase of a qubit.
cirq.phase_damp | Applies a phase damping channel to a qubit.
cirq.amplitude_damp | Applies an amplitude damping channel to a qubit.
cirq.depolarize | Applies a depolarizing noise channel to a qubit.
cirq.asymmetric_depolarize | Applies an asymmetric depolarizing noise channel to a qubit.
cirq.reset | Resets a qubit to its ground state.
cirq.Gate | Represents a quantum gate.
cirq.Operation | Represents a quantum operation on a qubit.
cirq.Moment | Represents a moment in a quantum circuit.
cirq.ParamResolver | Resolves parameters in a quantum circuit.

Many options for different applications, such as GPU/CPU-based simulations, noisy simulation, exact simulation, parameter sweeps, and state histograms, can be easily found in the Cirq documentation.

Qulacs

Qulacs is a very fast simulator across the board for large, noisy, or parametric quantum circuits. It can be executed on both CPUs and GPUs and can be optimized for faster execution. However, as of now, we only provide Qulacs support for CPU execution. Installation of Qulacs for different devices, as well as various tutorials and API documents, can be found in the Qulacs GitHub repository.

Quantum Circuit Learning (QCL) is a hybrid algorithm that leverages quantum computers to enhance machine learning capabilities [1]. It’s designed to run on Noisy Intermediate-Scale Quantum (NISQ) Computers, which are medium-scale quantum devices that don’t have error correction capabilities. This algorithm combines quantum and classical computing to achieve efficient machine learning results.

Getting Started

Below is a base example for creating and running the Qulacs (CPU & GPU) simulator:

# Import necessary libraries from Qulacs (the only dependency here)
from qulacs import QuantumCircuit, QuantumState
from qulacs.gate import X, H, CNOT, SWAP

# Initialize quantum state and circuit with 4 qubits
nqubits = 4
st = QuantumState(nqubits)          # Create a quantum state with 4 qubits
circuit = QuantumCircuit(nqubits)   # Create a quantum circuit for 4 qubits

# Add gates to the circuit
circuit.add_gate(X(0))              # Apply Pauli-X gate (NOT) on qubit 0
circuit.add_gate(H(2))              # Apply Hadamard gate on qubit 2 (superposition)
circuit.add_gate(SWAP(0, 1))        # Swap the states of qubits 0 and 1

# Define control and target qubits for CNOT gate
control, target = 3, 2
circuit.add_gate(CNOT(control, target))  # Apply CNOT gate with qubit 3 as control and 2 as target

# Add a random unitary gate on a subset of qubits
list_qubits = [1, 2, 3]
circuit.add_random_unitary_gate(list_qubits)  # Apply a random unitary gate on qubits 1, 2, and 3

# Update the quantum state using the circuit
circuit.update_quantum_state(st)    # Apply all gates to update the quantum state

# Calculate the probability of measuring qubit 1 in the zero state
prob_zero = st.get_zero_probability(1)  # Get probability of qubit 1 being in state |0>

# Calculate the marginal probability distribution for the specified qubits
prob_marginal = st.get_marginal_probability([1, 2, 2, 0])  # Probability that qubit 0 is |1> and qubit 3 is |0> (2 = qubit not measured)

# Draw the circuit diagram (requires 'matplotlib' library)
circuit.draw(output='mpl')          # Visualize the quantum circuit

Code Insights

  • Import the necessary dependencies.
  • Be aware of the environment. Pay attention to the environment you’re using, and switch between GPU and CPU if needed.
  • Define a quantum state and build the quantum circuit.
    • Adjust the gates that should be added to the circuit.
    • For random entangling of arbitrarily many chosen qubits, use add_random_unitary_gate.
  • update_quantum_state simulates the evolution of the quantum state under the operations defined in the quantum circuit.
  • get_zero_probability returns the probability that the specified qubit will be measured as 0.
  • get_marginal_probability returns, in this example, the probability that qubit 0 is 1 and qubit 3 is 0 (a value of 2 in the list means the qubit is not measured). The qubits and their target values can be adapted to the problem.

Follow up on the GWDG Updates:

  • Qulacs
    • The version of CPU supported Qulacs provided by GWDG: 0.6.10

Key Features

Qulacs is a quantum simulator optimized for high performance that can be used on classical as well as quantum computers. Renowned for its quickness and adaptability, Qulacs enables the simulation of extensive quantum circuits, making it a favored option for researchers and industry experts alike. Its ability to efficiently utilize memory and computing power allows for quick simulation of quantum gates and algorithms, even on devices with restricted resources.

  • Tailored for speed: Qulacs is created to enhance speed on traditional devices, making it one of the speediest quantum simulators around. Efficient simulation of large-scale quantum circuits is made possible by its optimized and parallelized functions.

  • Flexibility in Gate Operations: Qulacs allows users to try out advanced quantum algorithms by offering a range of quantum gates, both standard and custom.

  • Hybrid Classical-Quantum Simulations: Qulacs enables hybrid quantum-classical algorithms like the Variational Quantum Eigensolver (VQE) and the Quantum Approximate Optimization Algorithm (QAOA), important for near-term quantum computing applications; a minimal sketch is shown after this list.

  • GPU Acceleration: Utilizing the power of GPUs, Qulacs speeds up simulation performance, allowing for the running of more intricate circuits in a shorter period of time.
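
To illustrate the hybrid classical-quantum building blocks above, here is a minimal sketch of a parametric circuit whose energy can be fed into a classical optimizer (the basic loop of VQE/QAOA); the ansatz and the observable are arbitrary examples.

import numpy as np
from qulacs import ParametricQuantumCircuit, QuantumState, Observable

nqubits = 2

# Parametric ansatz: RY rotations plus a CNOT entangler
circuit = ParametricQuantumCircuit(nqubits)
circuit.add_parametric_RY_gate(0, 0.0)
circuit.add_parametric_RY_gate(1, 0.0)
circuit.add_CNOT_gate(0, 1)

# Observable: Z on qubit 0 plus Z on qubit 1
observable = Observable(nqubits)
observable.add_operator(1.0, "Z 0")
observable.add_operator(1.0, "Z 1")

def energy(parameters):
    # Expectation value of the observable for the given circuit parameters
    for i, value in enumerate(parameters):
        circuit.set_parameter(i, value)
    state = QuantumState(nqubits)
    circuit.update_quantum_state(state)
    return observable.get_expectation_value(state)

print(energy(np.array([0.1, 0.2])))  # pass this function to a classical optimizer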

Supplementary Modules

Qulacs is rather easy to use since it has one core library. To see the various functions of Qulacs, visit the Qulacs tutorials.

Command | Description
QuantumState(nQubits).get_vector | Returns all the elements as an array.
QuantumState(nQubits).get_amplitude | Gets a single element.
QuantumState(nQubits).allocate_buffer | Allocates a state vector of the same size as the existing quantum state, without copying the state.
QuantumState(nQubits).set_Haar_random_state() | Generates a random Haar-distributed quantum state.
QuantumState(nQubits).set_zero_state() | Initializes a quantum state.
QuantumState(nQubits).set_computational_basis(0b101) | Initializes the state to the specified computational basis state, given in binary notation.
QuantumState(nQubits).permutate_qubit() | Swaps indices of a qubit.
QuantumState(nQubits).drop_qubit() | Gets a projection onto a specified qubit.
QuantumState(nQubits).partial_trace() | Obtains the partial trace of a given qubit of a given quantum state as a density matrix.
state = QuantumStateGpu(n) | Calculates using the GPU.
value = inner_product(state_bra, state_ket) | Calculates the inner product.
tensor_product_state = tensor_product(state_ket1, state_ket2) | Calculates the tensor product.

References

[1] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, “Quantum circuit learning”, Phys. Rev. A 98, 032309 (2018), arXiv:1803.00745

QuTip

QuTip is designed with a focus on physics applications, and thus has a massive library of methods for that goal. Additional learning materials are available through the QuTip platforms.

Getting Started

Below is a base example for creating and running the QuTip simulator which will create a Bell state (an entangled state of two qubits):

import numpy as np
from qutip import *

# Define the initial state (|00> state for two qubits)
initial_state = tensor(basis(2, 0), basis(2, 0))

# Define the Hadamard gate and CNOT gate
hadamard_gate = snot()  # Single qubit Hadamard gate
cnot_gate = cnot()      # CNOT gate

# Apply Hadamard gate to the first qubit
state_after_hadamard = tensor(hadamard_gate, qeye(2)) * initial_state

# Apply the CNOT gate to create entanglement
final_state = cnot_gate * state_after_hadamard

# Print the final state (Bell state)
print(final_state)

# Optionally, check the final state's density matrix
rho = final_state * final_state.dag()
print(rho)

Code Insights

  • Import the necessary dependencies.
  • Define a quantum state and build the quantum circuit.
  • Hints between the lines:
    • snot() creates a single-qubit Hadamard gate to create a superposition.
    • cnot() is the CNOT gate used to entangle the qubits, resulting in a Bell state.
    • final_state.dag() gives the conjugate transpose, and qeye() is the identity operator.
  • Print out the results. The density matrix of the final state can be used to analyze the entanglement and inspect mixed states.

Follow up on the GWDG Updates:

  • Qutip
    • The version of Qutip provided by GWDG: 5.0.4

Key Features

QuTiP is a framework built in Python that is intended for the simulation and analysis of open quantum systems. Extensively utilized in quantum optics, quantum thermodynamics, and quantum control research, it provides a wide range of tools for simulating dissipative and decoherent quantum phenomena.

  • Open Quantum System Simulation: QuTiP is highly effective in simulating open quantum systems, which involve interactions with environments causing decoherence and dissipation, essential for studying practical quantum systems; a minimal sketch is shown after this list.

  • Quantum Control: QuTiP offers strong tools for designing and simulating quantum control schemes, ideal for developing methods to manipulate qubits with high accuracy.

  • Quantum Optics and Dynamics: QuTiP is widely utilized in quantum optics due to its native support for typical quantum systems like cavities, harmonic oscillators, and two-level atoms.

  • Hamiltonian Engineering: The Hamiltonian Engineering framework enables users to define and simulate time-dependent and custom Hamiltonians, allowing in-depth exploration of dynamic behaviors in quantum systems.
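
To illustrate the open-system tools above, here is a minimal sketch of a Lindblad master-equation simulation of a single qubit decaying under amplitude damping; the Hamiltonian and decay rate are arbitrary example values.

import numpy as np
from qutip import basis, sigmaz, sigmam, mesolve

# Single qubit starting in the state |0>
psi0 = basis(2, 0)

# System Hamiltonian and a collapse operator modelling amplitude damping
H = 0.5 * 2 * np.pi * sigmaz()
gamma = 0.1                        # example decay rate
c_ops = [np.sqrt(gamma) * sigmam()]

# Solve the Lindblad master equation and track the population of the initial state
tlist = np.linspace(0, 20, 100)
result = mesolve(H, psi0, tlist, c_ops, e_ops=[psi0 * psi0.dag()])

print(result.expect[0][:5])  # population at the first few time points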

Supplementary Modules

Command | Description
coherent(N, alpha) | Coherent state; alpha = complex number (eigenvalue) for the requested coherent state.
maximally_mixed_dm(N) | Maximally mixed density matrix; N = number of levels in Hilbert space.
zero_ket(N) | Empty ket vector; N = number of levels in Hilbert space.
basis(N, #m), fock(N, #m) | Fock state ket vector; N = number of levels in Hilbert space, m = level containing excitation (0 if no m given).
momentum(qobj) | Momentum operator; qobj = object to copy dimensions from.
position(qobj) | Position operator; qobj = object to copy dimensions from.
qeye(N) | Identity; N = number of levels in Hilbert space.
destroy(qobj) | Lowering (destruction) operator; qobj = object to copy dimensions from.
create(qobj) | Raising (creation) operator; qobj = object to copy dimensions from.
squeezing(q1, q2, sp) | Squeezing operator (generalized); q1, q2 = quantum operators (Qobj), sp = squeezing parameter.
Q.check_herm() | Checks Hermiticity: checks whether the quantum object is Hermitian.
Q.dag() | Dagger (adjoint): returns the adjoint (dagger) of the object.
Q.diag() | Returns the diagonal elements.
Q.eigenenergies() | Eigenenergies (values) of the operator.
Q.eigenstates() | Returns eigenvalues and eigenvectors.
Q.groundstate() | Eigenvalue and eigenket of the Qobj ground state.
Q.inv() | Matrix inverse of the Qobj.
Q.norm() | Returns the L2 norm for states, trace norm for operators.
Q.ptrace(sel) | Partial trace returning components selected using the 'sel' parameter.
Q.proj() | Forms a projector operator from a given ket or bra vector.
Q.transform(inpt) | A basis transformation defined by the matrix or list of kets 'inpt'.

The fundamental explanations behind these functions, operators, and states, and more, can be found on the official QuTip webpage.

RStudio-JupyterHub

We offer the possibility of running RStudio instances on the interactive partitions of our HPC clusters, through the JupyterHub (also known as JupyterHPC) container platform. The advantages of this approach include more flexible resource allocation, access to your usual files in your home folder, and the possibility of rapidly creating new RStudio containers tailored to your specific requirements.

For calculations that need compute resources over a long duration, please submit a batch job to the appropriate Slurm partition instead of using the RStudio instance.

Starting your RStudio-JupyterHub instance

Project portal, SCC and NHR users: https://jupyter.hpc.gwdg.de

  1. Go to https://jupyter.hpc.gwdg.de and log in with your usual account.
  2. “Start my server”, if the button appears.
  3. Select the appropriate entry from the “HPC Project (Username)” dropdown. Select the Project Portal username (u12345) corresponding to your project.
  4. Select the “Jupyter” and then “RStudio” cards, and “CPU” as the HPC Device.
  5. Under Advanced, set a reasonable amount of resources to use; the defaults might be far too high for simple R jobs! Reasonable defaults for simple jobs are: 1-2 CPUs, 1-2 GB of RAM/Memory, max. 8 hours runtime. If you know that you will need it, you can request more CPUs or RAM/Memory.

In the end, your configuration should look similar to:

[Screenshots of an example configuration]

These containers start up as jobs in an interactive cluster partition, and so will expire after the time given in the initial options (as will any R jobs that you have left running). The maximum allowed time is currently 8 hours. If your jobs require longer running times, please let us know.

  1. Click on start server, spawning might take 1-2 minutes (if the server fails to start, it will time out on its own! Don’t refresh the page!)

  2. Once your server starts, you will be directly logged into an RStudio session.

  3. If you experience problems starting your server, please provide any errors shown by your browser, as well as the contents of ~/current.jupyterhub.notebook.log (don’t start another notebook or it will overwrite this file!). A common cause of failure to spawn is running out of disk space quota, so please first check that you still have space for new files!

Stopping your server/notebook/RStudio instance

  1. In RStudio, press the red button up and to the right. (Other equivalent buttons under File and Session do not work for the time being)
  2. This should return you to the Jupyter GUI. Click on Stop My Server, you might need to wait a minute or two.


Of course you can also just let your session expire after the previously given runtime.

Accessing data from your legacy account

Data from the old RStudio instance went directly to your legacy user’s home folder. This home folder is not directly available from the new RStudio instance which uses queues on the Emmy and Grete islands (see HOME folder documentation for more information on this). There are two options here:

Option 1: If you do not have too many files or too large files, the best option is to copy over data from your legacy account to your project account. See the Data Migration Guide for details on how to do this. In short, you will have to:

  • Check that your legacy user and your new project user share a group (which they should if everything is set up correctly).
  • Use chgrp and chmod on the folders of your legacy account so they can be accessed by other members of your group (in this case, your project account)
  • Move or copy the folders from the legacy home to the project account folder.
  • This needs to be done while logged in to the SCC cluster (where both your legacy home and your project home are available) via SSH.

Option 2: If you have too many files to transfer everything to the quota of your project user, you can work by going through SCRATCH:

  • On your legacy account, on the SCC:

    • Copy your relevant data to /scratch-scc/users/USERNAME. Data in scratch does not count against your disk quota.
    • Please take into account that scratch should be used for non-permanent data only. Only use scratch to store data that you can easily recreate. You should always keep a copy of final results and input files that are difficult to reconstruct in your home folders.
    • Make the relevant changes to the permissions of your scratch folder, as explained in Data Migration Guide and Option 1.
  • On your RStudio instance:

    • You can now access the data at /scratch-scc/users/USERNAME.
    • Tell R to process and load your data from scratch, and output your results to your project user home folder.
    • If you know how to work with soft/symbolic links, you can create one to the scratch location for convenience.
    • Once again, only use scratch for temporary files and data! Anything that is difficult to reconstruct should live in your home folders!
Warning

Access to the data in the home folders of legacy accounts will be slowly phased out as time goes on, and become more difficult to access. Dedicated transfer nodes will be provided, but will make accessing your old data more complicated. Please consider Options 1 and 2 as temporary solutions to help you in the transition. You should fully migrate your work to a project user as soon as possible!

Installing packages

The RStudio containers already contain a large number of the more commonly requested R packages. If you need other packages, install them the usual way from inside the RStudio instance in the container. They will be installed to your home folder, and be available whenever you restart the container.

Newer R version

If you require a newer R version for your RStudio instance due to some specific packages, let us know and we can build an updated container for you.

Retrieving your old RStudio packages

NOTE: This will end up installing A LOT of packages, since it will also reinstall any packages that might be slightly newer than the ones already available in the container. We recommend picking only the libraries you actually work with instead of this brute-force approach.

  1. On your old or personal RStudio instance, go to the R tab and:
ip <- installed.packages()[,1]
write(ip,"rpackages_in_4.2.0.txt")
  2. Copy the created file to the cluster corresponding to your account. In the new RStudio instance now do:
ip <- readLines("rpackages_in_4.2.0.txt")
install.packages(ip)
  3. Some packages might have been installed through the R package BiocManager, in which case:
ip <- readLines("rpackages_in_4.2.0.txt")
BiocManager::install(ip)

More information on JupyterHub and Containers

Creating containers for JupyterHub, with a couple of example container definition files.

Using apptainer (to create and test new containers). Notice you need to run apptainer from inside a Slurm job! Ideally use an interactive job in an interactive queue for this purpose.

Advanced: Testing the RStudio container & using it for Slurm jobs

If you want to test the environment of the RStudio container without the burden/extra environment of Jupyter and RStudio, you can run the container directly. You can also use this approach to start up and use the container in batch (that is, non-interactive) mode.

  1. Start an interactive (or batch) job.
  2. Load the apptainer module.
  3. apptainer run container.sif will “log into” the container.

You can also build your own container from the examples on the JupyterHub page, or follow this recipe/.def file used for the RStudio containers (it might be out of date! No guarantees it will work correctly). The recipe might take about an hour to build, and the resulting container file will be a couple of GB in size:

Bootstrap: docker
#From: condaforge/miniforge3
From: ubuntu:jammy
%post
    export DEBIAN_FRONTEND=noninteractive
    apt update
    apt upgrade -y
    # Install Julia
    # Not available in 22.04 repos
    # apt install -y julia
    # echo 'ENV["HTTP_PROXY"] = "http://www-cache.gwdg.de:3128"' >> /etc/julia/startup.jl
    # echo 'ENV["HTTPS_PROXY"] = "http://www-cache.gwdg.de:3128"' >> /etc/julia/startup.jl

    ##################
    # R and packages #
    ##################
    apt install -y --no-install-recommends software-properties-common dirmngr
    apt install -y wget curl libcurl4-openssl-dev git-all
    wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
    add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"
    #add-apt-repository ppa:c2d4u.team/c2d4u4.0+
    apt update

    apt install -y \
        r-base \
        r-base-dev \
        r-cran-caret \
        r-cran-crayon \
        r-cran-devtools \
        r-cran-forecast \
        r-cran-hexbin \
        r-cran-htmltools \
        r-cran-htmlwidgets \
        r-cran-plyr \
        r-cran-randomforest \
        r-cran-rcurl \
        r-cran-reshape2 \
        r-cran-rmarkdown \
        r-cran-rodbc \
        r-cran-rsqlite \
        r-cran-shiny \
        r-cran-tidyverse \
        r-cran-rcpp \
        libfftw3-3 libfftw3-dev libgdal-dev

    apt install -y \
        r-bioc-annotationdbi r-cran-bh r-bioc-biobase r-bioc-biocfilecache r-bioc-biocgenerics r-bioc-biocio \
        r-cran-biocmanager r-cran-biocmanager r-bioc-biocneighbors r-bioc-biocparallel r-bioc-biocsingular r-bioc-biocversion \
        r-bioc-biostrings r-cran-cairo r-bioc-complexheatmap r-cran-dbi r-cran-ddrtree r-bioc-deseq2 \
        r-bioc-delayedarray r-bioc-delayedmatrixstats r-cran-fnn r-cran-formula r-bioc-geoquery r-bioc-go.db \
        r-bioc-gosemsim r-bioc-genomeinfodb r-bioc-genomeinfodbdata r-bioc-genomicalignments r-bioc-genomicfeatures r-bioc-genomicranges \
        r-cran-getoptlong r-cran-globaloptions r-bioc-hdf5array r-bioc-hsmmsinglecell r-cran-hmisc r-bioc-iranges \
        r-bioc-keggrest r-cran-kernsmooth r-cran-mass r-cran-matrix r-bioc-matrixgenerics r-cran-nmf \
        r-cran-r.cache r-cran-r.methodss3 r-cran-r.oo r-cran-r.utils r-cran-r6 r-cran-rann \
        r-bioc-rbgl r-cran-rcolorbrewer r-cran-rcurl r-cran-rmysql r-cran-rocr r-cran-rsqlite \
        r-cran-rspectra r-cran-runit r-cran-rcpp r-cran-rcppannoy r-cran-rcpparmadillo r-cran-rcppeigen \
        r-cran-rcpphnsw r-cran-rcppparallel r-cran-rcppprogress r-cran-rcpptoml r-bioc-residualmatrix r-bioc-rhdf5lib \
        r-bioc-rhtslib r-bioc-rsamtools r-cran-rserve r-cran-rtsne r-bioc-s4vectors r-bioc-scaledmatrix \
        r-cran-seurat r-cran-seuratobject r-bioc-singlecellexperiment r-cran-sparsem r-cran-stanheaders r-bioc-summarizedexperiment \
        r-cran-v8 r-cran-vgam r-cran-venndiagram r-cran-xml r-bioc-xvector r-cran-abind \
        r-cran-acepack r-bioc-affxparser r-bioc-affy r-bioc-affyio r-bioc-annotate r-cran-ape \
        r-cran-askpass r-cran-assertthat r-cran-backports r-cran-base64enc r-bioc-beachmat r-cran-beeswarm \
        r-cran-bibtex r-cran-bindr r-cran-bindrcpp r-bioc-biocviews r-bioc-biomart r-cran-bit \
        r-cran-bit64 r-cran-bitops r-cran-blob r-bioc-bluster r-cran-boot r-cran-brew \
        r-cran-brio r-cran-broom r-cran-catools r-cran-cachem r-cran-callr r-cran-car \
        r-cran-cardata r-cran-cellranger r-cran-checkmate r-cran-circlize r-cran-class r-cran-classint \
        r-cran-cli r-cran-clipr r-cran-clue r-cran-cluster r-cran-coda r-cran-codetools \
        r-cran-colorspace r-cran-combinat r-cran-commonmark r-cran-corpcor r-cran-corrplot r-cran-covr \
        r-cran-cowplot r-cran-cpp11 r-cran-crayon r-cran-credentials r-cran-crosstalk r-cran-curl \
        r-cran-data.table r-cran-dbplyr r-cran-deldir r-cran-desc r-cran-devtools r-cran-dichromat \
        r-cran-diffobj r-cran-digest r-cran-doparallel r-cran-dorng r-cran-docopt r-cran-downlit \
        r-cran-downloader r-cran-dplyr r-cran-dqrng r-cran-dtplyr r-bioc-edger r-cran-ellipse \
        r-cran-ellipsis r-cran-evaluate r-cran-expm r-cran-fansi r-cran-farver r-cran-fastica \
        r-cran-fastmap r-cran-fastmatch r-cran-ff r-cran-fitdistrplus r-cran-flexmix r-cran-forcats \
        r-cran-foreach r-cran-foreign r-cran-formatr r-cran-fs r-cran-furrr r-cran-futile.logger \
        r-cran-futile.options r-cran-future r-cran-future.apply r-cran-gargle r-cran-gdata r-bioc-genefilter \
        r-bioc-geneplotter r-cran-generics r-cran-gert r-cran-getopt r-cran-ggalluvial r-cran-ggbeeswarm \
        r-cran-ggforce r-cran-ggplot2 r-cran-ggpubr r-cran-ggraph r-cran-ggrepel r-cran-ggridges \
        r-cran-ggsci r-cran-ggsignif r-cran-gh r-cran-gitcreds r-bioc-glmgampoi r-cran-globals \
        r-cran-glue r-cran-goftest r-cran-googledrive r-cran-googlesheets4 r-cran-gplots r-bioc-graph \
        r-cran-graphlayouts r-cran-gridbase r-cran-gridextra r-cran-gridgraphics r-cran-gtable r-cran-gtools \
        r-cran-haven r-cran-hdf5r r-cran-here r-cran-highr r-cran-hms r-cran-htmltable \
        r-cran-htmltools r-cran-htmlwidgets r-cran-httpuv r-cran-httr r-cran-ica r-cran-ids \
        r-cran-igraph r-cran-ini r-cran-inline r-cran-irlba r-cran-isoband r-cran-iterators \
        r-cran-itertools r-cran-jpeg r-cran-jquerylib r-cran-jsonlite r-cran-knitr r-cran-labeling \
        r-cran-lambda.r r-cran-later r-cran-lattice r-cran-latticeextra r-cran-lazyeval r-cran-leiden \
        r-cran-lifecycle r-bioc-limma r-cran-listenv r-cran-lme4 r-cran-lmtest r-cran-locfit \
        r-cran-loo r-cran-lubridate r-cran-magrittr r-bioc-makecdfenv r-cran-markdown r-cran-matrixstats \
        r-cran-mclust r-cran-memoise r-bioc-metapod r-cran-mgcv r-cran-mime r-cran-miniui \
        r-cran-minqa r-cran-mnormt r-cran-modelr r-cran-modeltools r-bioc-monocle r-bioc-multtest \
        r-cran-munsell r-cran-network r-cran-nleqslv r-cran-nlme r-cran-nloptr r-cran-nnet \
        r-cran-numderiv r-bioc-oligo r-bioc-oligoclasses r-cran-openssl r-cran-pander r-cran-parallelly \
        r-cran-patchwork r-cran-pbapply r-cran-pbkrtest r-cran-pbmcapply r-bioc-pcamethods r-cran-pheatmap \
        r-cran-pillar r-cran-pkgbuild r-cran-pkgconfig r-cran-pkgload r-cran-pkgmaker r-cran-plogr \
        r-cran-plotly r-cran-plyr r-cran-png r-cran-polyclip r-cran-polynom r-cran-praise \
        r-bioc-preprocesscore r-cran-prettyunits r-cran-processx r-cran-progress r-cran-progressr r-cran-promises \
        r-cran-proto r-cran-proxy r-cran-ps r-cran-pscl r-cran-psych r-cran-purrr \
        r-cran-qlcmatrix r-cran-quadprog r-cran-quantreg r-bioc-qvalue r-cran-ragg r-cran-randomforest \
        r-cran-rappdirs r-cran-raster r-cran-rcmdcheck r-cran-readr r-cran-readxl r-cran-registry \
        r-cran-rematch r-cran-rematch2 r-cran-remotes r-cran-reprex r-cran-reshape r-cran-reshape2 \
        r-cran-restfulr r-cran-reticulate r-cran-rex r-bioc-rhdf5 r-bioc-rhdf5filters r-cran-rjags \
        r-cran-rjson r-cran-rlang r-cran-rmarkdown r-cran-rngtools r-cran-roxygen2 r-cran-rpart \
        r-cran-rprojroot r-cran-rsample r-cran-rstan r-cran-rstatix r-cran-rstudioapi r-cran-rsvd \
        r-bioc-rtracklayer r-cran-rversions r-cran-rvest r-cran-s2 r-cran-sandwich r-cran-sass \
        r-cran-scales r-bioc-scater r-cran-scattermore r-bioc-scran r-cran-sctransform r-bioc-scuttle \
        r-cran-selectr r-cran-sessioninfo r-cran-sf r-cran-sfsmisc r-cran-shape r-cran-shiny \
        r-cran-sitmo r-cran-slam r-cran-slider r-cran-sna r-cran-snow r-cran-sourcetools \
        r-cran-sp r-cran-spdata r-bioc-sparsematrixstats r-cran-sparsesvd r-cran-spatial r-cran-spatstat.data \
        r-cran-spatstat.geom r-cran-spatstat.random r-cran-spatstat.sparse r-cran-spatstat.utils r-cran-spdep r-cran-statmod \
        r-cran-statnet.common r-cran-stringi r-cran-stringr r-cran-survival r-bioc-sva r-cran-svglite \
        r-cran-sys r-cran-systemfonts r-cran-tensor r-cran-terra r-cran-testthat r-cran-textshaping \
        r-cran-tibble r-cran-tidygraph r-cran-tidyr r-cran-tidyselect r-cran-tidyverse r-cran-timedate \
        r-cran-timeseries r-cran-tinytex r-cran-tweenr r-cran-tzdb r-cran-udunits2 r-cran-units \
        r-cran-usethis r-cran-utf8 r-cran-uuid r-cran-uwot r-cran-vctrs r-cran-vipor \
        r-cran-viridis r-cran-viridislite r-cran-vroom r-cran-waldo r-cran-warp r-cran-webshot \
        r-cran-whisker r-cran-withr r-cran-wk r-cran-xfun r-cran-xml2 r-cran-xopen \
        r-cran-xtable r-cran-yaml r-cran-zeallot r-cran-zip r-bioc-zlibbioc r-cran-zoo

    # rcpp: solves some issues with -Wformat errors when installing various packages under R4.4, the package manager version of RCpp is not new enough
    echo ""
    echo "#######################################"
    echo "# Starting installation of R packages #"
    echo "#######################################"
    echo ""
    echo 'install.packages("Rcpp")' >> packages.R
    echo 'BiocManager::install(version = "3.19", update=FALSE, ask=FALSE)' >> packages.R
    echo 'ip <-c("eseis","CellChat","ClusterProfiler","RCppML","SeuratData","SeuratDisk","SeuratWrappers","fgsea")' >> packages.R
    echo 'install.packages(ip, Ncpus=4)' >> packages.R
    echo 'BiocManager::install(ip, update=FALSE, ask=FALSE)' >> packages.R

    # Run installation and divert output to dev/null, there is A LOT of output
    # Comment this out if you are just testing stuff cos it is going to take a while
    Rscript packages.R 2>&1 >/dev/null
    rm packages.R

    echo ""
    echo "########################################"
    echo "# Done with installation of R packages #"
    echo "########################################"
    echo ""


    ###########
    # RStudio #
    ###########

    apt install -y libclang-dev lsb-release psmisc sudo libssl-dev
    ubuntu_release=$(lsb_release --codename --short)
    wget https://download2.rstudio.org/server/${ubuntu_release}/amd64/rstudio-server-2023.12.1-402-amd64.deb
    dpkg --install rstudio-server-2023.12.1-402-amd64.deb
    rm rstudio-server-2023.12.1-402-amd64.deb
    echo 'ftp_proxy=http://www-cache.gwdg.de:3128' >> /usr/lib/R/etc/Renviron.site
    echo 'https_proxy=http://www-cache.gwdg.de:3128' >> /usr/lib/R/etc/Renviron.site
    echo 'http_proxy=http://www-cache.gwdg.de:3128' >> /usr/lib/R/etc/Renviron.site
    echo '' >> /usr/lib/R/etc/Renviron.site

    # Other stuff
    apt install -y vim
    apt install -y default-jre # required for gipptools

    #########
    # Conda #
    #########
    # Among other stuff, installs Jupyter env.

    # Install miniconda to /miniconda
    condash="Miniconda3-py310_24.5.0-0-Linux-x86_64.sh"
    curl -LO "http://repo.continuum.io/miniconda/${condash}"
    bash ${condash} -p /opt/conda -b
    rm ${condash}
    PATH=/opt/conda/bin:${PATH}
    conda update -y conda
    conda init

    conda install --quiet --yes -c conda-forge \
        'ipyparallel' \
        'jupyter-rsession-proxy' \
        'notebook' \
        'jupyterhub==2.3.1' \
        'jupyterlab'

    conda install --quiet --yes -c conda-forge \
        dgl \
        igraph \
        keras \
        pandas \
        pydot \
        scikit-learn \
        scipy \
        seaborn


%environment
    # required so JupyterHub can find jupyterhub-singleuser
    export PATH=$PATH:/opt/conda/bin
                                                  

Troubleshooting & FAQ

  • Please adjust your utilized resources to reasonable numbers, since you will be sharing the interactive partition nodes with others (1-2 CPUs, 1-2 GBs of RAM/Memory, max. 8 hours runtime).
  • Your usual home folder files should be accessible from the container.
  • You can install packages as usual with install.packages, or if you think it will be a popular package, request a centralized installation.
  • If you are experiencing strange issues, check you do not have leftover configuration files from other RStudio instances, e.g. ~/.R/Makevars or an old .RData file, in your home folder.
  • External modules (module load) are NOT accessible.

Known Issues

  • $HOME might not be set up correctly in the Terminal tab (it is correct from the R tab in RStudio), so you might want to change it if some of your scripts depend on it. Running the following in RStudio's Terminal tab might fix it:
export HOME=/usr/users/$USER
  • You can also ignore any LC_* error messages related to locale configuration.
  • Function help with F1 might show a “Firefox can’t open embedded page” error.

SAIA

SAIA is the Scalable Artificial Intelligence (AI) Accelerator that hosts our AI services. Such services include Chat AI and CoCo AI, with more to be added soon. SAIA API (application programming interface) keys can be requested and used to access the services from within your code. API keys are not necessary to use the Chat AI web interface.

[Image: SAIA workflow]

API Request

If a user has an API key, they can use the available models from within their terminal or Python scripts. To get an API key, go to the KISSKI LLM Service page and click on “Book”. There you will find a form to fill out with your credentials and your intended use of the API key. Please use the same email address as is assigned to your AcademicCloud account. Once received, DO NOT share your API key with other users!

[Image: API booking]

API Usage

The API service is compatible with the OpenAI API standard. We provide the following endpoints:

  • /chat/completions
  • /completions
  • /embeddings
  • /models
  • /documents

API Minimal Example

You can use your API key to access Chat AI directly from your terminal. Here is an example of how to do text completion with the API.

curl -i -X POST \
  --url https://chat-ai.academiccloud.de/v1/chat/completions \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <api_key>' \
  --header 'Content-Type: application/json'\
  --data '{                    
  "model": "meta-llama-3.1-8b-instruct",
  "messages":[{"role":"system","content":"You are an assistant."},{"role":"user","content":"What is the weather today?"}],
  "max_tokens": 7,
  "temperature": 0.5,
  "top_p": 0.5
}'

Ensure to replace <api_key> with your own API key.

API Model Names

For more information on the respective models see the model list.

Model Name | Capabilities
meta-llama-3.1-8b-instruct | text
openai-gpt-oss-120b | text
meta-llama-3.1-8b-rag | text, arcana
llama-3.1-sauerkrautlm-70b-instruct | text, arcana
llama-3.3-70b-instruct | text
gemma-3-27b-it | text, image
medgemma-27b-it | text, image
teuken-7b-instruct-research | text
mistral-large-instruct | text
qwen3-32b | text
qwen3-235b-a22b | reasoning
qwen2.5-coder-32b-instruct | text, code
codestral-22b | text, code
internvl2.5-8b | text, image
qwen2.5-vl-72b-instruct | text, image
qwq-32b | reasoning
deepseek-r1 | reasoning
e5-mistral-7b-instruct | embeddings
multilingual-e5-large-instruct | embeddings
qwen3-embedding-4b | embeddings

A complete up-to-date list of available models can be retrieved via the following command:

curl -X GET \
  --url https://chat-ai.academiccloud.de/v1/models \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <api_key>' \
  --header 'Content-Type: application/json'
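
The same list can also be obtained with the OpenAI Python client; a short sketch, assuming the openai package is installed and <api_key> is replaced with your key:

from openai import OpenAI

client = OpenAI(
    api_key = "<api_key>",  # Replace with your API key
    base_url = "https://chat-ai.academiccloud.de/v1"
)

# List all models currently served by the API
for model in client.models.list():
    print(model.id)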

API Usage

The OpenAI (external) models are not generally available for API usage. For configuring your own requests in greater detail, such as setting frequency_penalty, seed, max_tokens and more, refer to the OpenAI API reference page.

Chat

It is possible to import an entire conversation into your command. This conversation can be from a previous session with the same model or another, or between you and a friend/colleague if you would like to ask them more questions (just be sure to update your system prompt to say “You are a friend/colleague trying to explain something you said that was confusing”).

curl -i -N -X POST \
  --url https://chat-ai.academiccloud.de/v1/chat/completions \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <api_key>' \
  --header 'Content-Type: application/json'\
  --data '{                     
  "model": "meta-llama-3.1-8b-instruct",
  "messages": [{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"How tall is the Eiffel tower?"},{"role":"assistant","content":"The Eiffel Tower stands at a height of 324 meters (1,063 feet) above ground level. However, if you include the radio antenna on top, the total height is 330 meters (1,083 feet)."},{"role":"user","content":"Are there restaurants?"}],
  "temperature": 0
}'

For ease of usage, you can access the Chat AI models by executing a Python file, for example, by pasting the below code into the file.

from openai import OpenAI
  
# API configuration
api_key = '<api_key>' # Replace with your API key
base_url = "https://chat-ai.academiccloud.de/v1"
model = "meta-llama-3.1-8b-instruct" # Choose any available model
  
# Start OpenAI client
client = OpenAI(
    api_key = api_key,
    base_url = base_url
)
  
# Get response
chat_completion = client.chat.completions.create(
        messages=[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"How tall is the Eiffel tower?"},{"role":"assistant","content":"The Eiffel Tower stands at a height of 324 meters (1,063 feet) above ground level. However, if you include the radio antenna on top, the total height is 330 meters (1,083 feet)."},{"role":"user","content":"Are there restaurants?"}],
        model= model,
    )
  
# Print full response as JSON
print(chat_completion) # You can extract the response text from the JSON object

In certain cases, a long response can be expected from the model, which may take a long time with the above method, since the entire response is generated first and then printed to the screen. Streaming can be used instead to retrieve the response as it is being generated.

from openai import OpenAI
 
# API configuration
api_key = '<api_key>' # Replace with your API key
base_url = "https://chat-ai.academiccloud.de/v1"
model = "meta-llama-3.1-8b-instruct" # Choose any available model
 
# Start OpenAI client
client = OpenAI(
    api_key = api_key,
    base_url = base_url
)
 
# Get stream
stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Name the capital city of each country on earth, and describe its main attraction",
        }
    ],
    model = model ,
    stream = True
)
 
# Print out the response
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

If you use Visual Studio Code or a JetBrains IDE, the recommended way to get the most out of your API key, particularly for code completion, is to install the Continue plugin and configure it accordingly. Refer to CoCo AI for further details.

Azure API

Some of our customers may come into contact with the Azure OpenAI API. This API is compatible with the OpenAI API, barring minor differences in the JSON responses and the endpoint handling. The official OpenAI Python client offers an AzureOpenAI client to account for these differences. In SAIA, as the external, non-open-weight models are obtained from Microsoft Azure, we created a translation layer to ensure OpenAI compatibility of the Azure models.

List of known differences:

  • Addition of content_filter_results in the responses of Azure models.
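
If you want to see this field, one option is to inspect the raw JSON response, for example with the requests library. This is only a sketch; it assumes your API key has access to one of the external Azure-hosted models, and <azure_model> is a placeholder for such a model's name:

import requests

# Send a chat request and inspect the raw JSON, including Azure-specific fields
response = requests.post(
    "https://chat-ai.academiccloud.de/v1/chat/completions",
    headers={
        "Authorization": "Bearer <api_key>",  # Replace with your API key
        "Content-Type": "application/json",
    },
    json={
        "model": "<azure_model>",  # Placeholder for an Azure-hosted external model
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
data = response.json()

# Azure-hosted models may include content filter results per choice
for choice in data.get("choices", []):
    print(choice.get("content_filter_results"))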

Image

The API specification is compatible with the OpenAI Image API. However, fetching images from the web is not supported; images must be uploaded as part of the request.

See the following minimal example in Python.

import base64
from openai import OpenAI

# API configuration
api_key = '<api_key>' # Replace with your API key
base_url = "https://chat-ai.academiccloud.de/v1"
model = "internvl2.5-8b" # Choose any available model

# Start OpenAI client
client = OpenAI(
    api_key = api_key,
    base_url = base_url,
)

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "test-image.png"

# Getting the base64 string
base64_image = encode_image(image_path)

response = client.chat.completions.create(
  model = model,
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is in this image?",
        },
        {
          "type": "image_url",
          "image_url": {
            "url":  f"data:image/jpeg;base64,{base64_image}"
          },
        },
      ],
    }
  ],
)
print(response.choices[0])

Embeddings

Embeddings are only available via the API and are compatible with the OpenAI Embeddings API.

See the following minimal example.

curl https://chat-ai.academiccloud.de/v1/embeddings \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "e5-mistral-7b-instruct",
    "encoding_format": "float"
  }'
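
The same request can be made with the OpenAI Python client; a minimal sketch:

from openai import OpenAI

client = OpenAI(
    api_key="<api_key>",  # Replace with your API key
    base_url="https://chat-ai.academiccloud.de/v1",
)

# Request an embedding vector for a single input string
result = client.embeddings.create(
    model="e5-mistral-7b-instruct",
    input="The food was delicious and the waiter...",
    encoding_format="float",
)
print(len(result.data[0].embedding))  # Dimensionality of the returned vector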

See the following code example for developing RAG applications with llamaindex: https://gitlab-ce.gwdg.de/hpc-team-public/chat-ai-llamaindex-examples

RAG/Arcanas

Arcanas are also accessible via the API interface. A minimal example using curl:

curl -i -X POST \
  --url https://chat-ai.academiccloud.de/v1/chat/completions \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <api_key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "meta-llama-3.1-8b-rag",
  "messages":[{"role":"system","content":"You are an assistant."},{"role":"user","content":"What is the weather today?"}],
  "arcana" : {
      "id": "<the Arcana ID>"
      },
  "temperature": 0.0,
  "top_p": 0.05
}'
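
The arcana field is not part of the standard OpenAI API, but the OpenAI Python client can pass it through via its extra_body argument. A minimal sketch, using the same placeholders as the curl example above:

from openai import OpenAI

client = OpenAI(
    api_key="<api_key>",  # Replace with your API key
    base_url="https://chat-ai.academiccloud.de/v1",
)

# Pass the non-standard "arcana" field through to the backend via extra_body
chat_completion = client.chat.completions.create(
    model="meta-llama-3.1-8b-rag",
    messages=[
        {"role": "system", "content": "You are an assistant."},
        {"role": "user", "content": "What is the weather today?"},
    ],
    temperature=0.0,
    top_p=0.05,
    extra_body={"arcana": {"id": "<the Arcana ID>"}},
)
print(chat_completion.choices[0].message.content)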

Docling

SAIA provides Docling as a service via the API interface on this endpoint:

https://chat-ai.academiccloud.de/v1/documents

A minimal example using curl is:

curl -X POST "https://chat-ai.academiccloud.de/v1/documents/convert" \
    -H "accept: application/json" \
    -H 'Authorization: Bearer <api_key>' \
    -H "Content-Type: multipart/form-data" \
    -F "document=@/path/to/your/file.pdf"

The result is a JSON response like:

{
  "response_type": "MARKDOWN",
  "filename": "example_document",
  "images": [
    {
      "type": "picture",
      "filename": "image1.png",
      "image": "data:image/png;base64, xxxxxxx..."
    },
    {
      "type": "table",
      "filename": "table1.png",
      "image": "data:image/png;base64, xxxxxxx..."
    }
  ],
  "markdown": "#Your Markdown File",
}

To extract only the “markdown” field from the response, you can use the jq tool on the command line (it can be installed with sudo apt install jq). You can also store the output in a file by appending > <output-file-name> to the command.

Here is an example to convert a PDF file to markdown and store it in output.md:

curl -X POST "https://chat-ai.academiccloud.de/v1/documents/convert" \
    -H "accept: application/json" \
    -H 'Authorization: Bearer <api_key>' \
    -H "Content-Type: multipart/form-data" \
    -F "document=@/path/to/your/file.pdf" \
    | jq -r '.markdown' \
    > output.md
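
A comparable sketch in Python, using the requests library (the file path is a placeholder):

import requests

# Convert a PDF via the Docling endpoint and save the Markdown output
with open("/path/to/your/file.pdf", "rb") as f:
    response = requests.post(
        "https://chat-ai.academiccloud.de/v1/documents/convert",
        headers={"Authorization": "Bearer <api_key>"},  # Replace with your API key
        files={"document": f},  # requests sets the multipart Content-Type automatically
    )

# Write the "markdown" field of the JSON response to output.md
with open("output.md", "w", encoding="utf-8") as out:
    out.write(response.json()["markdown"])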

You can use advanced settings in your request by adding query parameters:

Parameter                  Values                          Description
response_type              markdown, html, json or tokens  The output file type
extract_tables_as_images   true or false                   Whether tables should be returned as images
image_resolution_scale     1, 2, 3, 4                      Scaling factor for image resolution

For example, in order to extract tables as images, scale image resolution by 4, and convert to HTML, you can call:

https://chat-ai.academiccloud.de/v1/documents/convert?response_type=html&extract_tables_as_images=true&image_resolution_scale=4

which will result in an output like:

{
  "response_type": "HTML",
  "filename": "example_document",
  "images": [
    ...
  ],
  "html": "#Your HTML data",
}

Developer reference

The GitHub repositories SAIA-Hub, SAIA-HPC, and Chat AI provide all the components for the architecture in the diagram above.

Further services

If you have more questions, feel free to contact us at support@gwdg.de.

Secure HPC

What is Secure HPC?

Secure HPC is an HPC tool designed to provide robust security for sensitive computational tasks on shared HPC systems. It addresses the need for secure data processing in environments where traditional HPC systems, optimized for performance, lack adequate security measures to protect highly sensitive data. Secure HPC ensures data integrity and confidentiality, allowing researchers to safely transfer, store, and analyze sensitive information.

Diagram of a High Performance Computing (HPC) system. A secure client connects via SSH to a frontend server, which then communicates with SLURM for job scheduling, secured compute nodes, and a parallel file system for data storage. A VAULT icon indicates security measures are in place.

Secure HPC integrates several components, and a few of them are highlighted in the above image. Vault serves as a key management system (KMS) to securely store and share encryption keys between the HPC system and the secure client. The HPC frontend provides a user interface for job submission and management. Slurm, the workload manager, schedules and manages the encrypted jobs across the secured compute nodes. These compute nodes are isolated to prevent unauthorized access and ensure data integrity during processing. Additionally, the parallel file system offers high-speed storage while keeping the data encrypted. The secure client, which can be a PC or a Virtual Machine (VM) belonging to the user, is assumed to be secure and is used to interact with the Secure HPC environment, ensuring that data remains protected from the user’s side as well. Together, these components create a secure environment that safeguards sensitive data throughout the computation lifecycle.

Purpose and Importance of Secure HPC

Secure HPC is designed to facilitate the secure processing of sensitive data on HPC systems. Its primary goals are to ensure data integrity, confidentiality, and compliance with data protection regulations such as GDPR and HIPAA while maintaining the high computational performance needed for scientific research.

Key Features

  1. End-to-End Data Encryption:
    • Data is encrypted both at rest and in transit using advanced encryption techniques.
    • Encrypted LUKS containers and Singularity/Apptainer containers ensure secure storage and computation.
  2. Role-Based Access Control:
    • Secure HPC employs strict access control mechanisms to limit data access to authorized users only.
    • Role-based permissions help manage user access efficiently.
  3. Secure Container Execution:
    • Utilizes Singularity/Apptainer containers to isolate computational tasks, reducing the risk of unauthorized data access.
    • Containers can be encrypted and securely executed on HPC nodes.
  4. Key Management System (Vault):
    • Integrated with Vault for secure key management, ensuring that encryption keys are stored and managed securely.
    • Automated key retrieval and use during job execution.
  5. Automated Workflow:
    • Secure workflow for data transfer, storage, and processing, minimizing manual intervention and reducing the risk of human error.
    • Detailed step-by-step execution processes to guide users through secure job submission and data retrieval.

Benefits

  1. Enhanced Security:
    • Protects sensitive data from unauthorized access and breaches, ensuring data integrity and confidentiality.
    • Reduces the attack surface by minimizing the need for direct interaction with the host operating system.
  2. Regulatory Compliance:
    • Helps researchers comply with data protection regulations, enabling them to securely process and share sensitive data.
    • Ensures that sensitive data is handled according to legal and institutional requirements.
  3. Maintained Performance:
    • Provides robust security without compromising the high performance of HPC systems.
    • Efficient handling of encrypted data and containers to maintain computational speed and resource utilization.
  4. User Trust and Collaboration:
    • Builds trust among researchers by ensuring that their data is secure, fostering collaboration and data sharing.
    • Encourages the use of HPC resources for sensitive data processing across different scientific domains.
  5. Future-Proof Security:
    • Adaptable to new security threats and vulnerabilities, providing a future-proof solution for secure data processing.
    • Continuous updates and improvements to keep up with evolving cybersecurity challenges.

Subsections of Secure HPC

Getting Started

Assumption

To ensure that Secure HPC can protect your sensitive data, we start by trusting two main systems:

  1. The HPC System’s Image Server: We assume that the image server, which is part of the HPC system, is secure. It is located in a highly secure area of the HPC system, protected by multiple layers of security, and is only accessible by a few essential services and administrators. This secure location helps us trust that the image server is safe from unauthorized access.
  2. The User’s Personal System (Secure Client): We also assume that your personal system, such as a laptop or workstation, is secure. This is crucial because your data begins its journey on your local system before being sent to Secure HPC.
Warning

It is important to understand that the secure client should be highly trusted by you. If your local system is not secure, your data could be compromised before it even reaches the secure workflow of Secure HPC. This is why we emphasize the term secure client—it signifies that your local system must be safeguarded with utmost care to ensure the overall security of your data.

These assumptions are essential because they ensure that the entire process, from start to finish, is secure. Trust in the system comes from knowing that both the initial and final stages of the process are protected.


Requirements

Minimum Hardware Requirements

  • Processor: 4-core CPU (Intel/AMD)
  • Memory: 8 GB RAM
  • Storage: 100 GB available disk space
  • Network: Internet connection

User Requirements

  • Secure Node from the Admin:

  • Access to HPC System: Access to the HPC system is required. If you don’t have an account, please refer to our Getting An Account page.

  • Initial experience with job submission using Slurm: Users should be familiar with the job submission process and the Slurm workload manager. Please refer to our Slurm documentation for more details.

  • Access to our HashiCorp Vault server: this requirement is fulfilled once you have been in touch with us and we have provided you with a token. The token has to be placed in a specific directory named secret. See the Installation of required software step in the Installation Guidelines.

Preparation of the secure client

Access request to the cluster and KMS server

  • HPC System Access: Ensure that you have an account and the necessary permissions to access the HPC system.
  • Vault Access: Request access to the Vault (Key Management System) for managing encryption keys.

Installation Guidelines

  1. Installation of required software:

    • Git: Version control system for managing code. For installation instructions, visit the Git installation guide.

    • Apptainer (formerly Singularity): Container platform for running applications in isolated environments. For installation instructions, visit the Apptainer installation guide.

    • Hashicorp Vault: Follow the instructions from the official website

    • Cryptsetup: Installation

      • On Debian based OS (Ubuntu, Mint, etc):

        sudo apt-get update
        sudo apt-get install cryptsetup
      • On RHEL based OS (Rocky Linux, Fedora, etc)

        sudo dnf update
        sudo dnf install cryptsetup
    • GPG is available by default on most common Linux distributions

  2. In your home directory, create a folder with a name of your choice. For this example, we are going to use the name client_assets:

    mkdir client_assets
  3. Create a new directory in client_assets called secret:

    cd ~/client_assets
    mkdir secret

    This is where all the keys and Vault tokens are going to be stored.

    Next, copy the <local_uid>.token file you received into it.

    Tip

    Follow these steps if you want to verify that your token is valid:

    1. Set the Vault Server Address: export VAULT_ADDR='https://kms.hpc.gwdg.de:443'

    2. Set the token: export VAULT_TOKEN=$(cat secret/<local_uid>.token)

    3. Check Token Lookup: vault token lookup

      • If the token is valid, you will see output with details about the token (like its policies, expiration, etc.).

      • If the token is invalid, you’ll see an error message. Report it to the administrator to fix it.

  4. For the bash signature, contact the administrator to get a private key, and save it in the secret directory.

  5. Clone the Secure HPC Git Repository: Open a terminal and clone the Secure HPC Git repository into the secure client’s home directory with the following command:

    git clone https://github.com/gwdg/secure-hpc.git
  6. In the home directory, you will find the git repository secure-hpc.

    1. cd secure-hpc/client

    2. ./replace_placeholders.sh <local_uid> <containername> <apptainer_container> <hpc_uid> <client_dir>

      where:

      <local_uid>: The local user ID on the secure client.
      <containername>: Name of the data container.
      <apptainer_container>: Name of the Apptainer container.
      <hpc_uid>: User ID on the HPC system.
      <client_dir>: Directory where client tools and scripts will be prepared.

      This command replaces the placeholders in the template files with the provided arguments and copies all other files and directories from the secure-hpc/client directory to a tools directory inside the directory created in step 2.

Step-by-Step Instructions for Installing Secure HPC:

  1. Configure Secure HPC:

    • Generate GPG Key Pair:

      gpg --full-generate-key

      Follow the prompts to create your key pair.

    • Upload Public Key to Vault: Use the instructions provided by your HPC administrator to upload your public key to Vault.

Configuration Settings:

  • Encryption Configuration: Configure LUKS containers and Singularity/Apptainer containers as per the Secure HPC guidelines.
  • Vault Configuration: Ensure the Vault is properly configured for key management. Follow the instructions provided by your HPC administrator.

Success stories

Automated analysis of MRI images

Scenario: A medical research facility aims to implement an automated analysis system for MRI brain images, which inherently contain identifiable biometrics. Due to the sensitive nature of the data, processing these images must adhere to stringent data protection regulations. Additionally, the considerable computational power required dictates that the analysis is performed in an external data center.

Challenges:

  • Extracting quantifiable and interpretable information from MRI image data within 10 minutes post-examination to support timely clinical decisions.
  • Ensuring the security of highly sensitive personal data while utilizing external computational resources.
  • Coordinating a complex workflow among multiple systems, including MRI scanners, servers, and PACS (Picture Archiving and Communication System), all while complying with the DICOM (Digital Imaging and Communications in Medicine) standard to safeguard patient privacy.

How Secure HPC Solves This:

  • Data Encryption and Access Control: MRI images are routed to a secure intranet node for pseudonymization and pre-processing. Following this, they are analyzed on a secure HPC partition at GWDG, where access is strictly controlled to protect sensitive information.

  • End-to-End Data Protection: The secure HPC framework ensures that third-party access to sensitive personal data is prevented throughout the entire workflow, meeting high data protection requirements crucial for handling medical information.

  • Rapid Processing with Security: The FastSurferCNN AI model is used for the segmentation of the MRI images. By processing only the necessary layers, the model minimizes memory usage while generating reliable insights that can assist medical professionals without compromising data security.

  • Robust System Architecture: The system is designed to ensure seamless interaction among MRI scanners, analysis servers, and PACS under the DICOM standard, fostering a secure operational environment that prioritizes patient confidentiality.

Impact: The project has been well received by both staff and patients, highlighting its potential to transform medical imaging workflows while ensuring the highest standards of patient confidentiality. Through this implementation, the facility successfully conducts automated analysis of MRI images while maintaining the utmost security for sensitive patient data, thus supporting clinical operations effectively and ethically.

User Guide

Basic Concepts

Using Secure HPC

Detailed instructions on how to perform basic and advanced tasks

Examples and use cases

The examples below assume that you already have an Apptainer recipe prepared. If you are not familiar with Apptainer, please refer to the git repository for detailed instructions on how to prepare an Apptainer recipe. You can find the instructions here.

Flowers image classification

Best practices for using Secure HPC

Software Performance Engineering

At GWDG we aim to provide performance engineering services for our customers to help them achieve their goals with a high level of accuracy and efficiency. These services will improve our customers’ computing experience and help them integrate AI solutions into their HPC workloads.

Goals

  • Improve efficiency and productivity in our HPC systems
  • Improve the customer experience
  • Increase competitiveness

Key Components

  • Testing and Measurements
    • Use of software development and analysis tools to test and measure key performance metrics
  • Root cause analysis
    • Use performance tools to analyze performance data and identify possible root causes of performance issues
  • Modeling
    • Characterization of hardware and software features
    • Predictive analytics: Apply machine learning algorithms to analyze performance data and predict possible performance degradation caused by changes in the systems
  • Monitoring
    • Real-time performance monitoring: Use dashboards to identify bottlenecks in utilization of HPC resources
  • Optimization
    • Use performance tools to identify bottlenecks
    • Use performance models to identify optimization options
    • Analyse performance data to explore the optimization space.
    • Use AI-powered tools to optimize resource utilization and power consumption

Performance of LLM Inference

We aim to investigate the overall scalability and memory utilization of LLM inference on GWDG systems and to offer our customers support on performance best practices.

AI powered Performance Engineering (Research options)

We aim to integrate AI technologies and machine learning techniques into our services to improve the performance and efficiency of our software systems.

  • Develop research ideas in collaboration with customers and partner institutions
  • Horovod + Ray

Domains covered

  • All Scientific Domains
    • Physics, Chemistry, Mathematics
  • Applied AI
    • ML, DNN and LLM services
  • High Performance Data Analytics

Tools

Use cases

  • Using ScoreP to instrument NNI - AutoML
  • Characterization of CPU utilization (memory, network, etc) - WLCG
  • Monitoring Energy Consumption of GROMACS in Emmy

Performance Engineering Competition

We plan student competitions and hackathons on performance engineering on a regular basis.

Contact

  • GWDG Academy courses
  • Write a ticket to hpc-support mentioning PE in the subject

Voice-AI

One of KISSKI’s standout offerings is its AI-based transcription and captioning service, Voice-AI. Utilizing High-Performance Computing (HPC) infrastructure, Voice-AI leverages the Whisper (large-v2) model to transcribe audio and generate video captions swiftly. Trained on 680,000 hours of labeled data, Whisper rivals professional human transcribers in performance, offering reliable automatic speech recognition (ASR) and speech translation across various datasets and domains. Users can choose between tasks such as transcription and translation to suit their needs, and notably, this KISSKI service will be available free of charge.

Tip

You need an Academic Cloud account to access the AI Services. Use the federated login or create a new account. Details are on this page.

Service Components

This service is composed of two main parts:

  1. Handling Uploaded Audio: Processes audio files uploaded by users (<500 MB).
  2. Handling Streaming Audio: Captures and processes streaming audio from the browser (this part will be available in the future).

Audio File Transcription/Translation Service:

If you have an AcademicCloud account, the web interface can also easily be reached here.

Web Interface Example

The platform offers intuitive, built-in features designed for seamless audio processing:

  • Input language: Choose the language of the uploaded audio for transcription.
  • Text format: Choose the format of the output, which can be text, SRT, or VTT.
  • Choose file: Upload your audio file, which can be wav, mp3, or mp4.
  • Delete Output: Instantly and permanently remove the transcription result.
  • Light/Dark mode (sun/moon button): Toggle between light and dark mode.
  • Footer: Includes “Privacy Policy”, “Terms of use”, “FAQ”, Contact, and the option to switch between English and German.

Core Capabilities

  • English audio transcription with attached timestamps (depending on the chosen output format).
  • Non-English audio transcription, supporting multiple languages, including German.
  • Audio translation from various languages to English.

How to Use the Service

  • Choose the input audio language.
  • Select the output format (Normal text/SRT/VTT).
  • Upload audio files in various formats such as mp3, mp4, flac, or wav.
  • Choose between transcription or translation action.
  • The output can be downloaded when the transcription is ready.

Streaming Audio Transcription Service (This part will be available in the future):

This browser-based tool provides real-time transcription or English translation during meetings and lectures, enhancing clarity, accessibility, and engagement—especially in noisy or multilingual environments. It supports deaf and hard-of-hearing participants, language learners, and anyone needing better note-taking or content review. No installation is required, and it works in any browser. Transcriptions and summaries are written to a shared Etherpad Lite URL generated at session start, enabling collaborative editing and review. Etherpad Lite is an open-source editor that allows multiple users to work on the same document simultaneously, making communication more inclusive and efficient.

Web Interface Example (Future service)

Intuitive, built-in features include:

  • Start Session: Begins transcription and generates a pad URL.
  • Stop Session: Ends the current session.
  • Mode: Choose between transcription or translation to English.
  • Spoken Language: Defaults to auto-detect, or manually select a language.
  • Subtitle Overlay: Opens a detached window to display subtitles over any webpage.
  • Finalize & Summarize: Generates a summary directly in the pad.
  • Light/Dark Mode: Toggle between light and dark themes (sun/moon icon).
  • Footer: Includes links to Privacy Policy, Terms of Use, Imprint, FAQ, Help, Contact, and language switch (English/German).

Ensuring Privacy and Flexibility

We prioritize security to ensure reliability, regulatory compliance, and business continuity. User privacy is central to our design. Audio and conversation inputs are deleted immediately after transcription or translation.

Exceptions:

  • Audio transcription results: stored on our data mover node and erased after 30 days. Voice AI outputs can also be deleted instantly and permanently via a dedicated delete button.
  • Live transcription results (future feature): stored in MySQL, auto-deleted after 24 hours.
  • Usage Logging: We record request counts and timestamps per user for system monitoring and accounting.

Acknowledgement

Thanks to Jakob Hördt for writing the proxy, Marcel Hellkamp for writing the BBB audio captioning code, Ali Doost Hosseini for the Kong gateway, and Johannes Biermann for technical support.

Author

Narges Lux

Further services

If you have questions, please browse the FAQ first. If you have more specific questions, feel free to contact us at support@gwdg.de.

Subsections of Voice-AI

Voice-AI FAQ

Data Privacy

Are my conversations or usage data used for AI training or similar purposes?

No, your conversations and data are not used to train any AI models.

Are my audio files and conversations stored on your servers at any stage?

No, user audio files and BBB conversations are not stored on our servers. Once your audio is sent and you receive the response, the result is available in your browser. The results of the transcription/translation are saved on our data mover node and erased after 30 days, giving users the opportunity to download their results within 30 days. BBB transcriptions, which are written to Etherpad and use a local MySQL database to save data, will also be kept for 30 days before they are erased (future service).

What data does Voice AI keep when I use the service?

We do not keep any conversations or audio files on our servers. Transcription/translation results are stored on our data mover node for 30 days. BBB transcriptions, which are stored in a local MySQL database, will also be erased after 30 days (future service). We also record some usage statistics to monitor the load on our service and improve the user experience. This includes usernames, timestamps, and the services that were requested.

Converting Larger Audio Files to FLAC Format

If your file is too large, you can use tools like FFmpeg or Audacity to convert it to FLAC, which is a lossless format. For example, using FFmpeg:

ffmpeg -i input.wav -vn -acodec flac output.flac

Availability

My institution is interested in using Voice AI. Can we advertise it to our users? Would you be able to handle an additional load for XXX users?

For large institutions, please contact us directly at info@kisski.de.

Are Voice AI services for free?

Voice AI services that are accessible to a user with an AcademicCloud account are free of charge.

Data Privacy Notice

The following English translation of the “Datenschutzerklärung” is for information purposes only. Only the German version is legally binding.

Responsible for data processing

The responsible party within the meaning of the General Data Protection Regulation and other national data protection laws of the member states as well as other legal data protection provisions is:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Germany
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de
Website: www.gwdg.de

Contact person / Data protection officer

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Germany
E-Mail: support@gwdg.de

General information on data processing

Scope of the processing of personal data

We only process personal data of our users insofar as this is necessary to provide a functional website and our content and services.

As a matter of principle, we only process personal data of our users to the extent that this is necessary to provide a functional website and our content and services. The processing of personal data of our users is regularly carried out only with the consent of the user. An exception applies in those cases in which obtaining prior consent is not possible for actual reasons and the processing of the data is permitted by legal regulations.

Insofar as we obtain consent from the data subject for processing operations involving personal data, Article 6 (1) lit. (a) of the EU General Data Protection Regulation (GDPR) is the legal basis for personal data processing.

When processing personal data that is necessary for the performance of a contract to which the data subject is party, Article 6 (1) lit. (b) GDPR is the legal basis. This also applies to processing operations that are necessary for the performance of pre-contractual measures.

Insofar as the processing of personal data is necessary for compliance with a legal obligation to which our company is subject, Article 6 (1) lit. (c) GDPR is the legal basis.

Where the processing of personal data is necessary in order to protect the vital interests of the data subject or another natural person, the legal basis is Article 6 (1) lit. (d) GDPR.

If the processing is necessary to protect a legitimate interest of our company or a third party and the interests, fundamental rights and freedoms of the data subject do not outweigh the first-mentioned interest, Article 6 (1) lit. (f) GDPR is the legal basis for the processing.

Data protection notice on the use of Voice AI services

Description and scope of data processing

Each time our website is accessed, our system automatically collects data and information from the computer system of the accessing computer.

In order to use the Voice AI services hosted by the GWDG, user input/requests are collected from the website and processed on the HPC resources. Protecting the privacy of user requests is of fundamental importance to us. For this reason, our service does not store your audio file or BBB conversation, nor are requests or responses stored on a permanent memory at any time. The only exception is the BBB transcription, which is written in Etherpad and uses local MySQL, and >500 MB audio transcription results in our GWDG storage, both of which will be deleted after 30 days. The number of requests per user and the respective time stamps are recorded so that we can monitor the use of the system and perform billing. The following data is stored to fulfill the service:

  • Date of access
  • Name of the operating system installed on the accessing device
  • Name of the browser used
  • Source system via which the access was made
  • The IP address of the accessing device
  • The data is also stored in the log files of our system. This data is not stored together with other personal data of the user.

Data processing when creating accounts

When creating an account, the so-called “double opt-in” procedure is used. This means that after your registration, we send you an e-mail to the e-mail address you provided, which contains a link that you must open to confirm the creation of this account.

The following data, in addition to the above, is stored when an account is created:

  • E-mail address
  • Name and first name
  • Mobile phone number (if provided)
  • Date and time of the times of registration and confirmation

The following data can optionally be provided by you after the account has been created:

  • Additional e-mail address(es)
  • Salutation and title
  • Date of birth
  • Additional telephone number(s)
  • Postal address(es)
  • Security-specific settings (security questions and answers; two-factor authentication)

Each time you log in with an existing account on our website, our system automatically collects further data on the basis of previously mentioned information. The following data is collected during actions in the logged-in state:

  • Date of access
  • Purpose or action on the website (e.g. changing/re-setting passwords; failed log-on attempts etc.)
  • Name of the operating system installed on the accessing device
  • Name of the used browser
  • Source system via which the access was made
  • The IP address of the accessing device, with the last two bytes masked before the first storage (example: 192.168.xxx.xxx). The abbreviated IP address cannot be associated with the accessing computer.
  • An estimate of the location of the accessing client based on the IP address

The legal basis for temporary storage of data and the log files is Article 6 (1) lit. (f) GDPR.

Purpose of the data processing

The recording of user input via our website and the processing of user input on our HPC system is necessary in order to be able to generate a response using the selected Voice AI service. The storage in log files is done to ensure the functionality of the website. In addition, the data helps us to optimize the website and to ensure the security of our IT systems. An evaluation of the data for marketing purposes does not take place in this context. These purposes also constitute our legitimate interest in data processing pursuant to Article 6 (1) lit. (f) GDPR.

Retention period

The input is only stored on the GWDG server during the inference process itself. However, the output data in txt, vtt, or srt format remains stored on the GWDG server for one month, giving users time to download their results within that month. In addition, a log is kept which contains the number of requests per user and the respective time stamps. The logs are stored in accordance with GWDG guidelines.

Objection and elimination option

The collection of data for the provision of the website and the storage of the data in log files is absolutely necessary for the operation of the website. Consequently, there is no possibility for the user to object.

Rights of data subjects

You have various rights with regard to the processing of your personal data. We list them in the following, but there are also references to the articles (GDPR) and/or paragraphs (BDSG (2018)) which provide even more detailed information.

Right of access by the data subject (Article 15 GDPR; § 34 BDSG)

You may request confirmation from the controller whether we process personal data related to you. This includes the right to obtain access to information as to whether the personal data concerning you is transferred to a third country or to an international organization.

Right to rectification (Article 16 GDPR)

You have a right of rectification and / or completion vis-à-vis the controller if the personal data processed related to you is inaccurate or incomplete. The controller must perform rectification immediately.

Right to erasure / “Right to be forgotten” / Right to restriction of processing (Article 17/18 GDPR; § 35 BDSG)

You have the right to request the immediate erasure of your personal data from the controller. As an alternative, you may request that the controller restrict the processing; the applicable restrictions are set out in the GDPR/BDSG under the articles and sections mentioned above.

Notification obligation regarding rectification or erasure of personal data or restriction of processing (“Right to be informed”) (Article 19 GDPR)

If you have asserted the right to rectification, erasure or restriction of processing vis-à-vis the controller, the controller is obligated to communicate such rectification or erasure of the data or restriction of processing to all recipients to whom the personal data concerning you has been disclosed, unless this proves impossible or involves disproportionate effort. You have the right vis-à-vis the controller to be informed about these recipients.

Right to data portability (Article 20 GDPR)

You have the right to receive the personal data concerning you, which you have provided to the controller, in a structured, commonly used and machine-readable format. In addition to the scenarios presented in and provisions of the GDPR, it must be noted that portability of mass data / user data is limited to technical readability. The right to data portability does not include that the data created by the user in a proprietary format is converted by the controller into a commonly used, i.e. standardized format.

Right of objection (Article 21 GDPR; § 36 BDSG)

You have the right to object to the processing if it is based only on the controller weighing of interests (see Article 6 (1) lit. (f) GDPR).

Right to withdraw consent for data processing (Article 7 (3) GDPR)

You have the right to withdraw your declaration of consent under data protection law at any time. The withdrawal of consent shall not affect the lawfulness of processing based on consent before its withdrawal.

Right to complain to a supervisory authority (Article 77 GDPR)

Without prejudice to any other administrative or judicial remedy, you have the right to lodge a complaint with a supervisory authority, in particular in the Member State of your habitual residence, place of work or place of the alleged infringement if you consider that the processing of personal data relating to you infringes the GDPR.

The supervisory authority for the processing of personal data conducted by GWDG is the following:

Landesbeauftragte für den Datenschutz Niedersachsen
Postfach 221, 30002 Hannover
E-Mail: poststelle@lfd.niedersachsen.de

Terms of Use (Recommendation)

AGBs | Terms and Conditions for Voice-AI

1. Introduction

1.1 Welcome to our Voice-AI Service (the “Service”). By using the Service, you agree to comply with and be bound by the following terms and conditions (“Terms”). The Terms govern the business relationship between the platform operator (hereinafter referred to as the ‘Platform Operator’) and the users (hereinafter referred to as the ‘Users’) of VoiceAI.

1.2 Platform Operator is the GWDG; Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen.

1.3 The Terms apply in the version valid at the time of the Users’ registration. The Platform Operator may update these Terms at any time in the future.

1.4 Please read these Terms carefully before using the Service. By using the Service, you acknowledge that you have read and agreed to these Terms.

2. Service Description

2.1 The Service refers to the Application Programming Interface (VoiceAI API) provided by the Platform Operator (GWDG) to signed-in Users of AcademicCloud, as set out in the following.

2.2 The Service provides Users with the ability to:

  1. Translate Voice Streams: Utilize AI technology to translate the voice streams into text in various languages.
  2. Take Notes from Meetings: Automatically generate written notes from the user’s voice input in online meetings using AI technology.

2.3 Service Specification: The Service is composed of two main parts:

  • Handling Uploaded Audio: This part processes audio files uploaded by users.
  • Handling Streaming Audio from BBB: This part captures and processes streaming audio from BigBlueButton (BBB).

We provide a web server via the GWDG ESX service that hosts a WebUI. Users can log in via SSO and interact with the system through a user interface. The web frontend connects to a Chat AI Kong, which then connects to an HPC proxy and subsequently to a functional account via SSH. Kong is responsible for managing these connections.

Kong Gateway is a cloud-native API gateway that acts as a reverse proxy to manage, configure, and route requests to APIs. It is designed for hybrid and multi-cloud environments, optimized for microservices and distributed architectures. Kong provides the necessary flexibility and scalability to handle the API traffic efficiently.

The functional account can only execute a single command, “cloud_interface.sh”, which is a bash script that uses curl to send a REST request to a compute node.

For audio streaming, the bot joins the room as a listener, notifies BBB participants of the recording, captures audio via WebRTC, and sends the audio frames through the aforementioned procedure. A proxy handles WebSocket (WS) connections necessary for streaming audio.

WebSocket is used because it provides a persistent, two-way communication channel over a single TCP connection, which is essential for real-time audio streaming.

An Uvicorn server with the model runs on the respective computing node to calculate the result. This HTTP connection only takes place within the isolated HPC network. The server on the compute node responds with a stream back to the web service running in the cloud via the active SSH connection. The response is written in Etherpad Lite. Etherpad Lite is an open-source, real-time collaborative editor that allows multiple users to edit documents simultaneously in their browsers.

Info

For billing purposes, logs are written on the HPC login node, recording the username, timestamp, and inference ID at the beginning and end of each request. Similarly, on the cloud/ESX server, the username, email address, and timestamp for each request are logged.

If many requests are received, several computing nodes are required to send the responses in time. An automatic scaling mechanism has been implemented to determine the number of simultaneous requests within the last 5 seconds and adjust the number of compute nodes accordingly. If necessary, new jobs are sent to Slurm. This is monitored by scheduler.py, which is executed during the SSH-Keep Alive.

3. User Responsibilities

3.1 By using the Service, you agree to:

  • provide accurate and lawful voice inputs for processing by the Service.
  • ensure that your use of the Service does not violate any applicable laws, regulations, or rights of others.
  • ensure that the use is pursuant to these Terms, in conjunction to the Terms of Use of Academic Cloud, where your specific rights and obligations are stipulated. The latter can be found here: AcademicCloud | Terms of Use.
  • not misuse the Service in any way that could harm, disable, overburden, or impair the Service.

4. AI Output

4.1 By using the Service you are aware that the output is AI-generated. The Service uses advanced Artificial Intelligence (AI) technology to translate voice inputs and generate written notes.

4.2 However, the AI-generated outputs may not always be accurate, complete, or reliable. You acknowledge and agree that:

  • The AI translation and note-writing features are intended to assist users but should not be solely relied upon for critical tasks.
  • The accuracy of the AI-generated content may vary depending on factors such as the quality of the voice input, language complexity, and context.
  • The risk of ‘hallucinations’ is present in the Service, as in most AI systems that perform generalized and varied tasks. In this sense, the responses generated by the AI might contain false or misleading information presented as facts.
  • Human oversight and control measures on your side are necessary to ensure that the output is reliable and that it corresponds to your input prompt.

5. Privacy and Data Security

5.1 We are committed to protecting your privacy. By using the Service, you agree to our collection, use, and storage of data in accordance with our Privacy Policies.

5.2 You can find the GWDG Data Protection Notice as well as the Academic Cloud Privacy Notice here:

6. Intellectual Property

6.1 We own all intellectual property rights in the Service, including but not limited to software, AI algorithms, trade secrets and generated content. You are granted a limited, non-exclusive, non-transferable license to use the Service for its intended purposes.

6.2 Users are required to adhere to copyright and proprietary notices and licenses, preventing the unauthorized distribution or reproduction of copyrighted content. We reserve the right to remove or block any content believed to infringe copyrights and to deactivate the accounts of alleged offenders.

7. Liability of Users of Service

7.1 The Users are liable for all damages and losses suffered by the Platform Operator as a result of punishable or unlawful use of the Service or of the authorization to use the Service, or through a culpable breach of the User’s obligations arising from these Terms.

7.2 Users shall also be liable for damage caused by use by third parties within the scope of the access and usage options granted to them, insofar as they are accountable for this third-party use, in particular if they have passed on their login credentials to third parties.

7.3 If the Platform Operator is held liable by third parties for damages, default or other claims arising from unlawful or criminal acts of the Users, the Users shall indemnify the Platform Operator against all resulting claims. The Platform Operator shall sue the Users if the third party takes legal action against the Platform Operator on the basis of these claims.

7.4 Users shall be solely liable for the content they upload and generate themselves (User-generated-content) through the use of VoiceAI. In this sense, Platform Operator does not bear any liability for violations of law occurring by such content.

8. Liability of Platform Operator

8.1 The Platform Operator does not guarantee, either expressly or implicitly, that the Service will function uninterrupted and error-free at all times, and hereby rejects any such guarantee. The loss of data due to technical faults or the disclosure of confidential data through unauthorized access by third parties cannot be excluded.

8.2 The Platform Operator is not liable for the contents, in particular for the accuracy, completeness or up-to-date validity of information and data or of the output; it merely provides access to the use of this information and data.

8.3 The Platform Operator maintains a mere technical, automatic and passive stance towards the content contributed by Users and does not play any active role in controlling, initiating or modifying that content and therefore cannot be held liable for cases where such content is unlawful.

8.4 The Platform Operator shall only be liable in the event of intent or gross negligence by its employees, unless material duties are culpably breached, compliance with which is of particular importance for achieving the purpose of the contract (cardinal obligations). In this case, the liability of the Platform Operator is limited to the typical damage foreseeable at the time of conclusion of the mutual contract of use, unless there is intent or gross negligence.

8.5 Any administrative accountability towards the Platform Operator remains unaffected by the above provisions.

8.6 User claims for damages are excluded. Exempted from this are claims for damages by users arising from injury to life, physical integrity, health or from the breach of essential contractual obligations (cardinal obligations), as well as liability for other damages based on an intentional or grossly negligent breach of duty by the platform operator, its legal representatives or vicarious agents. Essential contractual obligations are those whose fulfillment is necessary to achieve the objective of the contract.

9. Termination We reserve the right to suspend or terminate your access to the Service at any time, without notice, for any reason, including but not limited to your breach of these Terms.

10. Changes to Terms We may modify these Terms at any time. Any changes will be effective immediately upon posting the revised Terms. Your continued use of the Service after the posting of changes constitutes your acceptance of the modified Terms.

11. Final Provisions

11.1 The Terms shall remain binding and effective in their remaining parts even if individual parts are legally invalid. The invalid parts shall be replaced by the statutory provisions, where applicable. However, if this would constitute an unreasonable burden for one of the contracting parties, the contract as a whole shall become invalid.

11.2 If you have any questions about these Terms, please contact us at:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel.: +49 551 39-30001
E-Mail: support@gwdg.de
  • By using the Service, you acknowledge that you have read, understood, and agree to be bound by these Terms.

AI_Compliance (Recommendation): Display an AI-Info Notice in the User-Interface

Info

To demonstrate compliance with the transparency obligation of Art. 50(1) of the AI Act, i.e. to adequately inform the user that content is AI-generated, the information can either be provided in the Terms or, even better, within the user interface itself. The latter demonstrates the highest level of transparency, as the average user will understand it.

For example: “VoiceAI may display inaccurate info, including wrongful translations or inaccurate notes, so double-check its responses. Check the Terms and the Privacy Notice.” Or: “AI-generated content. VoiceAI can make mistakes. Check important info.”

Impressum

Datenschutzhinweis

Verantwortlicher für die Datenverarbeitung

Der Verantwortliche für die Datenverarbeitung im Sinne der Datenschutz-Grundverordnung und anderer nationaler Datenschutzgesetze der Mitgliedsstaaten sowie sonstiger datenschutzrechtlicher Bestimmungen ist:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Deutschland
Tel: +49 (0) 551 39-30001
E-Mail: support@gwdg.de
Website: www.gwdg.de

Ansprechpartner / Datenschutzbeauftragter

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Datenschutzbeauftragter
Burckhardtweg 4
37077 Göttingen
Deutschland
E-Mail: support@gwdg.de

Allgemeines zur Datenverarbeitung

Umfang der Verarbeitung personenbezogener Daten

Wir verarbeiten personenbezogene Daten unserer Nutzer grundsätzlich nur, soweit dies zur Bereitstellung einer funktionsfähigen Website sowie unserer Inhalte und Leistungen erforderlich ist. Die Verarbeitung personenbezogener Daten unserer Nutzer erfolgt regelmäßig nur nach Einwilligung des Nutzers (Art. 6 Abs. 1 lit. a DSGVO). Eine Ausnahme gilt in solchen Fällen, in denen eine vorherige Einholung einer Einwilligung aus tatsächlichen Gründen nicht möglich ist und die Verarbeitung der Daten durch gesetzliche Vorschriften gestattet ist.

Rechtsgrundlage für die Verarbeitung personenbezogener Daten

Soweit wir für Verarbeitungsvorgänge personenbezogener Daten eine Einwilligung der betroffenen Person einholen, dient Art. 6 Abs. 1 lit. a EU-Datenschutzgrundverordnung (DSGVO) als Rechtsgrundlage. Bei der Verarbeitung von personenbezogenen Daten, die zur Erfüllung eines Vertrages, dessen Vertragspartei die betroffene Person ist, erforderlich ist, dient Art. 6 Abs. 1 lit. b DSGVO als Rechtsgrundlage. Dies gilt auch für Verarbeitungsvorgänge, die zur Durchführung vorvertraglicher Maßnahmen erforderlich sind. Soweit eine Verarbeitung personenbezogener Daten zur Erfüllung einer rechtlichen Verpflichtung erforderlich ist, der unser Unternehmen unterliegt, dient Art. 6 Abs. 1 lit. c DSGVO als Rechtsgrundlage. Für den Fall, dass lebenswichtige Interessen der betroffenen Person oder einer anderen natürlichen Person eine Verarbeitung personenbezogener Daten erforderlich machen, dient Art. 6 Abs. 1 lit. d DSGVO als Rechtsgrundlage. Ist die Verarbeitung zur Wahrung eines berechtigten Interesses unseres Unternehmens oder eines Dritten erforderlich und überwiegen die Interessen, Grundrechte und Grundfreiheiten des Betroffenen das erstgenannte Interesse nicht, so dient Art. 6 Abs. 1 lit. f DSGVO als Rechtsgrundlage für die Verarbeitung.

Datenschutzhinweis zur Nutzung von Voice-AI-Diensten

Beschreibung und Umfang der Datenverarbeitung

Jedes Mal, wenn unsere Website aufgerufen wird, erfasst unser System automatisch Daten und Informationen vom Computersystem des aufrufenden Rechners.

Um die Voice-AI-Dienste, die von der GWDG gehostet werden, zu nutzen, werden Benutzeranfragen/Eingaben von der Website erfasst und auf den HPC-Ressourcen verarbeitet. Der Schutz der Privatsphäre von Benutzeranfragen ist für uns von grundlegender Bedeutung. Aus diesem Grund speichert unser Dienst weder Ihre Audio-Datei noch die BBB-Konversation, noch werden Anfragen oder Antworten auf einem dauerhaften Speicher abgelegt. Die einzige Ausnahme ist die BBB-Transkription, die in Etherpad geschrieben und mit lokalem MySQL verwendet wird und >500 MB Audio-Transkriptionsergebnisse in unserem GWDG-Speicher enthält, die beide nach 30 Tagen gelöscht werden. Die Anzahl der Anfragen pro Benutzer und die entsprechenden Zeitstempel werden aufgezeichnet, damit wir die Nutzung des Systems überwachen und die Abrechnung durchführen können.

Die folgenden Daten werden gespeichert, um den Dienst zu erfüllen:

  • Datum des Zugriffs
  • Name des auf dem zugreifenden Gerät installierten Betriebssystems
  • Name des verwendeten Browsers
  • Quellsystem, über das der Zugriff erfolgt ist
  • Die IP-Adresse des zugreifenden Geräts
  • Die Daten werden auch in den Log-Dateien unseres Systems gespeichert. Diese Daten werden nicht zusammen mit anderen personenbezogenen Daten des Nutzers gespeichert.

Data processing when creating accounts

When an account is created, the so-called "double opt-in" procedure is used. This means that after your registration, an email is sent to the email address you provided, containing a link that you must open in order to confirm the creation of the account.

The following data is stored, in addition to the data listed above, when an account is created:

  • Email address
  • Last name and first name
  • Mobile phone number (if provided)
  • Date and time of registration and confirmation

The following data can optionally be provided by you after the account has been created:

  • Additional email address(es)
  • Salutation and title
  • Date of birth
  • Additional phone number(s)
  • Postal address(es)
  • Security-related settings (security questions and answers; two-factor authentication)

Each time you log in to our website with an existing account, our system automatically collects further data on the basis of the information above. The following data is collected during actions performed while logged in:

  • Date of access
  • Purpose or action on the website (e.g. changing/resetting passwords; failed login attempts, etc.)
  • Name of the operating system installed on the accessing device
  • Name of the browser used
  • Source system via which the access was made
  • The IP address of the accessing device, with the last two bytes masked before it is first stored (example: 192.168.xxx.xxx). The truncated IP address cannot be linked to the accessing computer.
  • An estimate of the accessing client's location based on the IP address

Legal basis for data processing

The legal basis for the temporary storage of the data and the log files is Art. 6 (1) lit. f GDPR.

Purpose of data processing

Collecting user requests via our website and processing them on our HPC system is necessary in order to generate a response with the selected Voice AI service.

Storage in log files takes place to ensure the functionality of the website. In addition, the data helps us to optimize the website and to ensure the security of our IT systems. The data is not evaluated for marketing purposes in this context. These purposes also constitute our legitimate interest in data processing pursuant to Art. 6 (1) lit. f GDPR.

Storage duration

The input is stored on the GWDG server only during the inference process itself. The output data in txt, vtt, or srt format, however, remains stored on the GWDG server for one month, so that users have this month to download their results.

In addition, a log is kept that contains the number of requests per user and the corresponding timestamps. These logs are stored in accordance with the GWDG's policies.

Possibility of objection and removal

The collection of data for the provision of the website and the storage of data in log files is strictly necessary for the operation of the website. Consequently, there is no possibility for the user to object.

Rights of data subjects

You have various rights regarding the processing of your personal data. We list them below; in addition, references to the articles (GDPR) and sections (BDSG (2018)) with more detailed information are given.

Right of access by the data subject (Art. 15 GDPR; § 34 BDSG)

You can request confirmation from the controller as to whether personal data concerning you is processed by us. This includes the right to request information about whether the personal data concerning you is transferred to a third country or to an international organization.

Right to rectification (Art. 16 GDPR)

You have a right to rectification and/or completion vis-à-vis the controller if the personal data processed concerning you is inaccurate or incomplete. The controller must carry out the rectification without undue delay.

Right to erasure / "right to be forgotten" / right to restriction of processing (Art. 17, 18 GDPR; § 35 BDSG)

You have the right to request the immediate erasure of your personal data from the controller. Alternatively, you can request the restriction of processing from the controller. Restrictions are set out in the GDPR and the BDSG under the articles and sections mentioned above.

Right to notification (Art. 19 GDPR)

If you have asserted the right to rectification, erasure or restriction of processing against the controller, the controller is obliged to notify all recipients to whom the personal data concerning you has been disclosed of this rectification, erasure or restriction of processing, unless this proves impossible or involves a disproportionate effort. You have the right to be informed by the controller about these recipients.

Right to data portability (Art. 20 GDPR)

You have the right to receive the personal data concerning you that you have provided to the controller in a structured, commonly used and machine-readable format. In addition to the GDPR, it should be noted that for bulk data / user data, data portability is limited exclusively to technical readability. The right to data portability does not include the conversion by the controller of data created by the user in a proprietary format into a "common", i.e. standardized, format.

Right to object (Art. 21 GDPR; § 36 BDSG)

You have the right to object to the processing if it is carried out solely on the basis of a balancing of interests by the controller (cf. Art. 6 (1) lit. f GDPR).

Right to withdraw the declaration of consent under data protection law (Art. 7 (3) GDPR)

You have the right to withdraw your declaration of consent under data protection law at any time. The withdrawal of consent does not affect the lawfulness of the processing carried out on the basis of the consent up to the time of withdrawal.

Right to lodge a complaint with a supervisory authority (Art. 77 GDPR)

Without prejudice to any other administrative or judicial remedy, you have the right to lodge a complaint with a supervisory authority, in particular in the member state of your habitual residence, your place of work or the place of the alleged infringement, if you consider that the processing of personal data concerning you infringes the GDPR.

The supervisory authority responsible for the processing of personal data by the GWDG is:

Landesbeauftragte für den Datenschutz Niedersachsen
Postfach 221, 30002 Hannover
Email: poststelle@lfd.niedersachsen.de

Terms of Use (recommendation)

General Terms and Conditions | Terms of Use for Voice AI

1. Introduction

1.1 Welcome to our Voice AI service (the "Service"). By using the Service, you agree to comply with and be bound by the following terms of use ("Terms"). The Terms govern the business relationship between the platform operator (hereinafter the "Platform Operator") and the users (hereinafter the "Users") of VoiceAI.

1.2 The Platform Operator is the GWDG; Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen.

1.3 The Terms apply from the time of the Users' registration, in the version valid at the time of registration. The Platform Operator may update these Terms at any time in the future.

1.4 Please read these Terms carefully before using the Service. By using the Service, you confirm that you have read and accepted these Terms.

2. Service description

2.1 The Service refers to the Application Programming Interface (VoiceAI API) of the Platform Operator (GWDG) for logged-in AcademicCloud users, as described below.

2.2 The Service offers Users the possibility to:

  1. Translate speech streams: translate speech streams into text in different languages with the help of AI technology.
  2. Create notes from meetings: automatically create written notes from the user's voice input in online meetings with the help of AI technology.

2.3 Service specification: The Service consists of two main parts:

  • Processing of uploaded audio files: this part processes audio files uploaded by the Users.
  • Processing of streaming audio from BBB: this part captures and processes streaming audio from BigBlueButton (BBB).

We provide a web server via the GWDG ESX service that hosts a web interface. Users can log in via SSO and interact with the system through a user interface. The web frontend connects to a Chat AI Kong instance, which in turn connects to an HPC proxy and then to a function account via SSH. Kong is responsible for managing these connections.

Kong Gateway is a cloud-native API gateway that acts as a reverse proxy to manage, configure and route requests to APIs. It is designed for hybrid and multi-cloud environments and optimized for microservices and distributed architectures. Kong provides the flexibility and scalability needed to handle API traffic efficiently.

The function account can only execute a single command, "cloud_interface.sh", a bash script that uses curl to send a REST request to a compute node.
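As a rough illustration only, a forwarding step of this kind could look similar to the following sketch. The host name, port, endpoint path and file name below are placeholders, not the real internal values used by cloud_interface.sh:

# Hypothetical sketch: forward a received audio payload as a REST request
# to a compute node inside the HPC network (placeholder host, port and path).
curl --silent --request POST \
     --header "Content-Type: application/octet-stream" \
     --data-binary @input_audio.wav \
     "http://compute-node.example.internal:8000/v1/transcribe"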

For audio streaming processing, the bot joins the room as a listener, notifies the BBB participants about the recording, captures audio via WebRTC, and sends the audio frames through the procedure described above. A proxy manages the WebSocket (WS) connections required for audio streaming.

WebSocket is used because it provides a persistent, bidirectional communication channel over a single TCP connection, which is essential for real-time audio streaming.

A Uvicorn server hosting the model runs on the respective compute node to compute the result. This HTTP connection takes place only within the isolated HPC network. The server on the compute node responds with a stream to the web service running in the cloud, via the active SSH connection. The response is written to Etherpad Lite. Etherpad Lite is an open-source tool for real-time collaboration that allows multiple users to edit documents simultaneously in their browsers.

Info

For accounting purposes, logs are written on the HPC login node that record the username, the timestamp and the inference ID at the beginning and end of each request. Likewise, the username, email address and timestamp of each request are logged on the cloud/ESX server.

If many requests are received, several compute nodes are required to deliver the responses in time. An automatic scaling mechanism has been implemented that determines the number of concurrent requests within the last 5 seconds and adjusts the number of compute nodes accordingly. If necessary, new jobs are submitted to Slurm. This is monitored by scheduler.py, which is executed during the SSH keep-alive.

3. User obligations

3.1 By using the Service, you agree to:

  • provide accurate and lawful speech inputs for processing by the Service.
  • ensure that your use of the Service does not violate applicable laws, regulations or the rights of third parties.
  • ensure that the Service is used in accordance with these Terms and in conjunction with the AcademicCloud terms of use, which set out your specific rights and obligations. The latter can be found here: AcademicCloud | Nutzungsbedingungen.
  • not misuse the Service in a way that could damage, disable, overload or impair it.

4. AI output

4.1 By using the Service, you are aware that the output is AI-generated. The Service uses advanced artificial intelligence (AI) technology to translate speech inputs and generate written notes.

4.2 However, the AI-generated outputs are not always accurate, complete or reliable. You acknowledge and agree that:

  • The AI translation and note-taking functions are intended to assist users, but should not be relied on alone for critical tasks.
  • The accuracy of the AI-generated content can vary depending on factors such as the quality of the speech input, the complexity of the language, and the context.
  • The risk of "hallucinations" is present in the Service, as in most AI systems that perform general and varied tasks. In this sense, the responses generated by the AI may contain false or misleading information presented as fact.
  • Human oversight and control measures on your side are necessary to ensure that the output is reliable and corresponds to your input prompt.

5. Data protection and data security

5.1 We are committed to protecting your privacy. By using the Service, you agree that we collect, use and store your data in accordance with our privacy policy.

5.2 You can find the GWDG privacy policy as well as the Academic Cloud privacy policy here:

6. Intellectual property

6.1 We own all intellectual property rights in the Service, including but not limited to software, AI algorithms, trade secrets and generated content. You are granted a limited, non-exclusive, non-transferable license to use the Service for its intended purposes.

6.2 Users are required to observe copyright and proprietary notices as well as licenses, in order to prevent the unauthorized distribution or reproduction of copyrighted content. We reserve the right to remove or block any content alleged to infringe copyright and to deactivate the accounts of suspected infringers.

7. Liability of the Users of the Service

7.1 Users are responsible for all damages and losses incurred by the Platform Operator through the use of the Service, through authorizing its use, or through a culpable breach of the user obligations under these Terms.

7.2 Users are also responsible for damage caused by use by third parties within the scope of the access and usage rights granted to them, insofar as they are responsible for this third-party use, in particular if they have passed on their login credentials to third parties.

7.3 If third parties hold the Platform Operator liable for damages, liabilities or other claims resulting from unlawful or criminal actions of the Users, the Users shall indemnify the Platform Operator against all resulting claims. The Platform Operator will take legal action against the Users if a third party initiates legal proceedings against the Platform Operator on the basis of such claims.

7.4 Users are solely responsible for the content they themselves upload and generate (user-generated content) through the use of VoiceAI. In this respect, the Platform Operator assumes no liability for legal infringements arising from this content.

8. Liability of the Platform Operator

8.1 The Platform Operator does not guarantee, either expressly or implicitly, that the Service will function without interruption or errors at all times, and hereby disclaims any such warranty. The loss of data due to technical faults or the disclosure of confidential data through unauthorized access by third parties cannot be ruled out.

8.2 The Platform Operator is not responsible for the content, in particular for the correctness, completeness or timeliness of information and data or of the output; it merely provides access to the use of this information and data.

8.3 The Platform Operator takes a purely technical, automatic and passive role with regard to the content provided by the Users, plays no active part in controlling, initiating or modifying this content, and can therefore not be held liable in cases where this content is unlawful.

8.4 The Platform Operator is liable only in cases of intent or gross negligence on the part of its employees, unless essential obligations are culpably breached whose fulfilment is of particular importance for achieving the purpose of the contract (cardinal obligations). In that case, the liability of the Platform Operator is limited to the typical damage that was foreseeable when the contract was concluded, unless there is intent or gross negligence.

8.5 Any responsibility under administrative law towards the Platform Operator remains unaffected by the above provisions.

8.6 Users' claims for damages are excluded. Exempt from this are claims for damages by Users resulting from injury to life, limb or health or from the breach of essential contractual obligations (cardinal obligations), as well as liability for other damage based on an intentional or grossly negligent breach of duty by the Platform Operator, its legal representatives or vicarious agents. Essential contractual obligations are those whose fulfilment is necessary to achieve the purpose of the contract.

9. Termination: We reserve the right to suspend or terminate your access to the Service at any time, without notice and for any reason, including but not limited to your violation of these Terms.

10. Changes to the Terms: We may change these Terms at any time. All changes become effective immediately upon publication of the revised Terms. Your continued use of the Service after publication of the changes constitutes your acceptance of the amended Terms.

11. Final provisions

11.1 The Terms remain effective and binding in their remaining parts even if individual parts are unlawful. The invalid parts are replaced by the statutory provisions, where applicable. However, if this would constitute an unreasonable burden for one of the contracting parties, the contract as a whole becomes invalid.

11.2 If you have any questions about these Terms, please contact:

Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen
Burckhardtweg 4
37077 Göttingen
Germany
Tel.: +49 551 39-30001
Email: support@gwdg.de

By using the Service, you confirm that you have read and understood these Terms and agree to be bound by them.

AI Compliance (recommendation): Displaying an AI info notice in the user interface

Info

To fulfil the transparency obligation of Art. 50(1) of the AI Act, the user must be adequately informed that the content is AI-generated. This information can be provided either in the Terms or, even better, in the user interface itself, which offers the highest level of transparency, since the average user will understand it there.

For example: "VoiceAI can display inaccurate information, including faulty translations or inaccurate notes, so you should verify the responses. Review the Terms and the privacy policy." Or, more briefly: "AI-generated content. VoiceAI can make mistakes. Check important information."

Impressum

Terms of use

Registration and Access

Access to this service requires an Academic Cloud ID. Having an Academic Cloud ID is subject to acceptance of the "Academic Cloud Terms of Use." Furthermore, when signing up for an account to use our service, you must provide accurate and complete information. Sharing your account details or allowing others to use your account is not permitted, and you are responsible for any activities conducted under your account. If you create an account or use this service on behalf of another person or entity, you must have the authority to accept these Terms on their behalf.

Authorized usage

Users are obligated to employ the technology or services solely for authorized and lawful purposes, ensuring compliance with all applicable laws, regulations, and the rights of others, encompassing national, federal, state, local, and international laws.

Development

You recognize that we may be developing or acquiring similar software, technology, or information from other sources. This acknowledgment does not impose limitations on our development or competitive endeavors.

Prohibitions

  1. Users are prohibited from using this service for transmitting, generating, or distributing content (input and output) that:

    • contains confidential or sensitive information;
    • violates privacy laws, including the collection or distribution of personal data without consent;
    • is fraudulent, deceptive, harmful, or misleading;
    • is discriminatory, or promotes violence, hate speech, or illegal activities;
    • encourages self-harm, harassment, bullying, violence, or terrorism;
    • is sexually explicit for non-educational or non-scientific purposes, or involves child sexual exploitation, misrepresentation, deception, or impersonation;
    • promotes illegal activities or infringes on intellectual property rights and other legal and ethical boundaries in online activities;
    • involves any sensitive or controlled data, such as protected health information, personal details, financial records, or research involving sensitive human subjects;
    • attempts to bypass our safety measures or intentionally prompts actions that violate established policies;
    • could unfairly or adversely impact individuals, particularly concerning sensitive or protected characteristics;
  2. Users are prohibited from engaging in activities including:

    • Reverse engineering, decompiling, or disassembling the technology;
    • Unauthorized activities such as spamming, malware distribution, or disruptive behaviors that compromise service quality;
    • Modifying, copying, renting, selling, or distributing our service;
    • Engaging in tracking or monitoring individuals without their explicit consent.

Termination and suspension

You can terminate your use of our Services and your legal relationship with us at any time by stopping the use of the service. If you are an EEA-based consumer, you have the right to withdraw from these Terms within 14 days of acceptance by contacting Support. We reserve the right to suspend or terminate your access to our service or deactivate your account if you violate these Terms or other terms and policies referred to here, if this is necessary to comply with the law, or if your use poses risks or harm to us, to other users, or to others. We will give you notice before deactivating your account, unless doing so is not feasible or is not permitted by law. If you believe your account has been suspended or disabled mistakenly, you can appeal by contacting Support. We reserve the right to take legal action to safeguard intellectual property rights and user safety. Civil penalties, damages, administrative fines, criminal charges, or other legal remedies may be pursued if you violate these terms or engage in illegal activities through this service.

Accuracy

The results generated by our services may not always be unique, entirely correct, or precise. They could contain inaccuracies, even if they seem detailed. Users should not solely depend on these results without verifying their accuracy independently. Moreover, the transcription provided by our services might not always be complete or accurate. Therefore, users should exercise caution and avoid relying on the services alone for important decisions. It is crucial to understand that AI and machine learning are constantly evolving, and although we strive to improve the accuracy and reliability of our services, users should always assess the accuracy of the results and ensure they meet their specific requirements, verifying them with human input before use or distribution. Additionally, users should refrain from using results related to individuals for purposes that could significantly affect them, such as legal or financial decisions. Lastly, users should be aware that incomplete or inaccurate results may occur, which may not necessarily reflect the views of GWDG or its affiliated parties.

We own all rights, titles, and interests in and to the service. Users are required to uphold copyright and other proprietary notices, preventing the unauthorized distribution or reproduction of copyrighted content. We reserve the right to remove or block any content believed to infringe on copyright and to deactivate the accounts of repeat offenders.

Feedback

We appreciate your feedback on our services and products and encourage you to share your thoughts to help us improve them. By providing feedback, you understand that we may disclose, publish, exploit, or use it to enhance our offerings without any obligation to compensate you. We reserve the right to use feedback for any purpose without being restricted by confidentiality obligations, whether it is marked confidential or not.

Privacy

The privacy of user requests is fundamental to us. As a result, our service does not save your audio/conversation on persistent storage, except for BBB transcriptions (future service), which are stored in local MySQL, and audio file transcription results in our data mover node, both of which will be erased after 30 days. The number of requests for either of the services per user and the respective timestamps are recorded so we can monitor the system’s usage and perform accounting.

For technical purposes, the following data is collected by the webserver:

  • Date of access
  • Name of the operating system installed on the accessing device
  • Name of the browser used
  • Source system via which the access was made
  • The IP address of the accessing device

The data is also stored in our system’s log files. This data is not stored together with the user’s other personal data. You can find more detailed information regarding data protection in the privacy policy that users shall accept when using this service.

Business Users and Organizations

If the users are businesses or organizations, including public and government agencies, they shall indemnify GWDG and its affiliates for all claims and legal actions arising from their misuse of this service or violation of these terms. This indemnification encompasses all associated costs and liabilities, such as claims, losses, damages, judgments, fines, litigation expenses, and legal fees. In this case, GWDG will inform the other party in writing upon becoming aware of any claim and will either leave the defense to the business or organization or cooperate reasonably in the defense or investigation of the claim.

Project Management

Projects are the primary (and soon to be the only) unit used to manage compute and storage resources, and by extension they provide resources to each project’s members. They are organized into a hierarchical tree, with each project having zero or more users and zero or more sub-projects. Note that “projects” here include both actual compute projects and several other project-like things:

  • literal compute projects that were applied for and approved
  • institutions and workgroups at University Göttingen, MPG, etc.
  • pseudo-projects for NHR test accounts
  • courses and workshops
  • sub-projects of the above for compartmentalizing resources or better organization (e.g. the project for a workgroup at University Göttingen might have a sub-project for individual student thesis projects)

Granting resources directly to users who are not members of a project ended for NHR starting 2024/Q2 (legacy users are being grandfathered in for a transitional period) and is being phased out for the SCC.

All projects starting 2024/Q2 are managed using the HPC Project Portal. NHR and SCC projects started before 2024/Q2 must be managed by support ticket (see Getting Started to get the right email address) until they are migrated to the HPC Project Portal, though not all management operations are possible without migrating.

Warning

NHR projects started before 2024/Q2 must migrate on their next renewal, or at an earlier date if the PI/s wish it or need a change to be applied that cannot be done without migrating. See the NHR/HLRN Project Migration page for more information.

Using the HPC Project Portal and/or being added to a project in it requires an AcademicCloud account. All students and researchers of the University of Göttingen, the University Medical Center Göttingen and the Max Planck Institutes should already have a valid account. Researchers from many other institutions can also log in via federated login from their home institute with their institutional credentials. It is important to log in to the AcademicCloud at least once to be able to use the HPC Project Portal or be added to projects.

See the following sub-pages for individual project management topics:

Subsections of Project Management

Project Portal

Beginning in 2024, all HPC clusters of the GWDG are being unified into one consistent system.

Part of this unification is the HPC Project Portal. The Portal will be used to manage projects on the Scientific Compute Cluster (SCC), the NHR@Göttingen clusters Emmy and Grete, the KISSKI Platform, and others. See the Project Management pages for more detailed project management information.

Academic Cloud

To get added to a portal project, you need to have a valid AcademicCloud account. All students and researchers of the University of Göttingen, the University Medical Center Göttingen and the Max Planck Institutes should already have a valid account. Researchers from many other institutions can also login via Federation from their home institute.

If none of this applies to you, you can also create an unaffiliated AcademicCloud account by registering with your email address, which can then be added to HPC projects. Note that many other GWDG services will not be available for these self-registered accounts.

Project-specific Username

If you get added to a project, you will receive a project-specific login username via email. You can also see this login username in the portal itself if you navigate to any project you are a member of.

This project-specific username has to be used to access the HPC login nodes via SSH. Note that it is not valid for any non-HPC services of the GWDG; in particular, it will not work with the GWDG VPN, the login.gwdg.de and transfer.gwdg.de servers, the Jupyter Cloud service, or the Samba file sharing service.
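For example, assuming a hypothetical project-specific username u12345 and one of the HPC login nodes (check your project's welcome email and the login documentation for the correct host name), an SSH login could look like this:

# u12345 and the host name are placeholders; use your own project-specific
# username, your SSH key, and the login node documented for your system.
ssh -i ~/.ssh/id_ed25519 u12345@glogin.hpc.gwdg.de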

Please always include your project-specific username if you contact the support team.

User Perspective

HPC Project Portal

The HPC Project Portal can be used to:

  • See all projects one is a member of, and very importantly, get the following information:
    • Your project-specific username for each project (it is also sent via the portal’s notification email when being added to a project).
    • The HPC Project ID of the project:
      • Which is the directory name of the project’s directories in the data stores.
      • The name of the project’s POSIX group will either be this value prefixed by HPC_ or the value itself if the former does not exist.
    • The project path, which is where the project sits in the project tree. The project’s Project Map is at /projects/PROJECTPATH (you can also use the symlink ~/.project in your HOME directory to get this).
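For example, once logged in on the cluster, you can check these values yourself. The HPC Project ID example_project used here is a hypothetical placeholder:

# Show the project's POSIX group (with or without the HPC_ prefix, whichever exists)
getent group HPC_example_project || getent group example_project
# Resolve the ~/.project symlink to the project's directory under /projects/PROJECTPATH
readlink -f ~/.project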

Login, Logout, Change SSH Key

The front page of the HPC Project Portal where you login is shown below.

Screenshot of the front page (where one logs in) of the HPC project portal.

HPC Project Portal Front Page

Login/front page of the HPC Project Portal.

If you click the icon of a person’s head and shoulders in the top right of the page, you get a menu where you can logout as well as a convenience link to the Academic Cloud security page where you can manage your SSH keys (see Uploading SSH Keys for more information). The menu is shown in the screenshot below.

Screenshot of the menu for logging out and SSH key management, gotten by clicking the icon of a person's head and shoulders at the top-right of the page. The user's name is blurred.

Logout and SSH Key Management Menu

Menu providing logout and the SSH key management link.

Browsing Projects

When you first log in, you are brought to the page for browsing active projects that you are in, which is shown below.

Screenshot of the page for browsing active projects, which is under Active Projects -> Overview in the nav-bar on the left. Three projects are shown with their names, descriptions, how many sub-projects they have, start and end dates, and whether there are pending join requests.

Browsing Active Projects Page

Page for browsing active projects the user is in.

There are three different pages for browsing projects, which can be accessed from the nav-bar on the left.

  • Active projects you are in is under Active Projects -> Overview.
  • Open projects (publicly viewable projects that anyone can request to join) is under Open Projects.
  • Projects which have expired or otherwise been deactivated are under Archived Projects, which is shown in the screenshot below.
Screenshot of the page for browsing inactive projects (archived), which is under Archived Projects in the nav-bar on the left. Three projects are shown on the first page with their names, descriptions, PI/s (usernames blurred), and start and end dates. Buttons for which page, next page, and last page are both above and below the project list.

Browsing Archived Projects Page

Page for browsing inactive (archived) projects the user is in.

On all browsing pages, the projects are shown as a list of boxes with the following elements:

  • Project name in the top left
  • Project start and end dates are in the top right
  • Project description below the project name
  • When the project was last updated/changed below the project description
  • For active projects, the number of subprojects, shown below the last-updated time
  • For archived/inactive projects, the Academic Cloud ID names of the project PI/s, shown below the last-updated time
  • The View button at the bottom left to go to the project view page
  • For active projects where you have the PI or Delegate role, if there are any pending join requests, this is shown at the bottom right.

See the zoomed in view of one project in the browsing list below.

Zoomed in screenshot showing the project name (NHR Internal), the start and end dates (2024-03-26 to 2030-12-31), the project description (Parent project of internal NHR projects), when it was last updated (2024-03-26, 16:14:15), how many subprojects it has (zero), the View button to go to the project view page, and that there is one visitor who has requested to join the project.

Zoom In on One Project While Browsing Active Projects Page

Zoom in on one project in the project list when browsing active projects.

Viewing a Project

Clicking a project’s View button when browsing projects brings up the project’s page, which is shown below for the case where the user is part of the project with the role of Member.

Screenshot of a project page showing the project name (Performance Engineering), HPC Project ID (performance_engineering), the project start and end dates (2024-02-16 to 2025-12-31), the project description (This project has been created for running benchmarks and analyzing computational performance of selected applications on the HPC systems), the project PIs (names blurred), the project path (intern/performance_engineering), the active project members list (names, usernames, when they were added, project-specific usernames, and their roles), the deactivated project members list (name, username, deactivation time, project-specific username, and role), and the list of join requests that haven't been accepted or rejected yet. Names, usernames, and email addresses are blurred.

Project Page

Project page for a project the user is a member of. Names, usernames, and email addresses are blurred.

The project is shown in several boxes. If the project has sub-projects, the first box will be a list of the sub-projects. Then, the next box is much like what is shown when browsing projects and includes the following information:

  • In the top row
    • Project name
    • HPC Project ID
    • “View the project actions file” button to see the history of the actions taken on the project
    • Project start and end dates
  • In the rest of the box from top to bottom:
    • Project description
    • Names of the PI/s
    • Project path (where it is in the project tree, see Project Structure for more information)
    • The Join button if it is an open project you aren’t a part of yet, which lets you make a request to join the project
    • Additional management buttons if you have the Delegate or PI role (see Management Interface for more information)

A box of the active users/members of the project. They are presented as a table with each user’s name, Academic Cloud username, when they were added to the project, their project-specific username, and their role. If you have the PI or Delegate role, you will also see an icon for the user management menu on the right side of each user’s row (see Management Interface for more information).

A box of the project’s deactivated users. They are presented as a table of each user’s name, Academic Cloud username, when they were deactivated, their project-specific username, and their role. If you have the PI or Delegate role, you will also see an icon for the user management menu on the right side of each user’s row (see Management Interface for more information).

The last box shows the visitors who have requested to join the project in a table. If you have the PI or Delegate role, you will also see an icon for the user management menu on the right side of each user’s row (see Management Interface for more information).

Requesting to Join a Project

You can request to join a project by

  • Using the join link given to you by a PI or Delegate of the project
  • Clicking the Join button on an open project you aren’t a member of yet (only for administrators)

Using either, you are taken to the join request page, which is shown in the screenshot below. You should enter information in the provided text box telling the project PI/s and/or Delegate/s important context for your joining (e.g. for an open project, this could include indicating more about who you are and why you want to join), and then click Yes or Cancel.

Screenshot of making a join request for the Test Accounts Niedersachsen project, which includes a text box for giving the reason, a Yes button to send the request, and a Cancel button to not send the request.

Making a Join Request

Page for making a join request for a project.

Unsubscribing from the HPC News Mailing Lists

There are multiple mailing lists that you get subscribed to when you are getting HPC access. The hpc-announce-* email lists contain important system and maintenance information. They are important to all users, and unsubscribing from these lists is only possible if your account is deactivated for HPC.

In addition to these technical information lists, there are hpc-news-* mailing lists. These lists contain information about new trainings and courses. If you do not wish to receive this information, you can unsubscribe after you log in to the HPC Project Portal. Click on the user icon in the top right corner and select “Account” from the dropdown:

Screenshot of the user account menu.

Open User Account Settings

Access the user account page

On the settings page, you can switch off the “Recieve HPC news Emails” toggle.

Screenshot of the user account settings page.

User Account Settings Page

Settings page with mailing list opt out switch

Managing a Project

A project’s members, description, and visibility can be managed in the HPC Project Portal. Management is restricted to members with the roles of PI and Delegate. For other operations and any problems, you can create a support ticket (see Getting Started to get the right email address). To manage projects in the HPC Project Portal, login and find the project you want to manage (see User Perspective). The project view and management actions are detailed in the sections below, with convenience links here:

Note

Changes made in the HPC Project Portal do not take effect instantly on the clusters themselves. In the best case, they are applied within a few minutes after the first pause in management actions.

In the worst case, one action may get stuck on the cluster, preventing later ones from being performed. We monitor the HPC Project Portal system for these jams but may not notice them right away. If you notice that your changes have not been applied on the cluster after a long time and the situation does not resolve itself, create a support ticket to bring it to our attention (see Getting Started to get the right email address).

Project View

The project view is the same as for users with the role Member (see User Perspective) except that extra management features are shown. An example is shown in the screenshot below.

Screenshot of a project page for a PI showing that there is one visitor requesting access, project name (NHR Internal), HPC Project ID (nhr_internal), the project start and end dates (2024-03-26 to 2030-12-31), the project description (Parent project of internal NHR projects), the project PIs (names blurred), the project path (extern/nhr/nhr_internal), the button to get the join link for the project, an Edit button to edit the project, a button to show the form for adding more users, the active project members list (names, usernames, when they were added, project-specific usernames, their roles, and management buttons), the deactivated project members list (name, username, deactivation time, project-specific username, role, and management buttons), and the list of join requests that haven't been accepted or rejected yet. Names, usernames, and email addresses are blurred.

Project Page for a PI

Project page for a project the user is a PI in. Names, usernames, and email addresses are blurred.

The project is shown in several boxes. If the project has sub-projects, the first box will be a list of the sub-projects. Then, a yellow box is shown if there are any pending join requests. Then, the next box is much like what is shown when browsing projects and includes the following information:

  • In the top row
    • Project name
    • HPC Project ID
    • “View the project actions file” button to see the history of the actions taken on the project
    • Project start and end dates
  • In the rest of the box from top to bottom:
    • Project description
    • Names of the PI/s
    • Project path (where it is in the project tree, see Project Structure for more information)
    • The “Share join link for this project” button to get the join link so you can share it with others
    • The Edit button for changing the project description, making it an open/closed project, etc.

A box for adding new users to the project. It has a single button “Show form” which will make the form to add users visible.

A box of the active users/members of the project. They are presented as a table with each user’s name, Academic Cloud username, when they were added to the project, their project-specific username, and their role. On the right side of each row, an icon (three vertical dots) for the user management menu will appear if you have sufficient permissions to manage that user.

A box of the project’s deactivated users. They are presented as a table of each user’s name, Academic Cloud username, when they were deactivated, their project-specific username, and their role. On the right side of each row, an icon (three vertical dots) for the user management menu will appear if you have sufficient permissions to manage that user.

The last box shows the visitors who have requested to join the project in a table. On the right side of each row, there is an icon (three vertical dots) for processing the join request.

Adding a User

Users are always added with the Member role initially. You can change their role after they have been added.

In the “Add users to project” box, click the “Show form” button to make the form for adding users visible. Then enter the Academic Cloud usernames (e.g. johndoe1) or the email addresses associated with them (e.g. example.user@uni-goettingen.de) for each user you want to add in the provided text box, one user per line. Then click the “Add users to be member” button. This is shown in the screenshot below.

Screenshot of a form to add users to the project, which consists of a text box to enter Academic Cloud usernames or their associated email addresses and a button to add users with the Member role.

Add User Form

Forms to add a user to the project, consisting of a text box to enter the Academic Cloud usernames or associated email addresses of the users and a button to add them.

Then, the project page reloads, and a success/failure box will eventually appear at the top of the page for each added user, which can be closed by clicking the “X” button on the right side of each. See the screenshot below for an example.

Screenshot of notification at the top of the page for attempting to add a user. The notification indicates that the user was successfully added, and a confirmation email sent to their email address.

Add User Success Notification

Success notification for adding a user to the project.

Common reasons adding a user can fail:

  1. The person has not yet logged into the Academic Cloud at least once.
  2. The email address provided is not associated with the person’s Academic Cloud account.
  3. The Academic Cloud username does not exist.
  4. A deactivated account is added, i.e. the account is locked. The portal will give an error message about it.
  5. The portal is overloaded.

For the first, once the person has logged in at least once, you can attempt to add them again. For the second and third, check the email address or Academic Cloud username for typos, ask the person for the correct email address, or use their Academic Cloud username and then attempt to add them again. For the last one, it can help to resubmit the add request.

Changing a User’s Role

To change an active user’s role, click the icon (three vertical dots) on the right hand side of the user’s row in the table to get the menu containing the management options, which are shown in the screenshot below. The icon will not appear if you lack the permissions to manage the user.

Screenshot of the user management menu for an active user showing actions Set to PI, Set to Delegate, and Deactivate.

Active User Management Menu

Management menu for an active user allowing one to change their role or deactivate the user.

Then click the action for the role you want to give them. This will then bring up the confirmation form, which displays the user’s username and email address. Enter any reason, note, etc. in the provided text box, which will be logged into the action history of the project. Then click the “Set PREVIOUSROLE to NEWROLE” button to change their role, or “Abort” to cancel. The form is shown in the screenshot below.

Screenshot of the form to change the role of an active user. It includes the user's username and email address (blurred out), a text box for entering the reason or other notes, an Abort button, and a Set OLDROLE to NEWROLE button.

Change Role Form

Form for changing the role of an active user with a text box to enter the reason, notes, etc. and buttons to do the change or cancel.

Deactivating a User

Users can be deactivated, which

  • Removes them from the project’s Slurm account
  • Disables login with their project-specific username

The project-specific username and all of the user’s files are retained for several months in case a PI or Delegate reactivates them.
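If you want to verify the effect on the cluster side, one way (a sketch, using a hypothetical project-specific username) is to list the Slurm associations of that username on a login node:

# Lists the Slurm account(s) the username u12345 (placeholder) is associated with;
# a deactivated user should no longer appear under the project's account.
sacctmgr show associations user=u12345 format=Account%25,User%15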

To deactivate a user, click the icon (three vertical dots) on the right hand side of the user’s row in the table to get the menu containing the management options, just like for changing a user’s role. The icon will not appear if you lack the permissions to manage the user. Then click the “Deactivate” action. This will then bring up the confirmation form, displaying the user’s username and email address. Enter any reason, note, etc. in the provided text box which will be logged into the action history of the project. Then click the “Deactivate this member” button to deactivate them, or “Abort” to cancel. The form is shown in the screenshot below.

Screenshot of the form to deactivate an active user. It includes the user's username and email address (blurred out), a text box for entering the reason or other notes, an Abort button, and a Deactivate this member button.

Deactivate User Form

Form for deactivating an active user with a text box to enter the reason, notes, etc. and buttons to do the change or cancel.

Reactivating a User

To reactivate a deactivated user, go to the table of deactivated users and click the icon (three vertical dots) on the right hand side of the user’s row in the table to get the menu containing the management options, which are shown in the screenshot below. The icon will not appear if you lack the permissions to manage the user.

Screenshot of the user management menu for a deactivated user showing actions Reactivate.

Deactivated User Management Menu

Management menu for a deactivated user, allowing one to reactivate the user.

Then click the “Reactivate” action. This will then bring up the confirmation form, displaying the user’s username and email address. Enter any reason, note, etc. in the provided text box, which will be logged into the action history of the project. Then click the “Reactivate this member” button to reactivate them, or “Abort” to cancel. The form is shown in the screenshot below.

Screenshot of the form to reactivate a deactivated user. It includes the user's username and email address (blurred out), a text box for entering the reason or other notes, an Abort button, and a Reactivate this member button.

Reactivate User Form

Form for reactivating a deactivated user with a text box to enter the reason, notes, etc. and buttons to do the change or cancel.

Sharing the Join Link

Every project has a join link that people can use to request to join the project. For non-open (“closed”) projects, only PIs and Delegates can get this link. To get it, click on the “Share join link for this project” button below the project description. A form, shown in the screenshot below, will pop up with the join link, a “Copy the link” button to copy it into your clipboard, and a “Close” button to close the form.

Screenshot of the form to get the join link for the project, which shows the join link, a Close button to close the form, and a Copy the link button to copy the link into your clipboard.

Join Link Form

Form for getting the join link, which includes a convenient button to quickly copy it to your clipboard.

Handling Join Requests

Pending join requests are shown in the table in the bottom box, “Non-verified users”. To approve or reject a request to join the project, click the icon (three vertical dots) on the right hand side of the user’s row in the table to get the menu containing the management options. The icon will not appear if you lack the permissions to manage the user. The only option is to handle the request, which must be clicked. This will then bring up the option to approve or reject, which will display the user’s Academic Cloud username, the message they entered in their join request, and the date and time they made their join request. Enter any reason, note, etc. in the provided text box, which will be logged into the action history of the project. Then click “Reject this application” to reject the request or “Approve this application” to approve it. If approved, they will get the role of Member. The form is shown in the screenshot below.

Screenshot of the form to approve or reject a request by a visitor to join the project. It includes the user's Academic Cloud username (blurred out), a text box for entering the reason or other notes, a Reject this application button to reject the application to join, and an Approve this application button to approve the request to join.

Join Request Approval/Rejection Form

Form for approving or rejecting a request to join the project with a text box to enter the reason, notes, etc. and buttons to approve or reject.

Edit Project Name, Description, or Open/Closed

Users with the PI role may edit the name and description of a project or change whether it is open (visible to all users who can then request to join) or closed. Click the “Edit” button below the project description to bring up the page for editing the project, which is shown in the screenshot below. Text boxes are provided to change the project’s name and description. There is also a toggle to set whether the project is open or closed. It is also possible to add more users to the project. After making any desired changes, click the “Submit” button at the bottom.

Screenshot of the edit project form. The form includes a text box to change the project's name, a toggle switch to control whether it is open or closed, short project identifier box (grayed out since even PIs can't change it), a text box to change the description, a text box to add more users to the project, and a Submit button at the bottom to submit the changes.

Edit Project Form

Form for changing a project name or description, changing it to an open/closed project, and adding more members.

Applying for projects on the SCC

Each member/employee of the Georg-August-University of Göttingen and Max Planck society is eligible to request a project in the HPC Project Portal for access to the Scientific Compute Cluster (SCC), under any of the following conditions:

  • You are the head of a work group, institute or faculty responsible for a group of employees and/or PhD-students.
  • You are the primary investigator (PI) of a research project.
  • You are supervising a student writing their Bachelor’s or Master’s thesis using HPC resources.

Group leaders/heads of an institute or faculty can and should have a generic, long-running project to grant HPC access to their employees without the need for a specific project title, description or start-/end dates. These projects will be named “AG <Professorname> - <Work Group Name>” by convention. All other HPC projects for specific research projects, theses, etc. under the same responsible institute/group leader will be sub-projects of this generic project.

Ultimately, PIs of an existing HPC Portal project will be able to create sub-projects on their own. This feature, among other planned improvements, has not been implemented yet.

Applying for an HPC Portal project (Research Group Project)

To apply for a project, please write an email to hpc-support@gwdg.de with a subject like “Request for HPC project on the SCC”, containing the following details:

  • PI: Primary investigator of the project. Example: Prof. Dr. Julian Kunkel
  • Organization: Georg August University, MPG, DPZ, etc. Example: Georg August University of Göttingen
  • Faculty: If the organization is the university, the faculty your institute belongs to. Example: Fakultät für Mathematik und Informatik
  • Institute: Your institute. Example: Institut für Informatik
  • Work group name: If applicable. Example: AG Kunkel - High-Performance Storage
  • Optional: One or more delegates. A delegate is able to perform most tasks in the Project Portal in place of the PI. It is highly recommended to appoint one of your trusted group members to perform the duty of managing project membership.

Applying for sub-projects (Generic Project)

If you already have the above generic “AG Professorname” research group project, it is also possible to apply for sub-projects. These have their own independent set of HPC accounts, since all Project Portal accounts are project-specific. They can, for example, be used by large research groups to partition their users into sub-groups, or to collaborate with externals on a particular subject.

If you are not a PI or delegate of the research group project, please put one of them in the CC of your email when you request such a sub-project. If your association with the research group is unclear, we will also have to confirm with them that you are allowed to use their HPC resources, which can delay the process.

  • Responsible professor: Head of institute/work group/faculty. Example: Prof. Dr. Julian Kunkel
  • Project title: Full title of your project.
  • Project description: Short summary of the topic this project investigates, the goals you want to achieve, etc.
  • Start and end date: Dates determining from when to when the project will be active. After a project ends, its members will no longer be able to submit jobs. Login will remain possible for 90 days after the project ends, in order to download your data.
  • Optional: list of members’ email addresses: Once the project is created, you will be able to add and remove/disable members, but if you want to speed this up, you can include a list of initial project members with your application. Please list only official (for example @uni-goettingen.de, not Gmail or Protonmail, etc.) and valid email addresses, one per line.

Data Migration Guide

While having a separate username for each project has some upsides, such as separate data store quotas and never having to worry about submitting jobs with the wrong Slurm account, a major downside is that files sometimes must be copied or moved between usernames. Common scenarios are:

  • Copying scripts or configuration files in your HOME directory that took effort to create (e.g. .bashrc, .config/emacs/init.el, etc.)
  • Moving files from your legacy username to a new project-specific username
  • Moving files from your username of an expired project to the new username of a successor project
  • “Graduating” from using the SCC to a full NHR project

In all cases, pay attention to which data stores are available on each island of the cluster. Data can only be transferred between two data stores from a node that can access both. See Cluster Storage Map for information on which data stores are available where. If there is no node that can access both, you might have to use an intermediate data store.

Note

This topic requires at least basic understanding of POSIX permissions and groups. Refer to the following links for further information:

https://en.wikipedia.org/wiki/File-system_permissions#Notation_of_traditional_Unix_permissions
https://en.wikipedia.org/wiki/Chmod
https://en.wikipedia.org/wiki/Unix_file_types#Representations

Or take a look at our self-paced tutorial for beginners.

Info

Only the username owning a file/directory can change its group, permissions or ACLs. Even other usernames attached to the same AcademicID are unable to, because the operating system does not know that the different usernames are aliases for the same person.

Warning

Many directories have both a logical path like /user/your_name/u12345 and a real path that points to the actual location on the filesystem.

Please always operate on the real paths which are directories you can actually modify, unlike the symbolic links below /user or /projects which cannot be modified by users.

You can find out the real path with the following command:

realpath /path/to/directory

Alternatively, as a quick way to transparently resolve logical paths, you can add a trailing / at the end of the path when using it in a command.
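For example, using the placeholder path from above, both of the following would operate on the real directory rather than the symbolic link (a minimal illustration using ls; any command behaves the same way):

ls -ld /user/your_name/u12345/                  # trailing slash resolves the symlink
ls -ld "$(realpath /user/your_name/u12345)"     # explicit resolution with realpath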

Also see our general tips on managing permissions. Especially, read the advanced commands section at the end, if any of the commands documented on this page are slow / take a long time to complete.

Strategy

In theory, you have two options for getting your data from the source to its destination:

  • “Push”: Grant write permission to the destination to your username owning the source data and copy/move the data when logged in as the source username.
  • “Pull”: Grant access to the source data to your username that owns the destination directory (or is a member of the destination project), then copy/move the data logged in as the destination username.

In practice, we strongly recommend to always use the “pull” method, and to copy rather than move the data. The reason is simple: regular (unprivileged) users are unable to change the ownership of files/directories to another username. Copied files are owned by the user that created the copy, while moved files retain the original owner.

Using any strategy other than pull+copy will result in your destination username being unable to change the group or permissions of the migrated files/directories. This becomes especially problematic once the old username becomes inactive (due to legacy users being disabled or the old project expiring), because you will then no longer be able to log in as that username.

Note

Legacy usernames, SCC as well as HLRN/NHR, will be disabled at the end of 2025/start of January 2026.

A good way to still effectively “move” your data is to use rsync with the --remove-source-files option. This deletes each source file directly after it has been copied and the integrity of the copy has been verified. It replicates the original directory structure at the destination; the only downside is that it also leaves behind an empty “skeleton” of the directory structure at the source.
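A minimal sketch of such a transfer could look like the following (the source and destination paths are placeholders; run this as the destination username, following the “pull” strategy):

# copy everything, verify each transferred file, then delete it at the source
rsync -av --remove-source-files <source_path>/ <destination_path>/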

Tip

You can use something like

find <source_path> -type d -delete

to remove all nested empty directories. This leaves intact any directories that still contain files, so you can be sure that everything has been safely moved and nothing was forgotten or failed to copy before you delete anything else.
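As an optional sanity check before deleting anything else, you could also list any files still left behind at the source:

find <source_path> -type f    # ideally prints nothing, i.e. every file has been moved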

The rest of this page will focus on setting up the correct permissions, so you are able to execute the above transfer as the (implied) last step.

Determine Method to Use

There are various methods to migrate data, depending on the owner as well as the source and destination data stores. In the table below, you can find the recommended methods for each kind of migration, listed in descending order of preference:

| Source | Destination | Method |
| --- | --- | --- |
| project-specific username | project-specific username with the same AcademicID | Shared AcademicID Group |
| project-specific username | AcademicID username (legacy SCC) | Shared AcademicID Group |
| project-specific username | your legacy HLRN/NHR username | Get Added to Shared AcademicID Group |
| project-specific username | any other username | ACL |
| legacy SCC username | project-specific username with the same AcademicID | Shared AcademicID Group |
| legacy SCC username | any other username | ACL |
| legacy HLRN/NHR username | your project-specific usernames | Get Added to Legacy Group |
| legacy HLRN/NHR username | your project-specific usernames | Get Added to Shared AcademicID Group |
| legacy HLRN/NHR username | your legacy SCC username | Get Added to Shared AcademicID Group |
| legacy HLRN/NHR username | any other username | ACL |
| project | any other project | Between Projects |
Note

The ACL method works on most data stores (some don’t support them) and is the most powerful, but also the most complex. Often, you could use it, but we recommend using the other methods when possible. The data stores that ACLs do not work on are:

Get Added to Legacy Group

Legacy HLRN/NHR usernames have an accompanying POSIX group of the same name, just like legacy projects. Files and directories in the various data stores of your legacy username or project belong to these groups by default. This means in most cases, you can completely skip the step of changing the group or permissions as documented for the other methods on this page, since your new, project-specific u12345 username can just be added to the legacy group by our support team.

When logged in as your target username, use the groups command to list all groups your current username is a member of. If those include your legacy user/project group, your new username should already have access to the old one’s data.
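A check of this kind could look like the following; the output shown is purely illustrative and borrows the usernames from the John Doe example used below:

groups    # the output below is an illustrative example
u12345 HPC_u_jdoe nibjdoe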

A notable exception are the “top-level” personal home or scratch directories, which by default are not readable, writable, or executable by the group (while subdirectories often are). For those, a quick and simple (non-recursive)

chmod g+rX <path>

is usually enough. For example, John Doe’s legacy HLRN username is nibjdoe and his project-specific username u12345 is a member of the group nibjdoe. John would only need to run

chmod g+rX /scratch-grete/usr/nibjdoe

to grant u12345 full read access. Use rwX for read/write access.

Get Added to Shared AcademicID Group

Legacy HLRN/NHR usernames have different AcademicIDs than the AcademicID used for project-specific usernames in the HPC Project Portal and legacy SCC usernames. Thus, the Shared AcademicID Group method cannot be used directly. However, the legacy HLRN/NHR username can be added to your shared AcademicID POSIX group (HPC_u_<academicuser>, where academicuser is the username of the AcademicID) by our support team; simply write a support ticket. Make sure to send the ticket from the email address associated with the accounts, or from one of them if different email addresses are associated with each. This proves that you actually own the accounts in question (you may be asked for additional information to prove ownership if they are associated with different email addresses). Make sure to clearly state both your legacy HLRN/NHR username and the AcademicID whose POSIX group it should be added to. Once your legacy username has been added to the HPC_u_<academicuser> group, proceed with the Shared AcademicID Group method.

Using Shared AcademicID Group

Every AcademicID with at least one project-specific username in the HPC Project Portal has a shared POSIX group of the form HPC_u_<academicuser> (where academicuser is the username of the AcademicID). All of that AcademicID’s project-specific usernames as well as the primary username of the AcademicID itself are members of this group. For example, if John Doe with AcademicID username jdoe is a member of two projects and has two project-specific usernames u12345 and u56789, he will have a group called HPC_u_jdoe with three members: jdoe, u12345, and u56789. This shared POSIX group is provided to facilitate easy data migration between usernames without the risk of giving access to others by accident.

To grant access to a file/directory to the other usernames in the shared AcademicID POSIX group, you would do the following as the username that owns the directories/files (your other usernames lack the permissions):

chgrp [OPTIONS] HPC_u_<academicuser> <path>
chmod [OPTIONS] g+<perms> <path>

If the <path> is a directory, you should generally add the -R option to make the command apply the group/permissions recursively to subdirectories and files. <perms> should be rX for read-only access and rwX for read-write access, where the capital X gives execute permissions only to files/directories that are already executable by the owner.

Warning

Please do NOT use a lower-case x in <perms> when recursively changing directory permissions! That would give execute permissions to all files, even those that should not be executable. Having random files be executable without good reason is confusing in the best case and a potential security risk and risk to your data in the worst.

Info

It is important to remember that your other usernames can’t access <dir>/<file> unless they can also access <dir>, so always be mindful of the parent directory/ies.

Since symlinks are used for many data stores, make sure to end <path> with a / when operating on directories or use $(realpath <path>) to get the fully resolved destination after walking through all symlinks. Otherwise, the commands will fail, trying to operate on the symlink and not the destination. For example, /user/jdoe/u12345 would be a symlink to u12345’s HOME directory, so if you wanted to share that with your other usernames in the same HPC_u_jdoe group, you would run one of the following examples:

chgrp -R HPC_u_jdoe /user/jdoe/u12345/
chgrp -R HPC_u_jdoe $(realpath /user/jdoe/u12345)

Or you could of course just use the real path in the first place.

To give the destination username read-only access to the source, do the following:

  1. Login with the username of the source
  2. Change the group of the source to HPC_u_<academicuser>
  3. Add g+rX permissions to the source directory (recursively)
  4. If you are sharing a subdirectory in your data store, you will need to change the group of the parent directory/directories and add the permission g+X (non-recursively)

For example, suppose John Doe wants to give access to the .config subdirectory of his HOME directory of his legacy SCC username to his other usernames so the configuration files can be copied over. John would do this by logging in with jdoe and running

[jdoe@gwdu101 ~]$ chgrp HPC_u_jdoe ~/
[jdoe@gwdu101 ~]$ chgrp -R HPC_u_jdoe ~/.config
[jdoe@gwdu101 ~]$ chmod g+rX ~/
[jdoe@gwdu101 ~]$ chmod -R g+rX ~/.config

Then, John could access the files from his u12345 username like

[scc_cool_project] u12345@gwdu101 ~ $ cp -R /usr/users/jdoe/.config/emacs ~/.config/

If John wants to keep using the shared directory to create new files with the source username, but by default grant access to the other usernames in HPC_u_<academicuser>, he could also set the SGID bit on the shared directory, so any newly created files will automatically be owned by the correct group:

find <path> -type d -exec chmod g+s {} \;
Tip

See the advanced commands section of the Managing Permissions page if you have a very large number of files/directories and the above commands are taking a long time.

Using ACLs

Data can be migrated using ACLs (Access Control Lists) on most data stores (some don’t support them), but it is more complex. ACLs can be more powerful than regular POSIX permissions, but are not immediately visible and can easily lead to confusion or mistakes. ACLs should be avoided unless you can’t use the easier Shared AcademicID Group or Get Added to Shared AcademicID Group methods.

The basic idea with ACLs is that you can give additional r/w/x permissions to specific users or groups without changing the file/directory owner, group, and main permissions. You can think of them as giving files/directories secondary users and groups with their own separate permissions.

Warning

You must use the username that owns the files/directories to add ACLs to them.

ACLs are added with the setfacl command like

setfacl [OPTIONS] -m <kind>:<name>:<perms> <path>

and removed like

setfacl [OPTIONS] -x <kind>:<name> <path>

where <kind> is u for a user and g for a group, <name> is the username or group name, and <perms> is the permissions. For <perms>, use r for read access, w for write access, and a capital X to grant execute permissions only to files/directories that are already executable by the owner. Add the -R option to apply the ACL recursively to subdirectories and files.

Warning

Please do NOT use a lower-case x in <perms> as that gives execute permissions to all files, even those that should not be executable. Having random files be executable without good reason is confusing in the best case and a potential security risk and risk to your data in the worst.

Info

It is important to remember that your other usernames can’t access <dir>/<file> unless they can also access <dir>, so always be mindful of the parent directory/ies.

Since symlinks are used for many data stores, make sure to end <path> with a / when operating on directories, or use $(realpath <path>) to get the fully resolved destination after walking through all symlinks. Otherwise, the commands will fail, trying to operate on the symlink and not the destination. For example, /user/jdoe/u12345 would be a symlink to u12345’s HOME directory, so if you wanted to share that with username u31913, you would either run

setfacl -m u:u31913:rX /user/jdoe/u12345/

or

setfacl -m u:u31913:rX $(realpath /user/jdoe/u12345)

You can see if a file/directory has ACLs when using ls -l by looking for + sign at the end of the permissions column. ACLs can be displayed using getfacl. The following example demonstrates making two files bar and baz in subdirectory foo, adding an ACL to bar, showing the permissions with ls -l, and then reading the ACLs on bar:

[gzadmfnord@glogin4 ~]$ mkdir foo
[gzadmfnord@glogin4 ~]$ cd foo
[gzadmfnord@glogin4 foo]$ touch bar baz
[gzadmfnord@glogin4 foo]$ setfacl -m u:fnordsi1:r bar
[gzadmfnord@glogin4 foo]$ ls -l
total 0
-rw-r-----+ 1 gzadmfnord gzadmfnord 0 May 21 15:56 bar
-rw-r-----  1 gzadmfnord gzadmfnord 0 May 21 15:56 baz
[gzadmfnord@glogin4 foo]$ getfacl bar
# file: bar
# owner: gzadmfnord
# group: gzadmfnord
user::rw-
user:fnordsi1:r--
group::r--
mask::r--
other::---

To give the destination username read-only access to the source, do the following:

  1. Login with the username of the source
  2. Add rX ACLs for the destination username to the source directory (recursively)
  3. If you are sharing a subdirectory in your data store, you will need to add an X ACL for the destination username to the parent directory/directories (non-recursively)

For example, suppose John Doe wants to give access to the .config subdirectory of his HOME directory of his legacy HLRN/NHR username nibjdoe to his project-specific username u12345 so the configuration files can be copied over. John would do this by logging in with nibjdoe and running

[nibjdoe@glogin3 ~]$ setfacl -m u:u12345:rX ~/
[nibjdoe@glogin3 ~]$ setfacl -R -m u:u12345:rX ~/.config

Then, John could access the files from his u12345 username like

[nib30193] u12345@gwdu101 ~ $ cp -R /mnt/vast-nhr/home/nibjdoe/.config/emacs ~/.config/

If John wants to keep using the shared directory to create new files with the source username, but by default grant access to u12345, he could set a default ACL on the shared directory, so any newly created files and directories will automatically have the same ACL:

setfacl -R -d -m u:u12345:rX <path>

where the -d option is used to specify that the default ACL should be changed. If you want to remove a default ACL, you also need to include the -d option.
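For example, removing that default ACL entry again could look like this (same placeholder path as above):

setfacl -R -d -x u:u12345 <path>    # remove the default ACL entry for u12345 from <path> and below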

Between Projects

Have A Username in Both Projects

With your (legacy) username that is a member of both projects, you can just copy the data from one to the other with rsync or cp, as long as it isn’t too large.
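As a rough sketch (the paths are placeholders; check the Cluster Storage Map for the real locations of your project data stores):

rsync -a /path/to/source_project/data/ /path/to/destination_project/data/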

Warning

If the data is very large, this will be very slow and may harm filesystem performance for everyone. In this case, please write a support ticket so the best way to copy or move the data can be found (the admins have additional more efficient ways to transfer data in many cases).

Have Different Usernames in Both Projects

If the data is small, it can be transferred via an intermediate hop. If the source project is A and the destination project is B and your usernames in both are userA and userB respectively, this would be done as follows (a command sketch is shown after the list):

  1. Copy/move the data from the project datastore of A to a user datastore of userA.
  2. Share the data from the datastore of userA with username userB using the respective method in the table above.
  3. Using username userB, copy the data from the datastore of userA to the destination datastore of project B.
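A hedged sketch of these three steps, using placeholder paths and the ACL method from above for step 2 (substitute whichever sharing method from the table applies to you):

# 1. as userA: stage the data in one of userA's user data stores
rsync -a /path/to/project_A/data/ /path/to/userA_datastore/staging/

# 2. as userA: grant userB read access (ACL method shown as an example)
setfacl -R -m u:userB:rX /path/to/userA_datastore/staging/
setfacl -m u:userB:X /path/to/userA_datastore/        # let userB traverse the parent directory

# 3. as userB: copy the data into the destination project's data store
rsync -a /path/to/userA_datastore/staging/ /path/to/project_B/data/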

Otherwise, please write a support ticket so a suitable way to migrate the data can be found.

Make sure to indicate the source and destination path, their projects, and your usernames in each.

NHR/HLRN Project Migration

Starting in 2024/Q2, all new NHR/HLRN projects are created in the HPC Project Portal. Projects created before that continue to work, except that all management must be done by support ticket (see Getting Started to get the right email address) until they are migrated to the HPC Project Portal. Pre-HPC-Project-Portal projects must migrate at their next renewal, or at an earlier date if the PI/s choose.

Info

Certain management operations such as adding new users to a project are not possible on non-migrated projects, which means that PI/s wanting to do them will need to trigger the migration process early.

Preparing for Migration

Migration is a multi-step process:

  1. The PI/s must create Academic Cloud accounts if they do not already have them. In many cases, one already has an Academic Cloud account as part of the federated institutional login. Otherwise, create an account.
  2. The PI/s must login to Academic Cloud at least once and get their usernames (go to the Account page) or provide the email address their Academic Cloud account is listed under.
  3. (Optional) Any number of non-PI project members do steps 1 and 2 as well and collect their usernames or email addresses their accounts are listed under.
  4. If migrating unprompted (before renewal), a member of the project (a PI or someone authorized by the PI/s) must start a support ticket requesting migration (see Getting Started to get the right email address). If the PI/s have been contacted about migrating, they must simply reply. In both cases, the usernames of the PI/s and any other members to migrate (or the email addresses their Academic Cloud accounts are listed under) must be provided, as well as the user to whom ownership of all existing files/directories should be changed.
  5. We will get confirmation from every PI who did not send the migration request or reply to the migration prompt themselves. This is done to ensure all PIs are on board.
  6. The PI/s (or who they designate) must make an appointment for the day the migration is to take place. This is important because the project must not be used during the migration process.
  7. The migration takes place on the agreed upon day. Note that all the project’s Slurm jobs that have not finished before the migration begins will be cancelled. The migration will take up to 1 day to complete.

Changes During Migration

Migrating a project to the HPC Project Portal brings with it many changes which are summarized below:

Changes for The Project

The project undergoes the following changes during migration:

  1. POSIX group changes from PROJECT to HPC_PROJECT. This is accompanied by a GID change.
  2. Slurm account is changed:
    1. Old Slurm account is deleted (all legacy users removed from it).
    2. New Slurm account with the same name is created in the appropriate location in the project tree.
    3. Remaining compute balance from the old Slurm account is set as the limit of the new account.
    4. The project-specific usernames for the migrated users are added to the new Slurm account.
  3. New data stores are created.
  4. The owner and group of the old data stores are changed to the new POSIX group and the project-specific username of one of the PI/s or some other member the PI/s choose/s. Note that this means the legacy users can no longer access the project data.
  5. Filesystem quotas on the old data stores are rebuilt.

Changes for The Project Members

The following changes happen to the members of the project during the migration:

  1. The legacy NHR/HLRN username of the member is removed from the Slurm account of the project (can no longer submit jobs on the account).
  2. The legacy NHR/HLRN username of the member loses access to the project data due to the change in project POSIX group.
  3. If the PI/s are migrating the member, the member gets a new project-specific username in the new project using their Academic Cloud account which:
    1. Gets a new HOME directory and other data stores.
    2. Is made a member of the new POSIX group of the project, thereby giving access to the project data.
    3. Is added to the project’s new Slurm account.

After the migration, the legacy NHR/HLRN username still works. Login will still be possible, and the data in the HOME directory and other data stores of the NHR/HLRN username will be preserved until 6 months after the last project the legacy NHR/HLRN username is in has ended or been migrated. To migrate data from your legacy NHR/HLRN username to your new one, see the Data Migration Guide.

Project Structure

Project

Projects are the primary unit used to manage compute and storage resources. Projects are organized in a hierarchy with a small number of top-level projects which have sub-projects, which can have their own sub-projects, and so on. The purpose of the project hierarchy is to properly organize resources and attribute compute and storage usage to the right funding source. Each project has the following:

  • Name (human readable)
  • Description
  • A POSIX group if the project has any users
  • A Slurm account
  • Some number of data store directories depending on the kind of project.
  • Zero or more users, each with their own role which is used to control their permissions.
  • Usually a parent project (the exception are the top-level projects)
  • Zero or more sub-projects
  • HPC Project ID, which is used as:
    • The name for the project’s directories in the data stores.
    • The name of the POSIX group with the prefix HPC_ (except for structural projects).
  • Project path, which is the location of the project in the hierarchy represented as a POSIX path of HPC Project IDs.

Each user in a project gets their own project-specific username just for the project, which currently take the form uXXXXX where each X is a base-10 digit. If a user is a member of more than one project, they have separate project-specific usernames (and thus HOME directories, other data stores, etc.) for each but share the same SSH key that is uploaded via Academic Cloud (see the Uploading SSH Keys page).

Having a separate username for each project a person is in has the following benefits:

  • Makes it easy for someone to be in projects of many different kinds (SCC, NHR, KISSKI, REACT, etc.) and even multiple of the same kind (e.g. a project and one of its sub-projects, or a course project and one’s thesis project, etc.).
  • Each project-specific username has only a single Slurm account, that of the project.
    • Never need to pass -A ACCOUNT to sbatch and srun because the only account is the default one.
    • Impossible to accidentally submit Slurm jobs to the wrong account. This helps PI/s and users avoid getting very unpleasant audits from funding institutions and agencies to correct accounting problems.
  • Each project-specific username gets their own HOME directory (and possibly additional directories) with their own quotas.
    • Easy to have separate configurations for each project and not have to worry about them clashing (e.g. some code has a hard coded dependency on say ~/.config/PROGRAM.cfg and you need different versions for each project).
    • Using up all of one’s quota for your HOME directory on one project doesn’t stop one’s work with another project.
    • Accidentally deleting something in the HOME or other directory of the project-specific username on one project doesn’t impact one’s files in other projects.

The main downside is that one has to keep track of one’s separate project-specific usernames for each project and sharing HOME config files requires copying them (see the Data Migration Guide for how to copy and/or move data between usernames). You can log into the HPC Project Portal to see all projects that you are in and your project-specific usernames in each (see the User Perspective page for more information).

Simple Example

A simple example project hierarchy is shown in the diagram below with each project as a yellow rectangle and their users as blue stadiums. Project A is a top-level project with sub-projects B and C. Projects A and B have one user each, and project C has two users. Person Foo Bar (their Academic Cloud ID is foo.bar) is a member of Project A (project-specific username u00001) and Project C (project-specific username u00103). Person Baz Aardvark (their Academic Cloud ID is baz.aardvark) is a member of Project B (project-specific username u00003). Person Bee Clam (their Academic Cloud ID is bee.clam) is a member of Project C (project-specific username u00006).

---
title: Simple Example Project Hierarchy
---
flowchart TB

    subgraph A
        direction LR
        userA1(["u00001 (foo.bar)"])
    end

    subgraph B
        direction LR
        userB1(["u00003 (baz.aardvark)"])
    end

    subgraph C
        direction LR
        userC1(["u00006 (bee.clam)"])
        userC2(["u00103 (foo.bar)"])
    end

    A --> B
    A --> C

If the HPC Project ID of Project B is b, then the POSIX group will be HPC_b (only projects at the top of the tree may have their POSIX group be exactly their HPC Project ID without prefix). Project B would then have its Project Map located at /projects/a/b if we assume the Project A’s HPC Project ID is a. Project B might have a WORK/SCRATCH directory at /mnt/lustre-grete/projects/b. Project-specific username u00003 could have its HOME directory at /home/baz.aardvark/u00003 and a WORK/SCRATCH directory at /mnt/lustre-grete/usr/u00003.

User Roles

Each user in a project has a role that determines what they are allowed to do. The roles are:

  • PI: For the project’s PI/s. Have full project management capabilities.
  • Delegate: Users that the PI/s have given limited project management capabilities to, namely to manage non-PI non-Delegate users. Particularly useful for very large projects and/or busy PI/s to help the PI/s manage the users in the project.
  • Member: Ordinary user with no management capabilities.

The initial role of every added user is Member. Every added user is given a project-specific username, made a member of the project’s POSIX group, and added to the project’s Slurm account. The access permissions of each role are given in the table below:

| Role | Compute Access | File Access |
| --- | --- | --- |
| PI | yes | yes (r/w) |
| Delegate | yes | yes (r/w) |
| Member | yes | yes (r/w) |

The management capabilities/permissions of each role are given in the table below:

| Role | Manage Users | Approve Join Requests | Change Open/Closed and Description |
| --- | --- | --- | --- |
| PI | yes | yes | yes |
| Delegate | only Members | yes | no |
| Member | no | no | no |

Project Hierarchy

The exact project hierarchy at the levels above the projects each person is in is mostly an implementation detail, but in some cases it can be useful to know, particularly for understanding billing and resource usage. The basic hierarchy is laid out below with the project paths and their descriptions.

  • scc – Parent project of all SCC projects.
    • scc/ORGCODE – Subproject for a particular organization using the SCC. Example ORGCODEs are MPG for the Max-Planck-Gesellschaft, UGOE for Georg-August-Universität, UMG for Universitätsmedizin Göttingen, etc.
      • scc/UGOE/FACULTYCODE – Subproject for a faculty at Georg-August-Universität.
        • scc/UGOE/FACULTYCODE/INSTITUTECODE – Subproject for an institute at Georg-August-Universität.
          • scc/UGOE/FACULTYCODE/INSTITUTECODE/* – Subprojects for the work groups, etc. in the institute at Georg-August-Universität.
      • scc/ORGCODE/INSTITUTECODE – Subproject for an institute in the particular organization. Examples would be particular Max-Planck institutes, kliniks and institutes at Universitätsmedizin Göttingen, etc.
        • scc/ORGCODE/INSTITUTECODE/* – Subprojects for the work groups, etc. in the institute.
  • extern – Parent project of all project groups external to the traditional SCC or with their own external funding.
    • extern/nhr – Parent project for all NHR projects.
      • extern/nhr/nhr_STATECODE – Parent project for all NHR projects from a particular federal state (for example, Niedersachsen is extern/nhr/nhr_ni)
        • extern/nhr/nhr_STATECODE/nhr_STATECODE_test – Project for all NHR test accounts in the particular federal state.
        • extern/nhr/nhr_STATECODE/PROJECTNAME – Individual full NHR projects in the particular federal state.
    • extern/kisski – Parent project for all KISSKI projects.
      • extern/kisski/* – Individual KISSKI projects.
    • extern/* – Parent projects for other externally funded projects.
  • intern – Parent project for internal, training, etc. projects
    • intern/gwdg_training – Parent project for training courses (including the GWDG Academy).

Export Control

High Performance Computing is a dual-use good

Supercomputers are defined as dual-use goods, as they can be used for both civil and military purposes. Their export and any “technical assistance” through their use are, in principle, subject to export control laws, specifically German foreign trade law (deutsches Außenwirtschaftsrecht) and EU regulation No. 2021/821 (which governs the export control of dual-use goods).

In addition, there are embargo sanctions against specific countries or persons.

This legal framework may result in restrictions on the use of supercomputers in relation to certain countries or groups of people.

Providing access to the use of computing resources may be considered prohibited or subject to authorization under export control law if the use is specifically related to military use.

Our joint legal obligation is to prevent misuse that would cause significant harm to human dignity, life, health, freedom, property, the environment or peaceful coexistence. All researchers and associates are obliged to contribute to implementing the legal requirements.

Call to action for all users of GWDG HPC resources

To further improve compliance with export control regulations, the following changes will be implemented in the HPC project portal [1]:

  1. Each user must enter her/his nationality in the project portal. It is recorded in the account settings.
  2. Each PI must classify all her/his projects as either “civil”, “dual-use”, or “military” based on the resulting scientific work.
    • This is necessary for all existing as well as upcoming projects.
    • Please check if there are any indications in your ongoing and planned research projects that your research results could produce knowledge, products, or technologies that could be immediately misused for significant harmful purposes (Dual Use Research of Concern, DURC).
  3. In addition to the already existing project description, the PI must explain the project classification (civilian, dual-use, military use).

We understand that these changes are inconvenient. These actions are necessary in order to meet our legal obligations. We aim to be as accessible as possible and enable external partners to the extent permitted by the sanctions.

Project classification

Each computing project must be classified as follows regarding its civil and/or military use:

  • Civil
    • Scientific work that produces knowledge, products or technologies that have exclusively civilian applications.
  • Dual-use
    • Scientific work that has the potential to generate knowledge, products or technologies that could be directly misused by third parties to cause significant harm to human dignity, life, health, freedom, property, the environment or peaceful coexistence (so-called dual-use research of concern).
  • Military
    • Scientific work that produces knowledge, products or technologies that are specifically related to weapons of mass destruction (NBC weapons) and their delivery systems, weapons of war, armaments, missile technology, and end uses relevant to armaments.

Effects for usage

Depending on the home country and project classification, the following computing restrictions might apply.

Check whether the user’s home country is on the embargo list (see the sanctionmap).

  • if country without embargo
    • no restrictions
    • project classification has no influence
  • if country with embargo
    • project classification “civil”
      • user has limited computing access
    • project classification “dual-use”
      • user is denied computing access
    • classification “military”
      • user is denied computing access

If the home country of the PI is under embargo, the embargo consequence is enforced on the entire project.

  • if classification “civil”
    • computing access with respective restrictions for all project members
  • if classification “dual-use”
    • denied computing access for all project members
  • if classification “military”
    • denied computing access for all project members

If the home country of a project member is under embargo, this user cannot be changed to the role Delegate.

The restrictions are as follows:

  • Users from China are not allowed to have GPU access.
  • Users from embargo countries (other than China) have limited computing access of at most 300,000 core hours.

If the computing project is affected by restrictions, the PI and the Delegate can clear this block in the project portal by giving an explanation.

The PI of the project is responsible for the compliance of her/his project members with these terms and for the truthfulness of their statements (see also Terms of Use).

When changes to the sanctions map occur due to political decisions and countries are now under an embargo, or if embargoes are withdrawn, we will update the respective database in the project portal. This update has an immediate effect on the computing access and potential limitations.

For further information, please contact the export control officer at your institution. For employees of the Georg August University of Göttingen, it is the Law and Foundation Department.

Changes in the project portal

Changes will be implemented successively during 2025.

  • In the user account settings the user’s citizenship is recorded.
  • If applicable, restrictions will be recorded in the respective computing project and, in order to protect personal data, will be visible only to the PI, the Delegate, and the affected user.
  • The section “Unclassified projects” provides a list with all projects that are not yet classified by the PI.
  • New members can only be added to computing projects when the project classification is set.

Support

As this is a sensitive topic, our support team will be on hand to address your questions and provide you with the best possible assistance. To contact them please write an email to hpc-support@gwdg.de.


FAQs

Please note: This is not legal advice. We accept no legal liability.

I need support, who can I contact?

For any concern, please write an email to our support team (hpc-support@gwdg.de). They will be on hand to address your questions and provide you with the best possible assistance.

How can I find out which countries are affected?

Please check the sanction map https://sanctionsmap.eu/#/main to learn which countries are affected.

Do I really have to do that?

Yes. All researchers and associates are obliged to contribute to implementing the legal requirements. Our joint legal obligation is to prevent misuse that would cause significant harm to human dignity, life, health, freedom, property, the environment or peaceful coexistence.

The project classification is not only necessary to get access to high performance computing resources, but also when you are in international exchange during your research activities. In a dual-use project, even advising people can be considered giving technical assistance and therefore might be prevented.

What do I need to do?

Visit the project portal.

  1. Each user must enter her/his nationality in the project portal.
  2. Each PI must classify all her/his projects in the project portal as either “civil”, “dual-use”, or “military” based on the resulting scientific work.
  3. In addition to the already existing project description, the PI must explain the project classification (civilian, dual-use, military use).

My project is already running. What do I have to do?

The same regulations apply to new and ongoing projects. See “What do I need to do?” for the required actions.

What do I need to consider as a PI when adding persons to my computing project?

Check if the home country of the person you want to add to your computing project is on the embargo list (see the sanctionmap).

If it is a country without an embargo, the user will be added and there will be no computing restrictions. The project classification has no influence.

If it is a country with an embargo, the project classification is considered. For civil projects the user will be added but has limited computing access. These limitations can be canceled by the PI in the project portal. For dual-use and military projects the user will be added to the project but has no computing access. This block can be canceled by the PI in the project portal.

How can I lift restrictions?

Only the PI and the Delegate can lift restrictions of project members in the project portal by giving an explanation. Keep in mind that you take on the responsibility. The PI and Delegate cannot lift their own restrictions; for this, write a support ticket (hpc-support@gwdg.de).

I want more information. Where can I find it?

What are exemplary research projects and their classification in civil, dual-use, and military?

Exemplary civil research projects

  • Medicine and Biology
    • Basic research leading to the development of vaccines against diseases such as measles, polio, or malaria to protect and improve public health.
    • Research into the causes and treatment of neurodegenerative diseases such as Alzheimer's or Parkinson's that is focused on improving patients' quality of life.
    • Analyzing microorganisms in freshwater lakes or oceans helps to monitor the health of ecosystems and ensure water quality.
  • Social sciences and humanities
    • Studying historical city maps and analyzing how cities have changed over the centuries can contribute to the preservation of cultural heritage and civil urban planning.
    • Research of rare or extinct languages to preserve cultural heritage and document human diversity.
    • Studies on optimizing learning methods, early childhood education, or the integration of students with disabilities to improve the education system.
  • Engineering
    • The development of smart prostheses and exoskeletons that enable people with physical disabilities to walk again or grasp objects to improve the quality of life for those affected.
    • The development of more efficient solar cells, wind turbines, and energy storage systems to provide a sustainable energy supply for households and industry.

Exemplary dual-use research projects

  • Life Sciences
    • In 2012, it became known that researchers had manipulated the H5N1 avian influenza virus so that it could be transmitted through the air in ferrets. The aim of the research was to better understand how the virus could mutate in order to predict a possible pandemic. However, the result, a genetically modified, transmissible virus variant, carries the risk of being misused as a biological weapon.
  • Artificial intelligence (AI)
    • An algorithm that serves civilian purposes, such as detecting tumors in X-ray images or navigating autonomous vehicles, can also be used in military contexts. For example, the same algorithm could be trained to identify strategic targets, specific vehicles, or individuals in satellite images.
    • Research aimed at automatically finding vulnerabilities in software in order to improve cybersecurity can also be exploited by cybercriminals or state actors. Such a tool could be used to prepare attacks on critical infrastructure.
  • Chemistry and material sciences
    • High-performance carbon fibers are used in the civilian sector in the manufacture of sports equipment and lightweight aircraft components, and in wind energy to stabilize turbine blades. However, their high strength and low weight also make them ideal for the manufacture of rocket casings and military drones.
    • Research into new catalytic nanomaterials can dramatically improve the efficiency of industrial processes such as petrochemicals. However, the same findings could also be used to optimize the production of fuels for military purposes or for the synthesis of chemical warfare agents.
  • Digital Humanities
    • Research that analyzes historical documents using AI methods to identify social or political patterns can provide valuable insights into the past. However, this knowledge and these algorithms could also be used by intelligence agencies to monitor vast amounts of data from social media or communication networks and predict behavior patterns.
  • Engineering
    • Computer-controlled (CNC) machine tools are used in the civilian production of automotive parts, medical devices, and aircraft components. However, their precision also makes them ideal for manufacturing rocket parts, submarine components, and other components for military equipment.
    • Small drones can be used by hobbyists for photography or by farmers to monitor fields. With a few adjustments, they can be used for military reconnaissance purposes or as carriers for small explosive devices.
  • Medicine and biology
    • CRISPR enables scientists to precisely modify genes in order to cure diseases or make plants more resistant. However, the same technology could also be misused to make pathogens such as viruses more deadly or transmissible, or to modify human genes for non-therapeutic purposes.
  • Psychology
    • Research into how rumors or disinformation spread on social networks can be used to develop strategies against fake news and promote public health. However, the same findings could also be used to create targeted propaganda campaigns to destabilize societies or for psychological warfare.
  • Agriculture sciences
    • The development of genetically modified crops that are resistant to pests and deliver higher yields can facilitate global food security. However, knowledge about the genetic weaknesses of crops could also be used to develop specific biological weapons against a country's food supply.
  • Forestry
    • Technologies that use satellite or drone imagery to monitor the health of forests are essential for climate protection and sustainable forestry. This data could also be used for military purposes to identify camouflaged positions, troop movements, or secret production sites in forested areas.
  • Chemistry
    • Research into new nanoparticles could lead to innovative medical treatments (e.g., targeted cancer therapy) or more efficient catalysts for energy production. However, the same materials could also serve as carriers for chemical warfare agents or be used in camouflage technology to produce invisible coatings for military vehicles.
  • Earth sciences
    • Monitoring earthquakes and volcanic activity helps to detect natural disasters early on and protect civilians. However, the same seismic measuring devices are also capable of registering the detonation of underground nuclear tests, thus helping to monitor international arms control agreements or serving as a tool for armament.
  • Mathematics and computer science
    • Basic research in cryptography serves to ensure secure communication and transactions on the Internet (e.g., online banking). However, the knowledge gained from this research can also be used to develop unbreakable encryption that could be used by terrorist organizations or criminal networks for their communications.
    • Mathematical models for analyzing social networks can predict the spread of disease in a population or optimize marketing strategies. The same methodology could be used by intelligence agencies to map terrorist networks or find vulnerabilities in an adversary's communication structures.
  • Law and economics
    • Analyzing legal gray areas and developing standards for cyber warfare should strengthen international stability. However, the knowledge gained about these gaps can also be used by states or hacker groups to carry out cyber attacks that are difficult to prosecute legally.
    • Research into global trade flows and supply chains helps to strengthen the resilience of the economy in times of crisis. However, knowledge about critical bottlenecks and dependencies can also be used militarily to deliberately weaken the economy of an enemy state.
  • Humanities and cultural studies
    • Studying how mass manipulation worked in the Third Reich or other totalitarian regimes helps us to recognize and combat totalitarian tendencies in the present day. However, the knowledge gained about the psychological mechanisms of mass communication could also be used by modern autocracies to influence public opinion.
    • Research on social and cultural dynamics in crisis areas helps peace researchers to better understand and resolve conflicts. The same insights are invaluable to the military in understanding the local population and planning operations in foreign cultures.

Exemplary military research projects

  • The development of missiles that fly at more than five times the speed of sound and are maneuverable is primarily aimed at evading enemy defense systems. Research in this area focuses on aerodynamics, new propulsion systems, and heat-resistant materials designed exclusively for this purpose.
  • Research into materials and designs that minimize the radar signature of aircraft, ships, or vehicles to make military units invisible to enemy sensors.
  • Research into systems that generate a strong electromagnetic pulse to disable electronic devices in a specific area is used exclusively for warfare, as it aims to destroy the enemy's communications, power grids, and control systems.
  • The development of robots or drone swarms capable of independently identifying, tracking, and attacking targets.
  • Research into the development of drugs or technologies that increase the alertness, endurance, or stress resistance of soldiers in combat situations has a clear military purpose. This ranges from research into substances that reduce fatigue to prosthetics that exceed the functionality of natural limbs.
  • While civilian Earth observation satellites are used for weather forecasting or climate research, military reconnaissance satellites are specifically designed for espionage. Their research focuses on extremely high-resolution optical or radar systems capable of detecting military installations and troop movements on the ground.
  • Research into sonar systems and acoustic sensors for detecting and tracking submarines. Although sonar is used in civilian shipping, military applications are much more specialized and designed to detect enemy vessels.

Training

All trainings are managed via the GWDG Academy.

See the following sub-pages for workshops and community events:

Explore our introduction videos to get started and make the most of our powerful computing resources.

Last modified: 2025-05-27 08:14:47

Subsections of Training

Community Events

Become a part of our vibrant HPC community by joining our community events:

  • GöHPCoffee, our informal meeting between HPC users and staff
  • GöHAT GöHPC Admin Tea, an informal gathering of administrators from different data centers
  • GöAID AI Developers, our informal collaborative discourse centered around machine learning and deep learning
  • Chat AI Community Meeting, discuss our AI projects, governance, and collaboration opportunities for developers, designers, and users
  • Our Data Management meetings happen every third Tuesday at 3pm, to discuss best practices, challenges, and innovations in handling, storing, and processing scientific data efficiently
  • Our Monthly Storage Talks provide a forum for experts, researchers, and students to share best practices, tackle parallel I/O challenges, and develop essential HPC skills.
  • Our DHXpresso provides a forum for experts and researchers in the field of Digital Humanities. The first event will take place online on the 11th of July from 11:00 to 12:00 and requires registration. Subsequent events will be monthly without registration.

All our meetings are open to anyone, regardless of your expertise in HPC. Please check out the respective event pages linked above for more details on each meeting series.

Subsections of Community Events

NHR Roadshow

This is a meeting room with some comfortable chairs and a window out to a nice skyline. The right hand side has a presenter screen that is showing the slides of the NHR roadshow.

We’re excited to invite you to a workshop, where we’ll introduce the NHR Network for Computational Life Science|Digital Humanities and the vast array of services available in Göttingen. These services can complement your local compute resources. Whether you’re a researcher, academic, or student, this session will empower you to leverage these powerful computing resources and expert support for your work.

How can you get a Roadshow at your place

If this piques your interest, please write a message to Anja Gerbes via our support channels. We are always happy to get together with you and plan a roadshow for your local community.

Agenda Overview:

  • Introduction to the NHR Alliance: Learn about this key initiative and how it supports high-performance computing across Germany.
  • Presentation on NHR-Nord@Göttingen: Explore the services provided by Göttingen’s computing center.
  • Comprehensive Service Offerings: Discover the extensive services available to researchers, including tailored advice, training, and specialized computing resources.
  • Explore Science Domains: Gain insights into the wide range of disciplines supported by the NHR centers.
  • KISSKI AI Service Center: Learn about the cutting-edge infrastructure for sensitive and critical applications.
  • Access to Computing Clusters: Get the details on how you can use these systems for your research.
  • Wrap-up: Key takeaways and resources for getting started.

What is NHR? The National High-Performance Computing (NHR) program is an integral part of Germany’s HPC infrastructure, classified as a Tier 2 center. With nine NHR centers across Germany, NHR provides free access to high-performance computing resources and expert support for universities nationwide.

What We Offer:

  • Access to state-of-the-art computing resources across multiple domains including Life Sciences, Digital Humanities, AI, and more.
  • Support Services such as training, consulting, and funding opportunities for emerging scientists.
  • HPC Infrastructure and a robust scientific computing software stack tailored to a wide variety of research fields.
  • AI-as-a-Service through KISSKI and SAIA, offering scalable solutions and models, including large language models (LLMs), GPUs, and more.

How Can You Access These Resources?

  • For University Staff: Get easy access to Tier 3 (local university clusters) or Tier 2 (NHR systems like Emmy and Grete) with support for research, courses, and external collaborations.
  • KISSKI AI Service Center: Focuses on secure, flexible AI training and inference. Join now to take advantage of free access.

GWDG Academy

These are the HPC and AI related courses hosted at the GWDG Academy.

Training portfolio of the GWDG Academy

Training portfolio of the GWDG Academy as PDF with clickable links

Color legend

  • Beginner: Bright blue #98D4F2
  • Intermediate: Pastel blue #8080FF
  • Advanced: Cerise (red magenta mixture) #DE3163

Full list of courses

Last modified: 2025-07-11 12:17:11

Subsections of GWDG Academy

Subsections of HPC Beginners Courses

Getting started with Linux Bash

Content

Are you afraid of the black box, the one with the $ sign in the front, the terminal? Become acquainted with the Linux Bash in order to harness the megawatts of power the HPC offers.

Want power? We Bash!

Using Linux, and in particular the Bash, is new for many first-time HPC users. This course is specifically designed to make the transition from laptop to Data Center easy. We will slowly and gently get you used to working with the Bash, navigating the file tree to find folders, and editing files. Most importantly, we will show you how to use the SSH protocol to access the front-end servers of our HPC cluster.

Note on the language of the course: We prepare the slides for the course in English, but we can also present in German if all participants understand German.

Learning goal

  • Using the Bash for your work and improving your efficiency
  • Exploring and understanding the Linux file tree
  • Navigating folders and editing files without IDE
  • Using SSH for remote login into the HPC front-end servers

Skills

Trainer

Next appointment

| Date | Link |
| --- | --- |
| 05.06.2025 | https://academy.gwdg.de/p/event.xhtml?id=67331a965d441669671bc61e |
| 05.11.2025 | https://academy.gwdg.de/p/event.xhtml?id=682643dc298a9177e714d870 |
Last modified: 2025-06-09 08:24:50

How to KISSKI

Content

Using the KISSKI services can be a bit overwhelming without knowledge of how to use HPC resources. We will teach you how to get onto the front-end systems and start batch jobs using SLURM.

Additionally, we will work on using other resources such as the Chat AI services and JupyterHub to get you acquainted with running these services.

It is highly suggested to have experience with the Bash. A course covering Linux basics is offered before this course. You will be able to use the resources either way, but optimal utilization is currently only achievable using the Bash.

Note on the language of the course: We prepare the slides for the course in English, but we can also present in German if all participants understand German.

Note This course is renamed to:

Getting Started with the AI Training Platform

Requirements

  • Completion of a Linux or Bash course is highly suggested
  • Some Python knowledge

Learning goal

  • Getting to the front-end servers
  • Using SLURM
  • Resource selection
  • Using the module system
  • Using non-SLURM KISSKI resources

Skills

Trainer

Next appointment

| Date | Link |
| --- | --- |
| 05.06.2025 | https://academy.gwdg.de/p/event.xhtml?id=67331b325d441669671bc61f |
| 05.11.2025 | https://academy.gwdg.de/p/event.xhtml?id=682644de298a9177e714d871 |
Last modified: 2025-06-09 08:24:50

Supercomputing for Every Scientist

Content

For first-time users, the Linux operating system on the GWDG cluster represents a significant entry hurdle, as does preparing their programs and using the batch system. This course is intended to provide a smooth introduction to the topic. We start with connecting to the cluster and give an overview of the most important Linux commands. Compilation and installation of software is then covered. Finally, an outline of the efficient use of computing resources with the batch system is given.

The course shall help new users to get started interacting with the HPC cluster of the GWDG and shows optimal ways to carry out computations.

Note on the language of the course: We prepare the slides for the course in English, but we can also present in German if all participants understand German.

Learning goal

  • Architecture of the GWDG Cluster
  • Connecting with SSH using SSH keys
  • Linux shell basics
  • Using the module system/SPACK
  • Software compilation with cluster specifics
  • Using the scheduler/SLURM in interactive and batch mode
  • Partitions and resources
  • Working on the different file systems (home, scratch, tmp, local)

Skills

Trainer

Next appointment

Date          Link
25.04.2025    https://academy.gwdg.de/p/event.xhtml?id=6733148e5d441669671bc614
02.10.2025    https://academy.gwdg.de/p/event.xhtml?id=6826409f298a9177e714d868
17.12.2025    https://academy.gwdg.de/p/event.xhtml?id=68264a68298a9177e714d87e
Last modified: 2025-07-31 09:45:02

Using the GWDG Scientific Compute Cluster

Content

For first-time users, the Linux operating system on the GWDG cluster represents a significant entry hurdle, as does preparing their programs and using the batch system. This course is intended to provide a smooth introduction to the topic. We start with connecting to the cluster and give an overview of the most important Linux commands. Compilation and installation of software is then covered. Finally, an outline of the efficient use of computing resources with the batch system is given.

The course shall help new users to get started interacting with the HPC cluster of the GWDG and shows optimal ways to carry out computations.

Note on the language of the course: We prepare the slides for the course in English, but we can also present in German if all participants understand German.

Requirements

  • For the practical exercises: GWDG account (preferable) or course account (available upon request)
  • Own notebook

Learning goal

  • Architecture of the GWDG Cluster
  • Connecting with SSH using SSH keys
  • Linux shell basics
  • Using the module system/SPACK
  • Software compilation with cluster specifics
  • Using the scheduler/SLURM in interactive and batch mode
  • Partitions and resources
  • Working on the different file systems (home, scratch, tmp, local)

Skills

Trainer

Next appointment

Date          Link
25.02.2025    https://academy.gwdg.de/p/event.xhtml?id=6731d1d65d441669671bc5ea
26.08.2025    https://academy.gwdg.de/p/event.xhtml?id=68261a55298a9177e714d85a
Last modified: 2025-06-09 08:24:50

Subsections of General HPC usage

Deep Dive into Containers

Content

Containers are increasingly used in HPC because they make users' codes and the programs they need more independent of the exact Linux distro, configuration, software, and software versions residing on a cluster, and avoid having competing software environments in their HOME directory fighting each other. Containers, their place in HPC, and their creation and use in an HPC context, specifically on the GWDG SCC and GWDG NHR clusters, will be described in detail. Furthermore, the GWDG container database and how to use it will be described, followed by the integration of containers with GWDG GitLab (making containers, tests, storage, etc.).

Requirements

  • Using the GWDG SCC and/or GWDG NHR clusters, including Slurm job submission
  • GWDG account for GWDG Gitlab
  • Either a GWDG account activated for HPC on the SCC, a GWDG NHR account, or a course account for GWDG SCC or GWDG NHR
  • Practical experience with the Linux command line, Linux shell scripting, and git.

Learning goal

  • Understand when to use containers for HPC
  • Be able to convert a local workflow on workstation/laptop into a container suitable for HPC systems
  • Be able to run containers on the GWDG HPC systems including getting data in and out of containers, getting MPI and CUDA to work, etc.
  • Use the GWDG container database
  • Use the GWDG GitLab for building containers, using containers for testing, and storing containers

Skills

Trainer

Next appointment

Date          Link
24.06.2025    https://academy.gwdg.de/p/event.xhtml?id=673463cf5d441669671bc639
Last modified: 2025-07-31 09:45:02

Practical Course in High-Performance Computing

Content

This practical course consists of a crash course on the basics of High-Performance Computing, which is delivered during a one-week block tutorial. Including hands-on exercises, it will cover theoretical knowledge regarding parallel computing, high-performance computing, supercomputers, and the development and performance analysis of parallel applications. Practical demonstrations will encourage you to utilize the GWDG cluster system to execute existing parallel applications, start developing your own parallel application using MPI and OpenMP, and to analyze the performance of these applications to ensure they run efficiently. On the first day of the tutorial, we will help you form groups of three to four people to work on the exercises and form a learning community. For students, we will present a group assignment on the last day of the tutorial that you will have to solve in pairs. Students should register via StudIP. If you are just interested in learning about parallel programming and don't need credits, you can join the block tutorial part of the course and earn a certificate.

Further information …

Requirements

  • Programming experience in C++, C or Python
  • Parallel programming concepts
  • Linux

Learning goal

The students will be able to:
  • Construct parallel processing schemes from sequential code using MPI and OpenMP
  • Justify performance expectations for code snippets
  • Sketch a typical cluster system and the execution of an application
  • Characterize the scalability of a parallel application based on observed performance numbers
  • Analyze the performance of a parallel application using performance analysis tools
  • Describe the development and executions models of MPI and OpenMP
  • Construct small parallel applications that demonstrate features of parallel applications
  • Demonstrate the usage of an HPC system to load existing software packages and to execute parallel applications and workflows
  • Demonstrate the application of software engineering concepts

Skills

Trainer

Next appointment

Date          Link
01.04.2025    https://academy.gwdg.de/p/event.xhtml?id=673311e75d441669671bc611
Last modified: 2025-06-09 08:24:50

Secure HPC - Parallel Computing with Highest Security

Content

  • Encryption tools
  • Batch scripting
  • Secure data management

Requirements

  • Fundamental proficiency in Linux commands (e.g., cd, mkdir, …)
  • Initial exposure to SSH for remote access
  • First experience submitting Slurm jobs

Learning goal

  • Understanding the importance of Secure HPC to process workflows that involve sensitive data
  • Familiarize with Secure HPC main steps
  • Understanding the execution with an automatic script

Skills

Trainer

Next appointment

Date          Link
26.05.2025    https://academy.gwdg.de/p/event.xhtml?id=673317d55d441669671bc61b
26.11.2025    https://academy.gwdg.de/p/event.xhtml?id=6826482d298a9177e714d878
Last modified: 2025-07-31 09:45:02

Using Jupyter Notebooks on HPC

Content

For many purposes, running computations interactively is preferable. For interactive computations, Jupyter notebooks are often the first choice for many researchers. We host a JupyterHub instance that allows you to spawn notebooks on HPC nodes. In this course you will learn how to spawn your own notebook and customize it if necessary.

Requirements

  • GWDG account (preferable) or course account (available upon request)
  • Laptop (the course is online)
  • Essentials of using HPC
  • Basic Python skills

Learning goal

  • spawning general Jupyter notebooks
  • spawning customized Jupyter notebooks
  • running computations
  • preparing presentations/demonstrations using Jupyter notebooks

Skills

Trainer

Next appointment

Date          Link
06.02.2025    https://academy.gwdg.de/p/event.xhtml?id=6731d0145d441669671bc5e7
19.08.2025    https://academy.gwdg.de/p/event.xhtml?id=68261850298a9177e714d857
Last modified: 2025-07-31 09:45:02

Subsections of Advanced HPC Topics

High Performance Data Analytics: Big Data meets HPC

Content

Big Data Analytics problems are ubiquitous in scientific research, industrial production and business services. Developing and maintaining efficient tools for storing, processing and analysing Big Data in powerful supercomputers is necessary for discovering patterns and gaining insights for data-intensive topics including biomolecular science, global climate change, cancer research and cybersecurity among others. Big Data Analytics technology is developing tremendously. High Performance Computing (HPC) infrastructure used in processing and analysing Big Data is of great importance in scientific research.

In this course learners will be provided with essential knowledge in emerging tools for Data Analysis in HPC systems. We will investigate parallelization opportunities in standard examples of Big Data analytics. Learners will also acquire skills on how to manage and integrate data into parallel processing tools.

Targeted Audience: Researchers and students using the HPC system for data-intensive problems.

Curriculum:

  1. Data Management and Integration:
    • GWDG Data Pool for Scientific Research
    • Data Lakes and Data Warehouse
  2. Distributed Big Data Analytics Tools and Technology:
    • Using Apache Spark in HPC Systems

Requirements

  • Introduction to SCC course (GWDG Academy) or General knowledge on Linux and HPC system
  • Data Management course (GWDG Academy)
  • Basic understanding of Linear Algebra
  • Basic programming skills in Python

Learning goal

  • Providing interested learners with essential knowledge on emerging tools for Data Analysis in HPC systems
  • Learners will also have an opportunity to work on their own data sets

Skills

Trainer

Next appointment

Date          Link
22.05.2025    https://academy.gwdg.de/p/event.xhtml?id=673457e15d441669671bc637
Last modified: 2025-06-09 08:24:50

Monitoring HPC Systems in the GWDG

Content

GWDG offers different methods for analyzing jobs with regard to compute performance, I/O, and more. Besides tools that have to be started separately, like Vampir, the infrastructure itself offers tools which collect data continuously on the compute nodes and can correlate it to jobs. This data is offered to the users via a front end utilizing the software Grafana, a web-based visualization tool.

This course offers a general overview of monitoring in HPC in order to allow the participants to understand how the systems interact and how data is acquired. Furthermore, it gives an introduction to the usage of Grafana to analyse the collected data of the users' own jobs.

Learning goal

  • Get an overview of monitoring in HPC at the GWDG (What is it? Why?)
  • Understanding what ProfiT-HPC and Grafana are and what they are used for
  • Basic knowledge on Grafana usage (login, check jobs on the dashboard)

Skills

Trainer

Next appointment

Date          Link
13.11.2025    https://academy.gwdg.de/p/event.xhtml?id=68264725298a9177e714d874
Last modified: 2025-06-09 08:24:50

Practical: High-Performance Computing System Administration

Content

This practical course focuses on aspects of system administration in the area of high-performance computing. The course takes the format of a one-week block course with many presentations from various colleagues around the main topic. Furthermore, university students taking this course will contribute to the presentations, as they have worked on their own projects related to HPC system administration. The presentations will include hands-on exercises that are to be completed during the presentations. For those who have no access to the HPC system yet, training accounts will be given out on request.

Further information

Learning goal

  • Discuss theoretic facts related to networking, compute and storage resources
  • Integrate cluster hardware consisting of multiple compute and storage nodes into a “supercomputer”
  • Configure system services that allow the efficient management of the cluster hardware and software, including network services such as DHCP, DNS, NFS, IPMI, SSHD
  • Install software and provide it to multiple users
  • Compile end-user applications and execute them on multiple nodes
  • Analyze system and application performance using benchmarks and tools
  • Formulate security policies and good practice for administrators
  • Apply tools for hardening the system such as firewalls and intrusion detection
  • Describe and document the system configuration

Skills

Trainer

Next appointment

Date          Link
07.10.2024
Last modified: 2025-06-09 08:24:50

Quantum Computing with Simulators on HPC

Content

In the era of noisy intermediate-scale quantum computers (NISQ), quantum computer (QC) simulation plays a vital role in exploring new algorithms, debugging quantum states, or quantifying noise effects. While a laptop is fully suitable for smaller circuits, simulating more Qubits, running large numbers of shots or circuit variations, and simulating noise often require resources on the high performance computing (HPC) scale. This course covers the usage of different QC simulators offered on our HPC systems from both the command line interface (CLI) and our JupyterHub service. It focuses on the advantages of running simulations on an HPC system, mainly scaling to wider and deeper circuits, running shots in parallel, and including noise models.

Requirements

  • Basic knowledge of implementing and running gate based quantum algorithms on a laptop.
  • For the practical exercises: GWDG account (preferable) or course account (available upon request), own laptop.

Learning goal

  • What QC simulators can do
  • How to choose a suitable QC simulator
  • Using simulators on HPC from the CLI and JupyterHub
  • Connecting simulators to QC frameworks
  • Scaling QC simulations

Skills

Trainer

Next appointment

Date          Link
29.04.2025    https://academy.gwdg.de/p/event.xhtml?id=67344b225d441669671bc632
28.10.2025    https://academy.gwdg.de/p/event.xhtml?id=682641fa298a9177e714d86b
Last modified: 2025-06-09 08:24:50

System, User and Developer Perspectives on Parallel I/O

Content

Parallel file systems provide the storage backend of HPC clusters. Since their characteristics have a large impact on the I/O performance and, in consequence, on the runtime of compute jobs, it is important to understand them and how to use them efficiently. This applies both from the perspective of a regular user and from that of developers creating their own codes. The course will give an overview of parallel file systems and parallel I/O. It also covers anti-patterns which result in reduced performance. Furthermore, examples for efficient parallel I/O will be given.

Learning goal

  • Overview of parallel file systems and object storage, storage I/O
  • Understanding of the performance and behavior of storage systems
  • Understanding of good practices for implementing parallel I/O

Skills

Trainer

Next appointment

Date          Link
15.05.2025
Last modified: 2025-06-09 08:24:50

Subsections of Performance Engineering

Introduction to Performance Engineering

Content

HPC systems are expensive and increasingly widely used. Many applications, whether under development or already in use, can be installed on these systems. In order to use the valuable resources offered by these systems, care must be taken not only in the development of efficient algorithms, but also in many hardware, programming, and deployment aspects. An attempt will be made to discuss all important aspects; those which are mainly relevant for HPC systems, such as load balancing and communication overhead, will be covered in more detail.

Overview of factors which influence performance:

  • Basics of parallel computer architecture and topology
  • Parallelization: current methods, Amdahl’s Law
  • Performance bottlenecks
  • Methods and tools for analysis
  • Examples (LIKWID, Vampir & Scalasca)

Possible tools:

Learning goal

  • The course is intended to give a general view of the most relevant aspects for efficient use of resources by applications on HPC systems.
  • The participants will learn basic terminology and will be referred to other sources for further in-depth information and training.

Skills

Trainer

Next appointment

Date          Link
26.09.2024
Last modified: 2025-06-09 08:24:50

Parallel Programming with MPI

Content

The efficient use of modern parallel computers is based on the exploitation of parallelism at all levels: hardware, programming and algorithms. After a brief overview of basic concepts for parallel processing, the course presents in detail the specific concepts and language features of the Message Passing Interface (MPI) for programming parallel applications. The most important parallelization constructs of MPI are explained and applied in hands-on exercises. The parallelization of algorithms is demonstrated with simple examples, and their implementation as MPI programs will be studied in practical exercises.

Contents:

  • Fundamentals of parallel processing (computer architectures and programming models)
  • Introduction to the Message Passing Interface (MPI)
  • The main language constructs of MPI-1 and MPI-2 (Point-to-point communication, Collective communication incl. synchronization, Parallel operations, Data Structures, Parallel I/O, Process management)
  • Demonstration and practical exercises with Fortran, C and Python source codes for all topics; Practice for the parallelization of sample programs; Analysis and optimization of parallel efficiency

Requirements

  • Using the GWDG Scientific Compute Cluster - An Introduction, or equivalent knowledge
  • Practical experience with Fortran, C or Python
  • For the practical exercises: GWDG account (preferable) or course account (available upon request), own notebook

Learning goal

  • Use of MPI for parallelization of algorithms in order to be able to run parallel calculations on several computing nodes.

Skills

Trainer

  • Prof. Dr. Oswald Haan

Next appointment

Date          Link
06.05.2025    https://academy.gwdg.de/p/event.xhtml?id=673315795d441669671bc616
Last modified: 2025-06-09 08:24:50

Performance Analysis of AI and HPC Workloads

Content

With the increasing adoption of AI technologies, evaluating the computational performance of AI applications on HPC systems has become critical for improving results. In this course we shall use HPC performance tools to profile and evaluate performance bottlenecks in deep neural networks and learn tips for efficient training and deployment of deep learning models on HPC systems. Tools covered in this course include NVIDIA Nsight Systems, Score-P and Vampir.

Requirements

  • Practical knowledge of deep learning frameworks, TensorFlow or PyTorch
  • Programming skills in Python
  • Knowledge of Linux

Learning goal

  • The course is intended to equip participants with fundamental knowledge on how to efficiently use HPC systems to run AI applications.

Skills

Trainer

Next appointment

Date          Link
21.05.2025    https://academy.gwdg.de/p/event.xhtml?id=673307bf5d441669671bc60f
01.10.2025    https://academy.gwdg.de/p/event.xhtml?id=68263fc9298a9177e714d867
Last modified: 2025-06-09 08:24:50

Artificial Intelligence / GPU Programming

These are courses that are related to AI or broadly touch the topic. This also includes NVIDIA CUDA courses.

Most of these courses are funded by KISSKI.

Last modified: 2025-06-09 08:24:50

Subsections of Artificial Intelligence / GPU Programming

Deep Learning Bootcamp: Building and Deploying AI Models

Content

This bootcamp is designed to provide an introduction to deep learning. The course will cover the process of building deep learning models using popular frameworks like TensorFlow and PyTorch. Additionally, participants will be introduced to the basics of deploying deep learning models, including the use of web interfaces for model interaction. The course will also include practical exercises where participants will apply their learning to build and deploy a simple AI model.

Requirements

  • Basic understanding of machine learning concepts
  • Familiarity with Python

Learning goal

  • Deep Learning Fundamentals: Understanding the core concepts of neural networks
  • Model Building with TensorFlow and PyTorch
  • Deployment of AI Models

Skills

Trainer

Next appointment

Date          Link
25.02.2025    https://academy.gwdg.de/p/event.xhtml?id=67331cc25d441669671bc621
25.06.2025    https://academy.gwdg.de/p/event.xhtml?id=67331cc25d441669671bc621
16.09.2025    https://academy.gwdg.de/p/event.xhtml?id=68263ed1298a9177e714d864
09.12.2025    https://academy.gwdg.de/p/event.xhtml?id=682649ca298a9177e714d87c
Last modified: 2025-06-09 08:24:50

Deep Learning with GPU Cores

Content

Graphic processors enable efficient deep learning on a massive scale. This course will explore how to use deep learning frameworks with GPU cores. Each participant gets hands-on experience with an example workflow. As the GWDG will offer several high-end GPU nodes, this course will help our attendees to get the most out of our new hardware. Moreover, theoretical concepts such as hardware architecture will be covered.

Requirements

  • Participants should have a basic knowledge of Linux and experience in one programming language (preferably Python).
  • In addition, basic deep-learning understanding is recommended.

Learning goal

After attending the course, participants

  • know basic concepts of GPU cores and when to use them
  • can migrate their deep learning workflow to the GPU
  • have an overview of hardware accelerators available on our clusters

Skills

Trainer

Next appointment

Date          Link
13.02.2025    https://academy.gwdg.de/p/event.xhtml?id=67344a635d441669671bc631
18.09.2025    https://academy.gwdg.de/p/event.xhtml?id=68263caa298a9177e714d860
Last modified: 2025-06-09 08:24:50

Effectively Utilize AI Tools in Research

Info

This is an English-language course aimed at researchers. A German-language course aimed at administrative staff is called: KI in der Verwaltung: Eine Einführung in die Nutzung für alle Mitarbeiter*innen

Content

This course is designed for researchers and scientists from the University of Göttingen, the University Medical Center Göttingen (UMG), and the Max Planck Society who are interested in enhancing their research capabilities through the application of artificial intelligence (AI). Participants will explore how AI can assist in analyzing large datasets, automating routine tasks, and improving literature research and organization. The course also addresses the legal and ethical considerations surrounding the use of AI in research, ensuring that participants are equipped to use these tools responsibly and in compliance with relevant standards.

Learning goal

  • Gain an understanding of how AI can support and enhance research efforts
  • Develop practical skills in using AI tools for data analysis and literature research
  • Learn about the legal and ethical frameworks governing the use of AI in research
  • Explore specific use cases of AI-enhanced research processes
  • Master efficient prompting techniques for AI tools and strategies for integrating these tools into research workflows

Skills

Trainer

Next appointment

Date          Link
27.06.2025    https://academy.gwdg.de/p/event.xhtml?id=673307715d441669671bc60e
15.09.2025    https://academy.gwdg.de/p/event.xhtml?id=68264957298a9177e714d87b
04.12.2025    https://academy.gwdg.de/p/event.xhtml?id=68263e63298a9177e714d863
Last modified: 2025-06-09 08:24:50

Fundamentals of Accelerated Computing with CUDA Python

Content

This workshop teaches you the fundamental tools and techniques for running GPU-accelerated Python applications using CUDA® GPUs and the Numba compiler. You’ll work through dozens of hands-on coding exercises and, at the end of the training, implement a new workflow to accelerate a fully functional linear algebra program originally designed for CPUs, observing impressive performance gains. After the workshop ends, you’ll have additional resources to help you create new GPU-accelerated applications on your own.

Further course information

Note: This course is run on external system resources, for which an account must be created with NVIDIA by the GWDG. Please note the data protection information for external courses.

Requirements

  • Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations. NumPy competency, including the use of ndarrays and ufuncs. No previous knowledge of CUDA programming is required.
  • Desktop or laptop computer capable of running the latest version of Chrome or Firefox. Each participant will be provided with dedicated access to a fully configured, GPU-accelerated workstation in the cloud.
  • Further information for course preparation

Learning goal

At the conclusion of the workshop, you’ll have an understanding of the fundamental tools and techniques for GPU-accelerated Python applications with CUDA and Numba:

  • GPU-accelerate NumPy ufuncs with a few lines of code
  • Configure code parallelization using the CUDA thread hierarchy
  • Write custom CUDA device kernels for maximum performance and flexibility
  • Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth

Skills

Trainer

Next appointment

Date          Link
27.05.2024
Last modified: 2025-06-09 08:24:50

GPU Programming with CUDA

Content

Graphic processors with massive parallelism (GPU) are increasingly used as computational accelerators suitable for highly parallel applications. The latest upgrade of the GWDG Compute Cluster for High Performance Computing has added nodes with a total of 50 GPU accelerator cards. CUDA is a widely used programming environment for GPUs. This course introduces hardware and parallelization concepts for GPUs. The CUDA programming environment is described in detail, both for C and Fortran, including the language elements for controlling the processor parallelism and for accessing the various levels of memory. The use of GPU accelerated libraries (cuBLAS, cuFFT) is demonstrated. All topics are explained by means of examples in practical exercises.

Requirements

  • Using the GWDG Scientific Compute Cluster - An Introduction, or equivalent knowledge
  • Practical experience with C
  • For the practical exercises: GWDG account (preferable) or course account (available upon request), own notebook

Learning goal

  • Use of CUDA to optimize algorithms on GPU

Skills

Trainer

  • Prof. Dr. Oswald Haan

Next appointment

Date          Link
13.05.2025    https://academy.gwdg.de/p/event.xhtml?id=673315ca5d441669671bc617
Last modified: 2025-06-09 08:24:50

KI in der Verwaltung: Eine Einführung in die Nutzung für alle Mitarbeiter*innen

Info

This is a German course targeted to administration staff. An English version targeted to scientists is called: Effectively Utilize AI Tools in Research.

Content

This course is aimed at all administrative staff of the University of Göttingen, the University Medical Center Göttingen (UMG) and the Max Planck Society who are interested in increasing the efficiency of their workflows through the use of artificial intelligence (AI). This includes in particular staff in central and decentralized administration, in secretariats and grant advisory offices, both beginners and active users who want to expand their existing knowledge. The course teaches fundamental concepts and applications of AI that are tailored specifically to the requirements of administrative processes. Participants learn how AI tools can be used in practice to automate routine requests and manage data effectively. In addition, the legal framework and data protection guidelines are covered to ensure a responsible use of AI in administration.

Learning goal

  • Gain an understanding of how AI can increase efficiency in administrative processes
  • Develop practical skills in using AI tools for automation and data analysis
  • Acquire knowledge of the legal framework and data protection guidelines related to the use of AI in administration
  • Gain insights into concrete use cases of AI in administrative processes
  • Learn strategies for formulating effective requests to AI tools and optimize the integration of these tools into everyday administrative work

Skills

Trainer

Next appointment

Date          Link
12.06.2025    https://academy.gwdg.de/p/event.xhtml?id=67331bd65d441669671bc620
22.08.2025    https://academy.gwdg.de/p/event.xhtml?id=6826194d298a9177e714d859
25.11.2025    https://academy.gwdg.de/p/event.xhtml?id=682647ca298a9177e714d877
Last modified: 2025-06-09 08:24:50

Subsections of Data Management in HPC

Data Management Concepts for Efficient and User-Friendly HPC

Content

Data management is generally challenging, particularly on HPC systems. Modern HPC systems offer different storage tiers with different characteristics, for instance the availability of backups, the storage capacity, the I/O performance, the difference between node-local and globally available access, the semantics of the storage system, and the duration for which the storage endpoint is available, ranging from years to quarters and sometimes only hours. This is confusing and entails different challenges and risks.

First of all, users have to be aware of the different storage tiers and their performance profiles to optimize their job runtimes and not leave their jobs starving for data or wait minutes for a Python environment to load. Users then need to move their results back to a storage tier with enough space and durability so as not to lose their results at the end of a computation or soon after. While moving input and output data around, users have to keep oversight over the data provenance to ensure the reproducibility and retrospective comprehensibility of their research.

In addition, sometimes users don't just want to copy an entire data set but want to explore only a concise subset. For this, a data catalog can be used where all available data is indexed with respect to some domain-specific metadata. Once this data catalog is filled with all the data sets of a user, concise queries can be used to select the input data and, ideally, stage it to the correct storage tier as part of the job submission process. This data catalog can also be used to keep oversight of all data distributed over the different storage tiers.

This course will provide an introduction to the different storage tiers available at GWDG and the workloads they should be used for. Then the concept of a data catalog and its usage will be covered. Both parts offer hands-on exercises on our HPC system.

Requirements

  • Some basic experience with working with HPC systems

Learning goal

  • Learn the concept of storage tiers and how to properly use them for best performance and data durability
  • Learn the concept of a data catalog and how to use it to select input data for an HPC job based on domain specific metadata

Skills

Trainer

Next appointment

Date          Link
06.03.2025    https://academy.gwdg.de/p/event.xhtml?id=6731d4a75d441669671bc5f0
09.10.2025    https://academy.gwdg.de/p/event.xhtml?id=6826412d298a9177e714d869
Last modified: 2025-06-09 08:24:50

Using the GöDL Data Catalog for Semantic Data Access on the GWDG HPC Systems

Content

Data management is generally challenging, but particularly so on HPC systems. Due to the tiered storage systems, data may reside on different storage systems. Data-intensive research in particular often involves large data sets with many files. The well-established practice of encoding semantic metadata in paths and filenames quickly becomes unwieldy and hard to employ on very big data sets.

A different approach is to use a data catalog, where a set of metadata tags can be indexed and associated with individual files. This allows files to be identified and accessed based on semantic queries, not on overly complicated paths.

This course will provide a basic introduction into the Data Catalog tool provided by the GWDG on all of its HPC systems. Following a short presentation, participants can explore the tool during a hands-on session on their own.

Requirements

  • Basic experience with HPC systems
  • Basic experience with data management

Learning goal

  • Understand the concept of a data catalog and how to apply it in your use cases
  • Learn how to use the GöDL Data Catalog to ingest, search, stage and migrate your data as part of an overarching HPC workflow

Skills

Trainer

Next appointment

Date          Link
13.03.2025    https://academy.gwdg.de/p/event.xhtml?id=6734551a5d441669671bc634
30.10.2025    https://academy.gwdg.de/p/event.xhtml?id=6826435c298a9177e714d86e
Last modified: 2025-06-09 08:24:50

Using the GWDG Data Pools for Scientific Data Sharing

Content

The scientific community relies heavily on the sharing of standardized datasets, such as ImageNet or Sentinel-2 imagery. To host these popular datasets in a central store, the GWDG offers the Data Pools service. Compared to conventional cloud-based approaches, we achieve significantly higher performance with Data Pools when running on our HPC systems. Additionally, the GWDG provides a number of standard datasets and derived data products, such as machine learning models. This service is not only for users to consume data but also allows them to share and host their own versioned datasets within our HPC systems. Other users of our systems can then use your dataset or data products to conduct their own research. Data Pools are specifically designed for the scientific community, providing versioned datasets that are citable.

Within this course, we will teach you how to discover existing Data Pools and how to publish your own dataset as a Data Pool to share it with others.

Requirements

  • Basic Linux and HPC experience

Learning goal

  • Understand the concept of data pools
  • Learn how to discover existing data pools
  • Learn to publish your own dataset as data pools

Skills

Trainer

Next appointment

Date          Link
18.03.2025    https://academy.gwdg.de/p/event.xhtml?id=67934800dc01d6441dda7807
16.10.2025    https://academy.gwdg.de/p/event.xhtml?id=682641a5298a9177e714d86a
Last modified: 2025-06-09 08:24:50

Application Science / Science Domains

These courses cover specific scientific domains as well as specific pieces of software.

Last modified: 2025-06-09 08:24:50

Subsections of Application Science / Science Domains

Ansys on Cluster and Post-Processing of Simulation Results

Content

ANSYS is a popular software suite used for simulations in engineering applications, which range from optical systems over mechanical systems to space-borne systems. Often users start using ANSYS on local Windows-based workstations. Since the learning curve for using a Linux-based HPC system is steep, alternatives for using clusters without Linux knowledge are very attractive. The ANSYS Remote Solver Manager (RSM) offers this possibility by submitting simulation jobs to clusters without exposing the underlying Linux system. The first part of the course covers this possibility in addition to the classical way of submitting jobs to SLURM. The second part of the course briefly demonstrates different methods to post-process simulation data using the GUI and Python-based approaches.

Requirements

  • Using the GWDG SCC and/or GWDG NHR clusters, including Slurm job submission
  • Local ANSYS License at Institute
  • ANSYS Simulation case needed for testing
  • Either a GWDG NHR account, or a course account for GWDG NHR
  • Practical experience with the Linux command line, Linux shell scripting, ANSYS Workbench

Learning goal

  • Understanding the configuration of Ansys Remote Solver Manager (RSM)
  • Executing simulation jobs using RSM and SLURM
  • Post-processing Ansys Simulations using internal GUI and Python

Skills

Trainer

Next appointment

Date          Link
03.12.2025    https://academy.gwdg.de/p/event.xhtml?id=682648b5298a9177e714d87a
Last modified: 2025-06-09 08:24:50

Debugging Scientific Applications - Illustration on OpenFOAM

Content

The development of scientific software has its own unique challenges compared to general application software. In particular, the parallel execution of computations on multiple cores and nodes is specific to it. Since debugging parallel codes with race conditions is challenging, this course offers advanced developers an introduction to tackling this problem. However, single-core simulation codes can also be difficult to debug for beginners. For this group, the course also gives an introduction to the use of gdb and Visual Studio Code for debugging single-core codes. As an example, the training uses the open source CFD software OpenFOAM.

Requirements

  • Basic understanding of C, C++ for serial applications
  • Experience with MPI for parallel applications
  • Practical experience with the Linux command line and cluster environments
  • Either a GWDG NHR account, or a course account for GWDG NHR

Learning goal

  • Understanding the configuration and usage of Visual Studio Code and gdb for debugging
  • Executing and debugging a simple example problem
  • Understanding the configuration and usage of Totalview

Skills

Trainer

Next appointment

Date          Link
28.05.2024    https://academy.gwdg.de/p/event.xhtml?id=673458695d441669671bc638
Last modified: 2025-06-09 08:24:50

Introduction to AlphaFold

Content

AlphaFold is a groundbreaking machine learning tool in the complex field of protein folding. Simple to use, but hard to master and understand, AlphaFold provides a vital step in many bioinformatics and molecular simulation workflows. In this tutorial, we will cover the theoretical background of AlphaFold, show examples of incorporating it into a research workflow, provide an opportunity to perform hands-on AlphaFold simulations, and explore advanced techniques and alternative algorithms.

Requirements

  • Knowledge of HPC usage in general, working with a terminal
  • Background knowledge of basic biological concepts (DNA, proteins, protein synthesis, etc)
  • Not necessary/will be covered, but nice to have: AI/ML, molecular simulations, GPUs

Learning goal

  • Basics of protein folding with AlphaFold
  • Incorporating AlphaFold into your workflow
  • Hands on folding and simulation experience

Skills

Trainer

Next appointment

Date          Link
29.10.2024    https://academy.gwdg.de/p/event.xhtml?id=66446e929f7c5b49f23998c8
09.09.2025    https://academy.gwdg.de/p/event.xhtml?id=68263d79298a9177e714d861
Last modified: 2025-06-09 08:24:50

Introduction to Research Software Development with MATLAB

Content

With this workshop you get a running start at pragmatic and collaborative development of maintainable research software with MATLAB.

About the Presenters:

Dr. Thomas Künzel is a member of the Academic Customer Success Team at The MathWorks. He supports the successful application of MATLAB/Simulink in basic research. Thomas studied biology at the Ruhr-Universität Bochum and worked in auditory neuroscience research as a post-doc and group leader in Rotterdam and Aachen. He holds a PhD in zoology and animal physiology.

Dr. Franziska Albers is part of the Academic Customer Success team at MathWorks, which helps academics integrate MATLAB and Simulink in their teaching and research. She studied physics in Heidelberg and Münster and holds a PhD in neuroimaging from University Hospital Münster. Her research background is in MR physics and medical imaging.

Requirements

  • This workshop is for researchers who are somewhat familiar with MATLAB and want to write more sustainable code. The workshop will be held in English.
  • You will need a MathWorks account to log into the online workshop environment. No software installations are necessary but if you want to work on your local machine, please install MATLAB R2023b or newer (no toolboxes needed).

Learning goal

  • Learn how to write Clean Code and use modern MATLAB language and IDE features to achieve it.
  • Learn about the concept of writing tests for your code and how you can implement them with MATLAB.
  • Understand how to use git for local source control of your MATLAB code and why it is important for maintaining code quality

Skills

Trainer

  • Dr. Thomas Künzel
  • Dr. Franziska Albers

Next appointment

Date          Link
30.10.2024
Last modified: 2025-06-09 08:24:50

Parallel Computing with MATLAB Part I: Parallelize on your local machine

Content

Are you curious about speeding up your computations with MATLAB? Join us for this hands-on workshop to learn how to speed up your applications on your local machine.

This workshop will focus on speeding up MATLAB on the desktop with Parallel Computing Toolbox. We will discuss best coding practices for performance in MATLAB, basic parallel programming constructs and GPU computing.

This workshop is a hands-on workshop, you will have time to work on exercises and try the concepts on your own.

About the Presenters:

Dr. Thomas Künzel is a member of the Academic Customer Success Team at The MathWorks. He supports the successful application of MATLAB/Simulink in basic research. Thomas studied biology at the Ruhr-Universität Bochum and worked in auditory neuroscience research as a post-doc and group leader in Rotterdam and Aachen. He holds a PhD in zoology and animal physiology.

Dr. Franziska Albers is part of the Academic Customer Success team at MathWorks, which helps academics integrate MATLAB and Simulink in their teaching and research. She studied physics in Heidelberg and Münster and holds a PhD in neuroimaging from University Hospital Münster. Her research background is in MR physics and medical imaging.

Requirements

  • This workshop is for researchers who are familiar with MATLAB and want to speed up their computations. You should know how to use for loops in MATLAB and how to write and call functions. The workshop will be held in English.
  • You will need a MathWorks account to log into the online workshop environment. No software installations are necessary.

Learning goal

  • Participants will be able to apply best coding practices and to use parallel constructs on their local machine to speed up computations.

Skills

Trainer

  • Dr. Thomas Künzel
  • Dr. Franziska Albers

Next appointment

Date          Link
30.10.2024
Last modified: 2025-06-09 08:24:50

Parallel Computing with MATLAB Part II: Scaling up to the GWDG Scientific Compute Cluster

Content

Are you curious about speeding up your computations with MATLAB? Join us for this hands-on workshop to learn how to scale your applications to the GWDG Scientific Compute Cluster (GWDG SCC).

This workshop will focus on best practices for scaling MATLAB Code to the GWDG Scientific Compute Cluster. Attendees will learn how to configure MATLAB to submit jobs to the cluster, best practices for optimizing job submission as well as troubleshooting and debugging.

About the Presenter:

Damian Pietrus is a Parallel Computing Application Engineer at MathWorks. He has a BA from Harvard University, and he joined MathWorks in July of 2017, working in front-line support before joining the Parallel Team in late 2019. Damian's focus is on integrating MATLAB Parallel Server with various HPC environments and helping users to take advantage of these expanded computing resources.

Requirements

  • This workshop is for researchers who are familiar with MATLAB and basic parallel programming constructs like parfor. The workshop will be held in English. For the practical exercises participants will receive course accounts to access the GWDG Scientific Compute Cluster. You will need a local installation of MATLAB on your laptop to submit jobs to the cluster. Please install:
    • MATLAB R2023b
    • Parallel Computing Toolbox
  • If you need help with the setup or do not have access to a MATLAB license, please contact Franziska Albers in German or English.

Learning goal

  • Participants will be able to scale their MATLAB computations to the cluster.

Skills

Trainer

  • Damian Pietrus

Next appointment

Date          Link
06.11.2024
Last modified: 2025-06-09 08:24:50

Snakemake for HPC Workflows

Content

Current HPC workflows can become quite complex, beyond the scope of simple bash scripting or scheduler job dependencies. Snakemake introduces robust tools to generate elaborate workflows, including stage-in and -out options, and interaction with different job schedulers. Snakemake is particularly popular in the field of Bioinformatics, so researchers of this field will profit from attending the course. AI/ML is another discipline that can benefit from organizing job dependencies. In general, the content and experience gathered in this course should be applicable to any discipline where complex interactions between files and jobs and different stages of analysis are involved. The course will consist of an introduction to Snakemake, its concepts and capabilities, as well as 1 or 2 exercises where users will have to develop and apply their own workflows for a Bioinformatics and/or AI/ML task.

Learning goal

  • How to use Snakemake to generate HPC workflows
  • Examples from the bioinformatics and machine learning fields, but the content should be applicable to any discipline

Skills

Trainer

Next appointment

Date          Link
22.05.2024
Last modified: 2025-06-09 08:24:50

HPC Workshops

Deep dive into high-end HPC topics with our workshops. These special events are organized throughout the year with hot topics, hands-on sessions and invited speakers who are experts in their field.

The HPC-related workshops and events are all managed under the Indico@GWDG.

Also have a look at the GWDG Events for more GWDG-related events.

Subsections of Self Paced tutorials

Linux bash tutorial

Warning

This page is a work in progress and subject to change.

If you would like to share some feedback with us regarding this tutorial, you can write us an Email and put [GWDG Academy] into the title.

This tutorial is designed to walk you through the basic commands you need in order to understand the Linux Bash program. Follow the steps one after each other to fully grasp the concept. Each step builds on the results of the previous one.

Table of contents:

Step 0 - Opening a terminal

Open a terminal. On Windows, you can try out PowerShell and SSH to log into a Linux system, or you can use the Windows Subsystem for Linux (WSL). On Mac and Linux, you can simply search for a terminal and use that.

In any case, you will need a Linux-style terminal for this tutorial. If you do not have access to such a system, you can apply for an SCC or NHR test account, which is detailed on this page.

This now is your working environment and the playground we will explore in the next steps.
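
As an illustration, a remote login with SSH from your local terminal looks roughly like the following sketch; the user name and host name here are placeholders, not actual GWDG addresses:

# log into a remote Linux machine via SSH (placeholder address)
ssh your_username@login.example.org
# end the remote session again
exit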

Step 1 - The first command

You should have a blinking or solid cursor now. The program now requires you to type a command via the keyboard and press Enter or Return to run it. Mouse input is only partially possible and will not be covered here.

Please try out the commands:

echo Hallo
echo Hallo world
echo "Hallo world"
echo "Hallo\nword"
echo -e "Hallo\nword"

What did you observe? This was your first command.

A command usually follows this syntax: command <-OPTIONS> <--LONG_OPTIONS> <ARGUMENTS>

The options can use one dash or two dashes. Generally (according to the POSIX standard), short options with a single letter use a single dash and longer options use two dashes. You will learn a great many commands during the next few steps.
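
As a small illustration of short versus long options (the ls command used here is introduced in the next step), the following two commands should produce the same output on most Linux systems:

# short option: a single dash and a single letter
ls -a
# long option: two dashes and a full word
ls --all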

Step 2 - Navigating file system

We will now explore the file system. For this we need a few commands. Please try out the following commands:

ls
ls -l
ls -l -h
ls -a
ls -la
ls -la --color

What do you see? This command is the list command, which lists the contents of a folder. The output will depend on where you are currently located in the file system and it might also be empty for a fresh account in our system.

If you want to find out where you are, use the command pwd. What do you see? This command is called print working directory.

In a UNIX file system, folders are separated by the / sign; in fact, the root of the file system is also called /. You can get there using our next set of commands:

cd /
ls -l
cd ~
ls -la
cd /etc
ls -la
cd $HOME
pwd
ls -la

The new command is called cd and stands for change directory. This is your main tool to navigate through the file system.

With this set of commands you navigated to three different locations. You have now seen the root of the file system located at /, and you have seen the special folder called etc. You have also seen two ways to get back to where you were before, the home directory $HOME, for which ~ is an alias. This home directory is always your starting point. Every user has one, and this is where you store your own files and folders.

Step 3 - The two special folders

You may have already noticed the two folders . and .., which are special folders that exist everywhere.

The . folder is the current one. Try out cd . multiple times and do the pwd command after every try. You should be able to do this infinitely, since you change into the folder you are currently in every time.

The .. folder is the folder above in the file system, meaning one folder higher in the structure. For example, if you are in the folder /home/user/my-folder/ you can run the command cd .., and if you then run the command pwd you will find yourself in the folder /home/user/. Try it out for yourself by running these commands:

pwd
cd ..
pwd
ls -la
cd ..
pwd
ls -la

Do this until you reach the root of the file system.

What you have done now is addressing the file system in a relative way. You addressed the next folder from the current folder.

Going from the root of the file system and navigating like this cd /home/user/my-folder is called absolute addressing.

Both ways are useful, and which one to use depends on the situation. Absolute addressing is very stable on the same system and is always correct unless the folders no longer exist. Relative addressing is more versatile and independent of the system.
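
To see the difference in practice, here is a minimal sketch that reaches the same folder both ways:

cd /          # absolute: jump straight to the root
cd etc        # relative: enter the folder etc inside the current folder
pwd           # shows /etc
cd /etc       # the same folder reached with a single absolute path
pwd           # shows /etc again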

Step 4 - Creating files and folders

Now that you know how to navigate the file system, it is time to start creating files and folders yourself. An easy way for creating files is the touch command. Try out this set of commands:

cd $HOME
pwd
ls -l
touch myfile.txt
touch another_file
touch also a file
touch "also a file"
ls -l

What did you observe? You should now have 6 new files, but why so many?

Now let's create a folder. The command for this operation is called mkdir, which stands for make directory. Try out this set of commands:

cd $HOME
pwd
ls -la
mkdir demo.folder
mkdir another folder
mkdir "another folder"
mkdir another\ folder2
mkdir -p demo.folder
ls -la

How many folders do you have now? 4? The option -p does not complain if the folder already exists, and it also creates any missing parent folders along the given path (see the short sketch at the end of this step).

This should demonstrate that spaces in the arguments lead to each word becoming its own file or folder. You can avoid that by using quotation marks, or just avoid spaces in names completely and replace them with, for example, an underscore _ or a dash - or another symbol. You can also use a . but starting a name with a dot has a different effect. Try out these commands:

cd $HOME
mkdir tryout
cd tryout
ls -la
touch .hidden_file
ls -l
ls -la
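
Coming back to the -p option: because it creates missing parent folders, you can build a whole nested path in one go. A minimal sketch:

cd $HOME
mkdir -p parent/child/grandchild   # creates all three folders at once
ls parent
ls parent/child
rm -r parent                       # cleanup; rm is introduced in the next step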

Step 5 - Deleting files and folders

Now that you have many files you might want to get rid of some of them. The command to do this is called rm and stands for remove. Try out these commands:

cd $HOME
mkdir demo2
cd demo2
ls
touch demo.file
ls
rm demo.file
ls
touch demo.file2
ls
rm -f demo.file2
ls 
touch demo.file3
ls
rm -i demo.file3
ls

This command deletes files without asking for confirmation. If you would like to be asked for confirmation, you need to use the option -i. This can be overridden by the option -f, which forces the deletion.

Removing folders works similarly. Try out these commands:

cd $HOME
mkdir demo-delete
cd demo-delete
pwd
cd ..
pwd
ls
rm -ri demo-delete
ls

mkdir demo-delete2
cd demo-delete2
pwd
touch some_file
cd ..
pwd
rm -ri demo-delete2
ls

mkdir demo-delete3
cd demo-delete3
touch some_other-file
cd ..
pwd
ls
rm -rf demo-delete3
ls
Warning

Deleting files can go very wrong and is mostly not recoverable. Make sure you double check the command!

A classic user error we see is someone running the command rm -rf / and then writing us an email asking why we deleted all of their files. What do you think happened?

In order to delete folders you need to use the recursive flag rm -r. Also, in this example we experimented with interactive and force mode to observe the differences.

If you do not always want to write the recursive flag, you can use the command rmdir, which only removes empty folders.

Step 6 - Copying and moving files and folders

Often you need to copy files or move them to another part of the file system. For moving, the command is called mv and it works the same for files and folders. Try out this set of commands:

cd $HOME
mkdir second_folder
mkdir first_folder
mv second_folder first_folder/
touch super_file
mv super_file first_folder/second_folder/
ls
ls first_folder
ls first_folder/second_folder
rm -ri first_folder
ls

Here we created two folders, moved one into the other and moved a file into the second folder. Also, we removed the entire part of the tree with a single command.

Now, for copying we use the copy command: cp. This one works differently from the move command and requires a recursive flag for folders: cp -r. Try out these commands:

cd $HOME
mkdir second_folder
mkdir first_folder
cp -r second_folder first_folder/
ls
rm -r second_folder
touch super_file
cp super_file first_folder/second_folder/
ls
ls first_folder
ls first_folder/second_folder
rm -ri first_folder
ls
rm -i super_file
ls

Since copying creates a duplicate, this operation can take a while for large files or folders. If you only want to move something, using the mv command is much faster, since the file content is not touched; only the record of where it is located in the file system changes.

Step 7 - Reading files

Now you know how to create, copy, move, and delete files and folders. This looks a bit academic without actually reading the content of files.

The easiest way to read the content of a file is by using the command: cat. It will print the content of a file into the terminal.

If the files are a bit long, you can also try the two programs head and tail, which give you the first few or last few lines. You can set the number of lines by using the -n <number> option.

In case you want to read the full file and also move up and down to actually read it, you can use the program called less. This is called a pager program because it can display full pages of a file and also scroll up and down. In order to get out of that program, you can simply press the q key. Try out these commands:

cd $HOME
cat /etc/aliases
cp /etc/aliases ./
cat aliases
head -n 2 aliases
head -n 10 aliases
head aliases
tail -n 2 aliases
tail -n 10 aliases
tail aliases
less aliases
rm aliases

Step 8 - Finding files

Finding files is very useful, especially if you remember a portion of the name and would like to find out where you have stored it. The author of this tutorial has lost many files in deep file system structures and has so far always found them again using these commands.

You can use two different commands for this operation. The first is called locate and the second is called find. The locate command is a bit simpler, since you just specify the portion of the name you remember and it searches a pre-built index of the file system. This command may not be installed on all systems. The find command has many more options and can be used for very convoluted searches, but generally is the better command if you need something specific; it searches from the directory you give it, for example the one you are in right now. In order to explore these commands, try out this set:

cd $HOME
mkdir folder1
mkdir folder1/folder10
mkdir folder1/folder11
mkdir folder1/folder12
mkdir folder1/folder13
mkdir folder1/folder14
mkdir folder2
mkdir folder2/folder20
mkdir folder2/folder21
mkdir folder2/folder22
mkdir folder2/folder23
mkdir folder2/folder24
cp /etc/passwd folder1/folder13

locate passwd
find . -name "passwd" 
find . -name "*pass*" 
find . -name "*sswd" 
find . -name "*ssw*" 

ls
rm -rf folder1
rm -rf folder2
ls

Try to compare the results. You may also copy the file to a different location and see if both copies are found. Similarly, you can rename the file and see if the patterns still match.

Info

Renaming has not been done before, so think about it. You can copy into a differently named file or move the file to a different name.
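
One way to do the renaming mentioned in this note, as a small sketch:

cd $HOME
cp /etc/passwd mycopy      # copy the file under a new name
mv mycopy renamed_copy     # rename it by moving it to a new name
ls
rm renamed_copy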

You may have noticed the * symbol, which is used as a wildcard here. It stands in for any number of characters and is very useful if you only know part of the name or are looking for something specific, like all files ending with .txt (using *.txt).
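
Wildcards are not limited to find; the shell itself expands them for most commands (so-called globbing). A minimal sketch:

cd $HOME
touch notes.txt report.txt data.csv
ls *.txt        # matches every file ending in .txt
ls data.*       # matches data.csv
rm notes.txt report.txt data.csv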

Step 9 - Searching the content of a file

Now that we can find files, we can also search the content of a file. This can be done with the very useful command called grep. Many scripts and advanced use cases involve this command so it must be useful. This example will explore the file searching abilities of this command, but it can do a lot more.

For now, what we want to do is look for a specific pattern in a file. This can be done with this syntax: grep PATTERN FILE, where the pattern can also be a regular expression. Try out these examples for a quick idea:

grep root /etc/passwd        # print all lines containing "root"
grep -i RooT /etc/passwd     # -i ignores upper/lower case
grep -n root /etc/passwd     # -n also prints the line numbers
grep -v root /etc/passwd     # -v inverts the match: lines NOT containing "root"

This command can not only operate on single files, it can also search through a whole path for files containing a pattern. For simplicity, we will recreate the setup of the last step and run the grep command:

cd $HOME
mkdir folder1
mkdir folder1/folder10
mkdir folder1/folder11
mkdir folder1/folder12
mkdir folder1/folder13
mkdir folder1/folder14
mkdir folder2
mkdir folder2/folder20
mkdir folder2/folder21
mkdir folder2/folder22
mkdir folder2/folder23
mkdir folder2/folder24
cp /etc/passwd folder1/folder13
cp /etc/passwd folder2/folder22/passright
cp /etc/passwd folder2/folder23/passwrong

grep -R root ./
grep -R root folder1/
grep -R root folder2/

ls
rm -rf folder1
rm -rf folder2
ls

Step 10 - TAB-Completion, stopping a stuck program, and clearing the screen

You have now written or copied many lines of folder instructions, especially for navigating, but there is a fast and easy way to do it. The bash offers an autocompletion feature. If a command or path can be expanded automatically, pressing the TAB key will complete it until it is no longer unique. In case nothing happens, you can press the key again or press it twice to begin with. If there are multiple matching options, this will print them and you can select which one to use by typing the next letter. You can explore this by navigating folders:

cd $HOME
mkdir folder
mkdir folder/folder1
mkdir folder/folder2
mkdir folder/folder1/folder3
mkdir folder/folder1/folder4
mkdir folder/folder1/folder5
cd fol <TAB>
cd folder/folder <TAB><TAB>
rm -rf folder

Sometimes a program is running and blocking the bash from processing further inputs. This can be intentional, but it can also be a program that is stuck. You can always interrupt the currently running program by pressing the key combination CTRL+c. You can explore this with the following example that uses the sleep command. This command simply waits for the amount of time specified. Try out these commands:

sleep 30 <CTRL>+c

Now that you have many commands on your screen, you may want to clear it so you do not lose the next commands in a wall of text. This can be done with the clear command. Go ahead and try it.

Step 11 - Getting help (RTFM)

You have learned many commands, especially the sleep command. Have you ever wondered what time options you have with this command? The easiest solution is to use the very comprehensive help options.

Most commands have a built-in help option following this syntax: COMMAND --help, COMMAND -h, or COMMAND help. Alternatively, you can always check the manual 😉.

This manual can be reached using the man command. It will open the manual page in a pager program such as less. As described above, you can always get out using the q key, and you can search by pressing / (forward slash) and entering the word you are looking for.
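
For example, you might look up the help and manual pages of commands you already know (remember to leave the pager with q):

sleep --help
man sleep
man ls
man grep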

Can you find the option to use the sleep command for 5 hours? There are multiple ways to do this. Try out the manual for the other commands.

Step 12 - A basic file editor

So far we have only been reading files, but what about editing? There are several command line editors, ranging from easy to use (nano), to very powerful (emacs), to strange but handy if you know what you are doing (vim). This page has been written using vim, simply because the author prefers it. Feel free to explore the other editors, but here we will focus on nano.

You can open the editor either with or without a filename as the argument. Navigating the lines can be done by using the ARROW keys. Typing works like any other editor.
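
For example, you could start it with a made-up file name; the file is only created once you save it:

nano my-notes.txt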

The options for saving and other actions are written at the bottom of the editor with their key codes; you may need to know that the ^ symbol represents CTRL. Here are some of the more important ones with their key combinations:

  • CTRL+o Saving the file, supply the name or accept overwriting the file name
  • CTRL+x Exit the editor, you will be prompted to save the file or discard changes
  • CTRL+u Paste from the clipboard at the cursor location
  • CTRL+w Search in the document
  • CTRL+a/e Jump to line start/end
  • CTRL+y/v Scroll page up/down
  • ALT+u/e Undo/Redo changes

Explore the other options on your own and consult the manual for further information.

Step 13 - Variables and the environment

As with every program, you have the ability to use variables. In fact, you have already used one, which is called $HOME. You can always check the content of variables by using the echo command. Try out these commands to check some variables:

echo $HOME
echo $SHELL
echo $PATH
echo -e ${PATH//:/\\n}

Something new here is that the curly brackets have more abilities. In this case, they replace every colon symbol : with a line break \n. We will not explore this feature further but just tease that it exists.

There are very many variables already set as part of the environment this bash program is running in. You can see all of them by using the printenv command.

At this point, we assume you have some basic understanding of programming and know what variables are and how they work. Setting variables yourself can be very useful for storing and processing information. You can simply assign a value to a variable using the = sign, but make sure that there is no space before or after the =. Try out these commands:

HALLO=World
echo $HALLO

HALLO=WORLD
echo Hallo $HALLO 

HALLO="Hallo\n world"
echo $HALLO
echo -e $HALLO

These variables are only visible to this bash session for now. If you want to use them in other programs and scripts, you need to make them known to the environment. This can be done by using the export command. Once exported, you can find the variable using the printenv command. Removing variables can be done using the unset command. Try out these commands:

HALLO=world
echo $HALLO
export HALLO
export myvar2="I have some more information"

printenv
printenv | grep -i HALLO
printenv | grep -i myvar

unset HALLO

printenv
printenv | grep -i HALLO
printenv | grep -i myvar

unset myvar2

This example uses one more operation, called a redirect or pipe, with the | symbol. This will be explained in a later step, but it shows again why the grep command is very useful.

Step 14 - Permanent settings

There are some locations/files where you can store your variables to be available each time you open a terminal or log into a server. One such file is called .bashrc and it is located in your home directory, reachable via the $HOME variable. You can freely edit this file using for example nano. There you can put all the exports and variables you like and they will always be available in a new terminal session.

One more interesting thing to set in this file are aliases. You may have noticed that there are several commands for which you always use the same options. For these you can define aliases; one important example is alias rm='rm -i'. This sets up the rm command to always ask for confirmation before deleting files, which is much safer. Some more interesting aliases are listed below, try them out:

alias rm='rm -i'
alias ls='ls --color'
alias ll='ls --color -l -h'
alias la='ls --color -a'
alias lla='ls --color -la -h'

alias cd..='cd ..'
alias ..='cd ..'
alias cd~='cd ~'
alias home='cd $HOME'   # alias names cannot contain spaces, so this shortcut gets its own name

Try putting these changes into the $HOME/.bashrc file. Loading changes of that file into the currently running environment can be done using the source command, like this: source $HOME/.bashrc.
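
As a small sketch, such an addition to $HOME/.bashrc could look like the following lines (which aliases and variables you want is entirely up to you):

# personal settings loaded by every new bash session
export EDITOR=nano
alias rm='rm -i'
alias ll='ls --color -l -h'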

Step 15 - Redirecting command output

You have already seen one redirect, via the | symbol, which is called a pipe. This takes the output that you would have seen printed in the terminal and redirects it as input into another program. The example you have already seen is piping some output into the grep command for filtering. The example was printenv | grep HOME, and you also have access to all the options the grep command offers. For example, you could also try this command, which should give a similar output: printenv | grep -i home.

There is one more interesting set of symbols, the chevrons >, >>, <, and <<. These are designed to redirect file content into commands or command output into files. For example, you can quickly create a file like this: printenv > my_env.txt. Now you can explore the content of this file using one of the tools you have already learned.

These also work with a command like echo. Try out these commands to explore the difference between > and >>:

echo "This is interesting" > my-file.txt
cat my-file.txt
echo "This is also interesting" > my-file.txt
cat my-file.txt

echo "This is interesting" > my-file.txt
cat my-file.txt
echo "This is also interesting" >> my-file.txt
cat my-file.txt

rm -i my-file.txt

echo "This is interesting" >> my-file.txt
cat my-file.txt
echo "This is also interesting" >> my-file.txt
cat my-file.txt

cat < my-file.txt

rm -i my-file.txt

Using a single chevron will create the file and overwrite its content if it already existed, while two chevrons will append to the file. You need to be careful using a single chevron, since you cannot recover overwritten content. Make sure you either always delete the file manually and use double chevrons, or take great care before executing any command. It was at this moment that the writer remembered many painful hours of recovering files that were overwritten by being tired or distracted and forgetting a chevron.

The backwards chevrons work the same way but take a file and give its content as input to a command. The example above is a bit academic, but using cat < my-file.txt at least demonstrates their use. Generally, the forward chevrons are used much more often.

Step 16 - Stepping into the history

The bash keeps a history of all the commands you have used in the past. You can scroll through them by pressing the UP-ARROW key. Similarly, you can print the entire history using the history command. Since this will be a rather long output, feel free to use your knowledge about redirecting and filtering to get only relevant commands, for example with grep: history | grep cat. You can clear the entire history with the -c option.

There are also some useful shortcuts you can take with the history. For example, the command !! (called bang-bang) will run the last command again. Similarly, the command !N will run the Nth command again. You can also give part of a command, as shown in the list and the short illustration below:

  • !TEXT: Will run the last command starting with “TEXT”
  • !?TEXT: Will run the last command containing “TEXT”
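
Here is a short illustration (the echo commands are just placeholders): after the two echo lines, !! repeats echo world, !echo runs the most recent command starting with "echo", and !?hell runs the most recent command containing "hell" (which is echo hello).

echo hello
echo world
!!
!echo
!?hell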

Step 17 - Permissions system

This is now a more advanced concept. We include it here because it is very important to understand who can see and edit your files.

Every user has a username (for example: uxxxxx) and belongs to a few groups (for example: GWDG). The user is unique but might share a group with others, like the writer's colleagues who also work for the GWDG. This allows one to create a file and let those colleagues either read it or even write to it, without anyone else even knowing that this file exists. In order to set this up, you need to understand how Linux uses these permissions.

Every folder and file has permissions, which can be displayed using ls -l, and you may have already seen them before. The output may look like this:

drwxr-x--- 2 uxxxxx GWDG 4096 Jan 27 11:20 test
-rw-r----- 1 uxxxxx GWDG    0 Jan 27 11:20 test.txt

The permissions are always at the front; you also get information about who owns the file (in this case uxxxxx) and which group the file belongs to (GWDG). This information is followed by the size and the last change date.

We will focus on the permissions for now. The first part contains a single letter followed by three triplets; we will simplify the output a bit:

folder | user | group | everyone | file name
d      | rwx  | r-x   | ---      | test
-      | rw-  | r--   | ---      | test.txt

The first letter indicates whether this is a folder or not. The first triplet is the permissions for the owner/user, the second triplet is for the group, and the last triplet is for everyone else. As you can see, there are three options for each triplet: r, w, and x. The ones that are not set have a - instead, showing that this permission is disabled.

  • Reading a file or folder is only allowed if the r option is set.
  • Writing to a file or folder is only allowed if the w option is set.
  • Executing a file or entering a folder is only allowed if the x option is set.

Each of these applies separately to the owner, the group, and everyone else. In the example above, you can see that the file test.txt can be read and written to by the owner/user. The group can only read the file, and everyone else cannot even read it.

The same applies to the folder, but since it is a folder it also requires the x option to be set so that the owner/user or the group can enter it.

Step 18 - Changing permissions

Let's say you have a folder you would like to share with your group, and the group members should be able to edit files in that folder. The group you want to share it with is called GWDG, and your user is a member of that group. The command to update the permissions is called chmod. It takes two arguments: the first is the permission change you would like to make and the second is the file or folder you would like to update. Let's start with the permissions. You can update the permissions for the owner/user u, the group the file belongs to g, everyone else o, or all three a. Additionally, you have the option to grant a permission +, take away a permission -, or set the permissions to something specific =. Here are some examples; observe the change after every command:

cd $HOME
mkdir demo-perm
cd demo-perm
touch file.txt
ls -l

chmod u-w file.txt
ls -l
chmod u+w file.txt
ls -l

chmod g-w file.txt
ls -l
chmod g+w file.txt
ls -l

chmod a-w file.txt
ls -l
chmod a+w file.txt
ls -l

chmod u=rw file.txt
ls -l
chmod g=rw file.txt
ls -l

chmod u=rw file.txt
chmod g=r file.txt
chmod o= file.txt
ls -l

cd ..
ls -l
chmod -R a+w demo-perm
ls -l
ls -l demo-perm

rm -rf demo-perm

You have now also seen that you can change the permissions of all files in a folder by using the -R flag, which runs the command recursively over the folder.

Warning

This is a powerful command, which can lead to many problems if not handled properly. One thing that happens regularly on a multi-user system is that someone sets write permissions for everyone because it is simple, and another user then accidentally deletes all the files, since the permission system allows it. Think before changing the permissions.

Step 19 - Making files executable

If you have been looking for programs online, you may have come across the instruction to make the program executable before you can try it. This is also handled by permissions (remember the x option). This option allows folders to be entered and allows the bash to run a file like a program. It can be set with the command chmod u+x file.txt.

We will explore a possible use case for this permission in the next step. As with all things, you need to be careful which files you make executable since you may accidentally allow a virus or malicious program to run if you do not know what the program actually does.

Step 20 - Shell scripting

You have seen many commands and explored many options for changing files. You also know about variables and have maybe wondered if you can create programs using the bash. Short answer, YES!

A bash program is called a script, since the bash runs through it like a list of commands and executes them one after the other. Usually, the filename has the ending .sh to make clear that this is a program for the bash (or shell). The file always starts with a line like #!/bin/bash, which is called the shebang (or hash-bang) line and tells the system which program is used to execute the commands. This also works for other interpreters, like Perl (#!/usr/bin/perl) or Python (for example #!/usr/bin/python3). Take care with the paths to the programs, they need to be accurate. You can find default paths using the whereis command, which you can try out for the bash: whereis bash.

After this initial line, you can write every valid command for the bash. For example, a script could look like this:

#!/bin/bash

#Always go home first
cd $HOME

#Set some initial variables
FOLD=my-folder
FILES=my-files

mkdir $FOLD
ls
cd $FOLD

#Create some files
touch "$FILES"
touch "$FILES"1
touch "$FILES"1
touch "$FILES"2
ls -l > $HOME/result.txt

As you can see, the marker for a comment is the # symbol.

You can copy this script into a file and change the permissions to make it executable. Say you copied it into a file called my-script.sh; you can set the x permission by running the command chmod u+x my-script.sh. Now that the script has the correct permission, you can run it with the ./ notation, i.e. ./my-script.sh.

Warning

Using rm on a variable can be dangerous in case the variable is empty. This would delete everything in the current folder, or, if the path starts with a /, everything from the root of the file system that you have write access to.
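
A minimal sketch of a safer pattern uses bash parameter expansion to abort when the variable is unset or empty (FOLD is just an example name):

FOLD=my-folder
rm -rf "${FOLD:?is empty, aborting}"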

A bash script can also handle if statements or for loops, but this is a tale for another tutorial (coming soon™).

Exercise - Test your knowledge

Now that you know about a lot of different commands, you can test your knowledge. We have prepared an exercise for you. In order to get this exercise, you will need to use a new command. This time, we will only give you the command, and you can use the manual to find out about the program's abilities.

git clone https://gitlab-ce.gwdg.de/hpc-team-public/bash-tutorial.git

Now you should have a folder called bash-tutorial. Change into it and read the file called README.md. It contains all the information you need for doing this exercise.

The reason for using git here is that code and other work are often distributed this way.

If you would like to share some feedback with us regarding this tutorial, you can write us an Email and put [GWDG Academy] into the title.

Last modified: 2025-06-10 14:55:20

Slurm tutorial

Warning

This page is a work in progress and subject to change.

If you would like to share some feedback with us regarding this tutorial, you can write us an Email and put [GWDG Academy] into the title.

Table of contents:

Requirements

This tutorial assumes that you have learned all that is mentioned in the bash tutorial. Feel free to explore that tutorial or just skip to the exercise and see if you can solve all the steps.

Course information

The steps shown here accompany the HPC courses given at the GWDG Academy. Since the course is given for all three systems, we keep this page general. Please replace every mention of <course> with the value for the respective system, which can be NHR, KISSKI, or SCC. The exact replacements are located near the exercise.

Chapter 01 - The file system

Exploring the options and checking the quota

Every user has access to at least three different storage locations, which are stored in these variables:

  • $HOME
  • $WORK
  • $PROJECT

Go to each location and find the full path of the current working directory.

In the $PROJECT directory, create a folder that has the name of your user account. This folder can be accessed by everyone in the project, so you need to check the permissions of that folder and adjust them. For now, set the permissions so that the folder is not readable by the group. Check the other folders that are there and see if you can read them.
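
One possible approach for this step could look like this (a sketch; $USER holds your username):

cd $PROJECT
mkdir $USER
ls -l
chmod g-r $USER
ls -l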

Once you have done that, check the quota of your storage by using the show-quota command.

Copy files from local system

Follow the instructions on the Data Transfer page to copy a file from your local computer to your folder in the $PROJECT directory. Choose a file which you could share, or just create a new file and upload that one. If you do not know which file to upload, you can create one like this:

date +"%H:%M %d.%m.%Y" > my-file.txt
echo "This file belongs to me" >> my-file.txt

Now you can open up the permissions of the folder again so that everyone in the project can access it, and make the file read-only for the group. Go around and see what the others have uploaded.
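
One possible way to do this (a sketch; my-file.txt is the example file from above and the folder is assumed to be named after your user account):

chmod g+rx $PROJECT/$USER
chmod g=r $PROJECT/$USER/my-file.txt
ls -l $PROJECT/$USER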

Chapter 02 - The module system

Module basics

In order to use any software on the HPC system you need to use the module system. A brief explanation can be found in the module basics page.

Here, you will check the $PATH variable. Do it first without any module loaded. Once you know what is stored in there, follow these steps:

echo $PATH
module load gcc
echo $PATH

module unload gcc
echo $PATH

module load gcc/9.5.0
echo $PATH

Also, check module avail before loading a compiler and after. The output of the command might change depending on the compiler you have chosen. Try this out for the module gcc and the module intel-oneapi-compilers. What has changed?

SPACK

If you cannot find the software you are looking for, one option is to use spack. Once you have loaded the module and sourced the SPACK setup file (source $SPACK_ROOT/share/spack/setup-env.sh), you have access to the full repository of packages. Check the list of SPACK packages to see if your favourite software is available.

As a simple example, try these steps:

module load spack
source $SPACK_ROOT/share/spack/setup-env.sh

spack install ncdu
spack load ncdu
ncdu

This might take a moment.

Chapter 03 - Slurm the scheduler

Here we will need to replace the tag <course> with a name depending on the system. Follow this table to know what to substitute:

System | Partition (-p)
NHR    | standard96s:shared
KISSKI | grete:interactive
SCC    | scc-cpu

First command

The first command we can try is this one:

srun -p <course> -t 02:00 -n 1 hostname

What do you observe? Run this command again; did anything change? The general syntax is srun <options> <command to run>, and the command to run in this example is called hostname.

You can also already try this command:

srun -p <course> -t 02:00 -n 1 /opt/slurm/etc/scripts/misc/slurm_resources

Interactive session

You can also allocate a node, or a portion of a node, for an interactive session. This way, you get a terminal on a compute node to try things out:

srun -p <course> -t 10:00 -n 1 --pty /bin/bash

The additions to the command above are the interactivity flag --pty and the command /bin/bash, which starts the bash program on the node.

Once a node is allocated you can manually run the two commands from above. Run both hostname and /opt/slurm/etc/scripts/misc/slurm_resources.

Difference between -c and -n

Use the srun command with the two options -p <course> and -t 02:00 and the program /opt/slurm/etc/scripts/misc/slurm_resources. This time, adjust the options -c and -n (and the number of nodes where needed) in order to get the combinations listed below; a short syntax sketch follows the list:

  • 10 tasks
  • 10 tasks distributed over 3 nodes
  • 3 nodes with 3 tasks each
  • 1 task with 5 cores
  • 2 tasks per node on 2 nodes with 4 cores per task
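
As a syntax sketch (deliberately not the full solution), the first two combinations could look roughly like this, where -N sets the number of nodes:

srun -p <course> -t 02:00 -n 10 /opt/slurm/etc/scripts/misc/slurm_resources
srun -p <course> -t 02:00 -n 10 -N 3 /opt/slurm/etc/scripts/misc/slurm_resources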

Job scripts

Repeat the task from the last section but this time, write it as a job script (also called batch script). The template could look like this:

#!/bin/bash
#SBATCH -p <course>
#SBATCH -t 02:00
#SBATCH --qos=2h
#SBATCH -o job_%J.out

hostname
srun /opt/slurm/etc/scripts/misc/slurm_resources

Add the required combination of -c and -n to the script. The script also shows two new options: --qos=2h for a shorter queue time and -o job_%J.out to redirect the output to a specific file, where the unique job ID replaces the %J.
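
The job script itself is submitted with sbatch (assuming you saved it as, for example, my-job.sh), after which you can watch it in the queue:

sbatch my-job.sh
squeue --me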

Check the output files and compare these results to the section before. Also, try to run them all at the same time.

Slurm commands

While you are at it, use these commands to check what is going on:

  • squeue --me
  • sinfo -p <course>
  • scancel -u $USER

The last one will cancel all your jobs, so take care.

Last modified: 2025-02-19 11:19:40

AI Competence Training

The tools used in this course include:

For all tools you can check our documentation of the services.

Short course (30 minutes)

The shortened course can be watched as a video to support the self-study process. This course is a quick learning unit without exercises:


You can find the short presentation slides here.

Long course (> 60 minutes)

The interactive course can be watched as a video to support the self-study process. This video also contains exercises:


You can find the full presentation slides here.

Effectively Utilize AI Tools in Research

This is the material related to the GWDG Academy course: Effectively Utilize AI Tools in Research

The tools used in this course include:

For all tools you can check our documentation of the services.

The full course can be watched as a video:


You can find the presentation slides here.

How to read this material

The primary source of information is the video above and the slide deck that goes with it. During the course, you will be asked to try out some ideas and concepts. Feel free to pause the video and try it out. As a starter example you can find two versions below in the first exercise.

Finally, there are some quick tips listed under Cheat Sheet.

If you are done with this material and still have questions, feel free to join the Matrix channel for the AI community. There you can reach the instructors and also interact with other users.

First exercise

Please copy the following text into Chat AI and press enter. This is the example mentioned in the presentation:

Please wtrie an emial to my bos with the following reQuest:
I would like a salray increase, becusse my car is old.
I have alerady worked for the GWDG for two loooong years.

You can also try out this second example on your own. The following text is not an actual email; it is just for teaching. Observe the result:

Subject: Some urgents tasks for you 

Hi Julia,
I hope this email finds you well. I wanted to share a few updates and reminders for the week to ensure everything runs smoothly.
First off, we seem to be missing a consignment from the last shipment—could you please check in with Mr. Hauke to clarify the situation? Let me know what you find out.
Also, don't forget to save the date for the upcoming meeting with the international delegates. Speaking of which, it’s a good idea to confirm the flight details with them via email soon.    While you're at it, could you double-check how large the delegation is? That'll help us finalize the arrangements.
On the logistics side, we’ll need to book a hotel for their stay and plan an evening dinner. Let’s ensure these are sorted well in advance. Once everything is set, it would be great if you  could prepare an itinerary for them as well, covering all key events and times.
Thanks for handling these items. Let me know if you have any questions or need assistance with any of the arrangements.

Best regards,
Jennifer

Extract a Todo list

Cheat Sheet

Clarifying Output Format and Detail Level

  • “Summarize in [X] sentences” – limits the response length
  • “Answer in bullet points” – organizes information neatly
  • “Provide a step-by-step guide” – ideal for instructions or explanations
  • “Explain in simple terms” – makes complex concepts easier to understand
  • “Give a brief overview” – requests a concise response

Providing Context and Purpose

  • “Assume the reader is [background, e.g., a beginner, expert]” – adjusts response depth
  • “For a [specific audience, e.g., marketing team, students]” – tailors the response to the audience’s needs
  • “As if you’re a [specific role, e.g., teacher, scientist]” – applies a tone or level of expertise

Refining Tone and Style

  • “Use a professional/friendly tone” – adjusts the tone
  • “Write as if for a [report, blog, academic paper]” – aligns with different content styles
  • “Add examples to clarify” – includes relatable examples
  • “Make it sound enthusiastic/persuasive” – matches emotional tone

Controlling Depth and Specificity

  • “Focus only on [aspect, e.g., benefits, challenges]” – narrows the topic
  • “Provide a detailed analysis of…” – deepens the response
  • “Highlight key differences between…” – useful for comparisons
  • “Include relevant statistics if available” – adds factual support

Using Personas and Perspectives

  • “Answer as if you’re a [role, e.g., data scientist, historian]” – applies a specific perspective
  • “Imagine you’re explaining this to [a child, a beginner, a peer]” – adjusts complexity
  • “Pretend you are an expert in…” – adds authority to the response

Encouraging Creativity or Exploration

  • “Provide innovative ideas on…” – encourages creative suggestions
  • “Suggest alternative approaches to…” – explores multiple perspectives
  • “Give potential drawbacks and solutions for…” – anticipates challenges and responses
  • “List pros and cons of…” – generates a balanced view

Guiding Structure and Completeness

  • “Begin with an introduction, then cover…” – structures the response
  • “Conclude with a summary” – ensures a cohesive answer
  • “List any prerequisites for understanding this” – adds foundational knowledge if needed
  • “Include relevant terminology with definitions” – clarifies jargon

Constraining Responses with Limits

  • “Limit the response to [X] words/sentences” – controls response length
  • “Respond in [formal, informal] language” – specifies formality
  • “Answer briefly, focusing on the essentials” – keeps it short and direct
  • “Provide up to three examples only” – limits examples
Last modified: 2025-07-31 09:45:02

KI in der Verwaltung

This is the material related to the GWDG Academy course: KI in der Verwaltung (AI in administration)

The services covered in this course include:

All of these services are also listed in our documentation of the services.

We have created a video for this course:


You can find the full presentation slides here.

Cheat Sheet

Clarifying Output Format and Detail Level

  • “Summarize in [X] sentences” – limits the response length
  • “Answer in bullet points” – organizes information neatly
  • “Provide a step-by-step guide” – ideal for instructions or explanations

Providing Context and Purpose

  • “Assume the reader is [background, e.g., a beginner, an expert]” – adjusts the depth of the response
  • “For a [specific audience, e.g., marketing team, students]” – tailors the response to the audience’s needs
  • “As if you were a [specific role, e.g., teacher, scientist]” – applies a tone or level of expertise

Refining Tone and Style

  • “Use a professional/friendly tone” – adjusts the tone
  • “Write as if for a [report, blog, academic paper]” – aligns with different content styles
  • “Add examples to clarify” – supports the answer with relatable examples
  • “Make it sound enthusiastic/persuasive” – matches the emotional tone

Controlling Depth and Specificity

  • “Focus only on [aspect, e.g., benefits, challenges]” – narrows the topic
  • “Provide a detailed analysis of…” – deepens the response
  • “Highlight key differences between…” – useful for comparisons
  • “Include relevant statistics if available” – adds factual support

Using Perspectives and Personas

  • “Answer as if you were a [role, e.g., data scientist, historian]” – applies a specific perspective
  • “Imagine you are explaining this to [a child, a beginner, a colleague]” – adjusts the complexity

Encouraging Creativity or Exploration

  • “Provide innovative ideas on…” – encourages creative suggestions
  • “Suggest alternative approaches to…” – explores multiple perspectives
  • “Give potential drawbacks and solutions for…” – anticipates challenges and responses
  • “List pros and cons of…” – generates a balanced view

Guiding Structure and Completeness

  • “Begin with an introduction, then cover…” – structures the response
  • “Conclude with a summary” – ensures a coherent answer
  • “List any prerequisites for understanding this” – adds foundational knowledge if needed
  • “Include relevant terminology with definitions” – clarifies jargon

Constraining Responses with Limits

  • “Limit the response to [X] words/sentences” – controls the length of the response
  • “Respond in [formal, informal] language” – specifies the formality
  • “Answer briefly, focusing on the essentials” – keeps the response short and direct
  • “Provide up to three examples only” – limits the number of examples
Last modified: 2025-07-31 09:45:02

KI Kompetenz Training

The tools used in this course include:

For all tools you can check our documentation of the services.

Short course (30 minutes)

We have created a short video for this course to support the self-study process. This is a so-called “Quick Learning Unit” without exercises:


You can find the short presentation slides here.

Long course (> 60 minutes)

We have created an interactive video for this course to support the self-study process. It also contains exercises:


You can find the full presentation slides here.

Support

If you have questions or run into problems, you can create a support ticket by sending an email with your question or problem to the appropriate email address in the table below or use one of our support email templates below. This ensures that your request can easily be directed to an expert for your topic of inquiry.

Alternatively, you can reach us and other users via our #hpc-users Matrix room. It is a group chat, where a community of HPC users and experts can exchange information and is a good place to ask or learn about current temporary issues, outages, and ongoing maintenance. You can ask about any kind of issue, but if the answer is more involved or we need to know (or reveal) personal information in order to answer, we will ask you to write a ticket instead.

For urgent support, GWDG offers a general hotline: +49 551 39-30000.

Info

We are here to support you personally.

While we do not currently offer dedicated phone support for HPC, we can call you back for urgent requests. Put [URGENT] into the subject line of your message and include your phone number. We will get back to you as soon as possible. Please use the urgent tag sparingly and include as much concrete information (see below) as you can, to help us find the right person for your inquiry.

Email Address          | Purpose
hpc-support@gwdg.de    | General questions and problems (when in doubt, use this)
nhr-support@gwdg.de    | for NHR users
kisski-support@gwdg.de | for KISSKI users
support@gwdg.de        | Non-HPC issues (e.g. VPN)

When contacting our support, please always include as much relevant information as possible:

  • Your username!
  • What login node(s) you were using
  • Which modules you loaded
  • The exact commands you were running
  • If you have a problem with your jobs,
    • include the job ID and
    • the complete standard output and error (-o/-e <file>).
  • If you have a lot of failed jobs, send at least two outputs. You can also list the job IDs of all failed jobs to help us even more with understanding your problem.
  • If you do not mind us looking at your files, please state this in your request. You can limit your permission to specific directories or files.
  • Please open a new ticket for each issue. It is easier for us to merge two tickets if necessary than to split them up.

Announcement emails and archives

Any significant changes to our systems, downtimes, maintenance, etc. are announced via our mailing lists. Please do not ignore these emails, they often contain important information you need to know when using our HPC systems!

You can find an archive of each announcement sent via the respective mailing list under the following links:

Essentials

Connecting to SSH

Project Management

Chat AI and Request a Chat AI API Key

Last modified: 2025-07-31 09:45:02

Subsections of Support

Known issues

Here are some known issues on the NHR systems:

Citing us

If you publish an article or presentation, please send us a message via one of our support addresses. This will enable us to effectively monitor and highlight the outstanding scientific work being carried out by our community.

NHR Systems

The NHR center NHR-Nord@Göttingen supports its users in the course of their scientific research activities. To make sure this stays possible in the future, public visibility is very important. That is why we ask each user to mention the technical and scientific support in all

  • publications and
  • presentations on workshops and conferences.

Please follow the suggested wording as closely as possible, so that the NHR centers are able to find these phrases again later. You are free to append more text.

The authors gratefully acknowledge the computing time granted by the Resource Allocation Board and provided on the supercomputer Emmy/Grete at NHR-Nord@Göttingen as part of the NHR infrastructure. The calculations for this research were conducted with computing resources under the project <ID of your project>.

On posters, please add the graphical NHR-Nord@Göttingen logo (also available as a PDF version) together with the project ID <ID of your project>.

KISSKI

The KISSKI project supports its users from critical infrastructure sectors with consulting and infrastructure. To make sure this stays possible in the future, public visibility is very important. That is why we ask each user to mention the technical and scientific support in all

  • publications and
  • presentations on workshops and conferences.

Please follow the suggested wording as closely as possible, so that the KISSKI project is able to find these phrases again later. You are free to append more text.

The authors gratefully acknowledge the computing time granted by the KISSKI project. The calculations for this research were conducted with computing resources under the project <ID of your project>.

On posters, please add the graphical KISSKI logo (also available as a PDF version) together with the project ID <ID of your project>.

Scientific Compute Cluster

Please add the following to the acknowledgement section of your publication. It will help us ensure continued funding for further system upgrades.

This work used the Scientific Compute Cluster at GWDG, the joint data center of Max Planck Society for the Advancement of Science (MPG) and University of Göttingen. In part funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 405797229

The GWDG logo is provided on request from oeffentlichkeitsarbeit@gwdg.de.

CIDBN

Please add the following to the acknowledgement section of your publication. It will help us ensure continued funding for further system upgrades.

This work used the HPC cluster Sofja, the dedicated computing resource of the Göttingen Campus Institute for Dynamics of Biological Networks (CIDBN) which is hosted by the GWDG, the joint data center of Max Planck Society for the Advancement of Science (MPG) and University of Göttingen.

The CIDBN logo is available in horizontal, short, and vertical versions.