Data Sharing

This page describes how you can collaboratively work on data within the HPC system and make it available to other users or to the outside world.

When running the show-quota command, you will see all data stores your current user has access to, divided into “User” and “Project” stores. Every member of your HPC Project Portal project can access the same project stores you see listed there. You can configure access for other project members to files owned by you via POSIX group permissions. If you are not familiar with the POSIX permission model, please take a look at our introductory course to learn the basics.

Some general advice and tips on how to best manage your files:

  • Use a capital X when doing recursive chmod operations. With X, execute permission is only added to directories and to files that are already executable by someone (usually the owner). It is generally a very bad idea to mark files as executable that are not supposed to be! That would be confusing in the best case, and a potential security risk and a risk to your data in the worst. To avoid that, use for example:
chmod -R g+rwX example_dir
  • Set the SGID-bit on group-writable directories. This will cause newly created files and subdirectories to be owned by the same group as the parent directory, instead of the primary group of the user that created them (which would almost never be useful).
chmod g+s example_dir
  • Set the sticky bit on a directory if you want others to be able to create new files and subdirectories in it, but forbid editing or deleting files owned by someone else:
chmod +t example_dir
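
The tips above can be tried out safely in a throwaway directory. The following is a minimal sketch, assuming GNU coreutils (for stat -c); the sandbox path and the file names data.txt and run.sh are made up for illustration.

```shell
# Sketch only: everything happens inside a temporary sandbox directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/example_dir/sub"
touch "$sandbox/example_dir/data.txt" "$sandbox/example_dir/run.sh"

# Start from a known state: owner-only access, one executable script.
chmod -R go= "$sandbox/example_dir"
chmod u=rwx "$sandbox/example_dir" "$sandbox/example_dir/sub" "$sandbox/example_dir/run.sh"
chmod u=rw "$sandbox/example_dir/data.txt"

# Capital X: directories and the already-executable script get g+x,
# the plain data file does not.
chmod -R g+rwX "$sandbox/example_dir"

# SGID on the top-level directory, sticky bit on the subdirectory.
chmod g+s "$sandbox/example_dir"
chmod +t "$sandbox/example_dir/sub"

# Inspect the result (symbolic mode and name).
stat -c '%A %n' "$sandbox/example_dir" "$sandbox/example_dir"/*

data_mode=$(stat -c %a "$sandbox/example_dir/data.txt")   # 660
script_mode=$(stat -c %a "$sandbox/example_dir/run.sh")   # 770
dir_mode=$(stat -c %a "$sandbox/example_dir")             # 2770

rm -rf "$sandbox"
```

Checking the modes with stat (or ls -l) after a recursive chmod is a cheap way to catch accidentally executable data files.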

Applying for a project

If you are planning to closely collaborate with people from across multiple working groups, or to invite external collaborators for a specific research topic (who should not have access to all your working group's data), consider applying for a dedicated project. An HPC project will include a shared project directory and allows for dynamically inviting users via the HPC Project Portal. In order to apply as an SCC user, see Applying for SCC projects, otherwise contact our support.

Applying for a group

If a dedicated project is not feasible, it is possible to apply for a POSIX group by contacting our support. Please provide a good reason why you need a POSIX group, a unique group name and a list of usernames you want to be members of the new group.

Using a hidden directory

This is more of a temporary measure, and less secure than other methods, but quick and easy to do. The data will be available to all users that know the path to it. You should send the full path only to people you want to give access to.

Warning

This method will leave all files and directories with non-empty other permissions under your chosen parent directory, especially those with more predictable names, open to all users of the HPC cluster!

Make sure you understand the basics of the POSIX permission model before attempting it!

While the top-level home or project directories do not have other permissions set by default, many files and subdirectories will likely have them. Make sure to unset those beforehand, for example:

chmod -R o= /mnt/vast-nhr/home/jdoe/u12345

Depending on the owning group, it may or may not be advisable to include g in the above command. Read this section to the end (especially the last info box below) for more context.

You have been warned!

Remember that in order to access a file or directory, one needs to be able to access all its parent directories as well. For directories, the read permission decides whether you can list its contents, the write permission decides if you can create or delete files in it, and the execute permission decides whether you can cd into it (or access files and subdirectories if their respective permissions allow for it). The trick is to make one of the parent directories executable only, but not readable.
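
The effect of an execute-only directory can be observed without a second account. The sketch below removes the owner's own read permission on a sandbox directory to simulate what other users would see; the directory name share.a1b2c3d4 is hypothetical, and stat -c assumes GNU coreutils.

```shell
# Sketch only: works in a temporary sandbox, not in your real home.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/parent/share.a1b2c3d4"
echo "hello" > "$sandbox/parent/share.a1b2c3d4/file.txt"

# Execute-only parent: it can be traversed, but not listed.
chmod u=x,go= "$sandbox/parent"
parent_mode=$(stat -c %a "$sandbox/parent")

# Listing is expected to be denied for a regular user
# (root bypasses permission checks, so it would still succeed there).
if ls "$sandbox/parent" >/dev/null 2>&1; then
    listing=allowed
else
    listing=denied
fi
echo "listing the parent: $listing"

# Access via the full, known path still works.
content=$(cat "$sandbox/parent/share.a1b2c3d4/file.txt")
echo "file content: $content"

# Restore owner permissions so the sandbox can be removed again.
chmod u=rwx "$sandbox/parent"
rm -rf "$sandbox"
```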

Info

In this example, we will use the home directory, but you can apply the steps similarly to any other data store. Make sure to use the full canonical path (“real path”) to your directory to avoid confusion!

[nhr_ni_test] u12345@glogin4 ~ $ pwd
/user/jdoe/u12345
[nhr_ni_test] u12345@glogin4 ~ $ realpath .
/mnt/vast-nhr/home/jdoe/u12345

As you can see, the real path is different from the apparent path.

First, create a directory with a random name:

SHAREDIR=$(mktemp -p /mnt/vast-nhr/home/jdoe/u12345 -d share.XXXXXXXX)

This will create a directory with a random name in /mnt/vast-nhr/home/jdoe/u12345 and save the path in the variable SHAREDIR. You can now place the files you want to share in that directory. (Use tab-completion to avoid having to remember the random name or do e.g. cp some_file $SHAREDIR/)

Next, you need to set permissions on the directory:

chmod -R go+rX $SHAREDIR

And make sure the parent directory is not readable, only executable:

chmod go=x $SHAREDIR/..

(The above is equivalent to chmod go=x /mnt/vast-nhr/home/jdoe/u12345 in this example, but works regardless of where you chose to create the “hidden” directory.)

Info

We are changing group as well as other permissions here. This assumes your parent directory is owned by your user’s primary group, which is the case for most home directories by default. These primary groups are most likely very large groups (including everyone from your institution), so chances are high that some of the users you would like to share data with are members of the same primary group, but not all of them. Group permissions take precedence over other permissions, so just setting those permissions for others could exclude users that share your primary group.

If you have changed the owning group of your directory to your HPC_u_<academicid> group or you placed the hidden directory under a project directory, please remove the g from the last two commands. For example: chmod o=x $SHAREDIR/..

In such cases, leave the group permissions untouched or feel free to set them however you prefer.

Your “hidden” directory is now ready. You can send the path to any other users you want to share the data with. To print the path of the shared directory, run:

echo $SHAREDIR

Users who know the path can cd into it and copy the files to their own directories. When the share is no longer needed, don’t forget to unset execute permission of the parent directory to restrict access to any other subdirectories or files again.

chmod go= /mnt/vast-nhr/home/jdoe/u12345
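
The whole life cycle described in this section can be rehearsed in a sandbox before touching a real home directory. This is a minimal sketch, assuming GNU coreutils; the variable home_sim is a made-up stand-in for /mnt/vast-nhr/home/jdoe/u12345 and some_file is an arbitrary example file.

```shell
# Sketch only: home_sim simulates the real home directory.
home_sim=$(mktemp -d)
chmod u=rwx,go= "$home_sim"

# Create the randomly named share and put a file into it.
SHAREDIR=$(mktemp -p "$home_sim" -d share.XXXXXXXX)
echo "example" > "$SHAREDIR/some_file"
chmod u=rw,go= "$SHAREDIR/some_file"

# Open the share and make the parent execute-only for group and others.
chmod -R go+rX "$SHAREDIR"
chmod go=x "$SHAREDIR/.."

share_mode=$(stat -c %a "$SHAREDIR")             # 755
file_mode=$(stat -c %a "$SHAREDIR/some_file")    # 644
parent_mode=$(stat -c %a "$home_sim")            # 711

# Close the share again once it is no longer needed.
chmod go= "$SHAREDIR/.."
closed_mode=$(stat -c %a "$home_sim")            # 700

rm -rf "$home_sim"
```

Comparing the modes before and after the final chmod is a quick way to confirm that the parent directory is closed again.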

Using ACLs

ACLs (Access Control Lists) are a more advanced, but also more complex method of defining fine-grained permissions in addition to the traditional POSIX permissions. They work on most filesystems, but not all of them. Because they are not immediately visible, they are easier to forget or make mistakes with. ACLs should be preferred over the “hidden” directory method above when you want to share a directory long-term and with a smaller number of people. We still recommend using them only when other methods (like a common project or POSIX group) are not available.

How to set and manage ACLs on your directories is described in our Data Migration Guide.
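
As a rough illustration only (the guide mentioned above remains the authoritative reference), granting and revoking read access for a single user with the standard setfacl and getfacl tools might look like the sketch below. It assumes these tools are installed and that the filesystem supports ACLs; the variable colleague is a placeholder that is filled with your own username here so the example runs anywhere.

```shell
# Sketch only: runs in a temporary sandbox directory.
sandbox=$(mktemp -d)
mkdir "$sandbox/shared_dir"

# Placeholder: substitute the username of the person you want to share with.
colleague=$(id -un)

if command -v setfacl >/dev/null 2>&1 &&
   setfacl -R -m "u:${colleague}:rX" "$sandbox/shared_dir" 2>/dev/null; then
    # Show the resulting ACL entries.
    getfacl "$sandbox/shared_dir"
    acl_result=set
    # Remove the entry again when access is no longer needed.
    setfacl -R -x "u:${colleague}" "$sandbox/shared_dir"
else
    # setfacl is missing or the filesystem does not support ACLs.
    acl_result=unavailable
fi
echo "ACL test: $acl_result"

rm -rf "$sandbox"
```

As with plain permissions, the target user still needs execute permission on every parent directory of the shared directory.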

Using S3

In order to use S3, you have to apply for an S3-Bucket by writing an email to support@gwdg.de and asking for one that is accessible from the HPC system. You can then share your access key and secret key within your group to give everyone access. In this scenario, your data is accessed via HTTP(S), and it is reachable not only from the HPC system, but also from the Cloud and the Internet (if needed).

You can access your S3-Bucket from a compute node using https://s3.gwdg.de as an endpoint.

In order to work with your S3-Bucket, you could for instance use rclone:

module load rclone
# List the content of your bucket
rclone ls <config-name>:<bucket-name>/<prefix>
# Or sync the content of a folder in your $HOME to the bucket
# (-i asks for confirmation before each change)
rclone sync -i $HOME/some/folder <config-name>:<bucket-name>/<prefix>
# Or sync the content of your bucket to a folder in your $HOME
rclone sync -i <config-name>:<bucket-name>/<prefix> $HOME/some/folder

This requires a config file in ~/.config/rclone/rclone.conf with the following content:

[<config-name>]
type = s3
provider = Ceph
env_auth = false
access_key_id = <AccessKey>
secret_access_key = <SecretKey>
region =
endpoint = https://s3.gwdg.de
location_constraint =
acl =
server_side_encryption =
storage_class =
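
Since this file contains your keys, it is sensible to keep it readable by you alone. The sketch below writes a reduced version of the config to a temporary location with restrictive permissions; the remote name my-s3 is a made-up example, the key values are left as placeholders, and stat -c assumes GNU coreutils.

```shell
# Sketch only: conf_dir stands in for ~/.config/rclone.
conf_dir="$(mktemp -d)/rclone"
mkdir -p "$conf_dir"
conf_file="$conf_dir/rclone.conf"

# Create the file with owner-only permissions before writing any secrets.
( umask 077; touch "$conf_file" )
chmod 600 "$conf_file"

cat > "$conf_file" <<'EOF'
[my-s3]
type = s3
provider = Ceph
env_auth = false
access_key_id = <AccessKey>
secret_access_key = <SecretKey>
endpoint = https://s3.gwdg.de
EOF

conf_mode=$(stat -c %a "$conf_file")                     # 600
endpoint_lines=$(grep -c '^endpoint = https://s3.gwdg.de$' "$conf_file")

# With real keys filled in, the remote could then be checked with:
#   rclone --config "$conf_file" lsd my-s3:
```

When the file lives at the default location ~/.config/rclone/rclone.conf, the --config option is not needed.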