Environment

This article describes the system environment, in particular important file system locations and the concept of the active project.

Shell

The login shell on all servers in the JUWELS Cluster is /bin/bash. The persistent settings of the shell environment are governed by the content of $HOME/.bashrc, $HOME/.profile or scripts sourced from within these files. Please use these files for storing your personal settings.

It is not possible to change the login shell, but users may switch to a personal shell during the login process. However, please note that only bash is fully supported on JUWELS and the use of alternative shells may degrade the user experience on the system.

Active project

Through their association with projects, users of JUWELS are given access to both computational resources (a core-h budget) and storage resources (write access to directories on certain file systems with associated quotas). The locations of these directories are exposed via environment variables, e.g. $PROJECT. However, since a user can be a member of several projects at the same time, the names of these environment variables have to contain the project name in order to avoid collisions: $PROJECT_<project name>. Using the jutil command line utility, a project can be made the active project:

$ jutil env activate -p <project>

All environment variables pointing to storage resources associated with <project> will be re-exported in their suffix-less form, e.g. $PROJECT_<project> will be re-exported as just $PROJECT. <project> will stay the active project until another project is made the active project or until the shell session ends. For more information look at the jutil command usage.
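A script that should work regardless of which project is currently active can look up the project-specific variable directly. The following is a minimal sketch; the helper name project_dir and the project name myproject are hypothetical and not part of jutil, and the sketch assumes that $PROJECT_<project> is an exported environment variable as described above.

```shell
#!/bin/bash
# Print the directory stored in the environment variable PROJECT_<project>.
# Returns a non-zero status if that variable does not exist in the environment.
# "project_dir" is a hypothetical helper used for illustration only.
project_dir() {
    printenv "PROJECT_$1" || {
        echo "error: PROJECT_$1 is not set for this account" >&2
        return 1
    }
}
```

For example, `cd "$(project_dir myproject)"` changes into the directory of the hypothetical project myproject without changing the active project.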

Available file systems

The available parallel file systems on JUWELS are mounted from JUST. The following table gives an overview of the available file systems:

Note

Except for $HOME the environment variables in the table below are only available once a project has been activated, see Active project. The project specific forms are always available, e.g. $PROJECT_<project>.

Variable   Storage Location      Accessibility    Description
$HOME      parallel file system  Login + Compute  Storage of user specific data (e.g. ssh-key)
$PROJECT   parallel file system  Login + Compute  Storage of project related source code, binaries, etc.
$SCRATCH   parallel file system  Login + Compute  Scratch file system for temporary data
$CSCRATCH  flash cache system    Login + Compute  NVMe based cache layer for $SCRATCH file system
$FASTDATA  parallel file system  Login            Storage location for large data (JUSTDSS)
$DATA      parallel file system  Login            Storage location for large data (XCST)
$ARCHIVE   parallel file system  Login            Storage location for archiving on tape

It is highly recommended to always access files through these variables.
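Because the suffix-less variables are only set once a project has been activated, a script that relies on them can check for them and stop early when they are missing. The following is a minimal sketch; the function name make_run_dir and the directory name my_run are hypothetical.

```shell
#!/bin/bash
# Create a run directory below $SCRATCH and print its path.
# ${SCRATCH:?message} aborts with the message if SCRATCH is unset or empty,
# which avoids accidentally creating "/my_run" in the file system root.
make_run_dir() {
    local run_dir="${SCRATCH:?SCRATCH is not set, activate a project first}/$1"
    mkdir -p "$run_dir" && printf '%s\n' "$run_dir"
}
```

A job script could then use `cd "$(make_run_dir my_run)"` instead of a hard-coded path.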

File systems for compute projects

Within the current usage model, file systems are bound to compute or data projects. The following description is just an overview of how to use these file systems.

For further information, please see: What file system to use for different data?

Each compute project has access to the following file systems.

Home directory ($HOME)

Home directories reside in the parallel file system. In order to hide the details of the home file system layout the full path to the home directory of each user is stored in the shell environment variable $HOME. References to files in the home directory should always be made through the $HOME environment variable. The initialization of $HOME will be performed during the login process.

The home directory is limited in space and should only be used for storing small user-specific data items (e.g. ssh-keys, configuration files).

Project directory ($PROJECT)

Project directories reside in the parallel file system, too. In order to hide the details of the project file system layout the full path to these directories is stored in shell environment variables.

As an account can be bound to several projects, the variables are named accordingly: $PROJECT_<project>. The data migrated at the transition to the new usage model was moved to the user-owned subdirectory $PROJECT_<project>/account. Please note that the project directory itself is writable for the project members, i.e. different organization schemes within the project (for example to enable easier sharing of data) are possible and entirely in the hands of the project PI and members.

To activate a certain project for the current session or to switch between projects, one can use the tool jutil.

During activation of a project, environment variables will be exported and the environment variable $PROJECT is set to $PROJECT_<project>.

This tool can also be used to perform tasks like querying project information and CPU or data quota.

Working directory ($SCRATCH)

Scratch directories are temporary storage locations residing in the parallel file system. They are used for applications with large size and I/O demands. Data in these directories is only for temporary use and is automatically deleted (files after 90 days, based on modification and access date; empty directories after 3 days). The structure of the scratch directory and the corresponding environment variables are similar to the project directory.
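To see which files are approaching the automatic deletion, the time stamps can be inspected with find. The following is a minimal sketch; the function name list_stale_files and the default of 80 days (an arbitrary warning margin before the 90-day limit) are assumptions, as is the interpretation that a file is affected only when both its modification and its access time are older than the limit.

```shell
#!/bin/bash
# List regular files below a directory that were neither modified nor
# accessed for more than N days (default: 80, an assumed warning margin).
list_stale_files() {
    local dir="$1" days="${2:-80}"
    find "$dir" -type f -mtime "+$days" -atime "+$days" -print
}
```

For example, `list_stale_files "$SCRATCH"` prints the candidates in the scratch directory of the active project.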

File systems for data projects

File systems for data projects are used to store large data. The structure and environment variables are similar to $PROJECT and $SCRATCH. Data projects have to be explicitly applied for and are independent from compute projects.

Data directory ($FASTDATA)

Fastdata directories are used for applications with large data and I/O demands similar to the scratch file system.

In contrast to $SCRATCH, data in $FASTDATA is permanent and protected by snapshots.

Data directory ($DATA)

Data directories are used to store a huge amount of data on disk based storage. The bandwidth is lower than in $FASTDATA. Access to these directories is available from login nodes only.

Archive directory ($ARCHIVE)

Archive directories are used to store files that have not been in use for a longer time; data are migrated to tape storage by ISP-HSM (IBM Spectrum Protect for Space Management).

Machine identification file

To simplify the handling of the shared $HOME file system on the different supercomputers, JSC provides a machine identification file /etc/FZJ/systemname on all systems. /etc/FZJ/systemname stores the system name (such as juwels, jureca, jusuf, jedi,…) and can be used to perform system specific actions without the need to parse the hostname of the login or compute nodes.

Below is an example of handling different machines, e.g. in .profile or .bashrc:

MACHINE=$(cat /etc/FZJ/systemname)
if test "${MACHINE}" = "juwels"; then
    # perform JUWELS specific actions
    :   # no-op placeholder; bash does not allow an empty branch
elif test "${MACHINE}" = "jureca"; then
    # perform JURECA specific actions
    :   # no-op placeholder; bash does not allow an empty branch
fi

The machine name can also be read within a Makefile using:

$(shell cat /etc/FZJ/systemname)
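If the same .profile or .bashrc is also used on machines that do not provide the identification file, the lookup can be made defensive. The following is a minimal sketch; the function name system_name, its optional file argument and the fallback value unknown are hypothetical.

```shell
#!/bin/bash
# Print the JSC system name, or "unknown" if the identification file is
# not readable (e.g. on a machine outside of JSC).
# The optional argument overrides the file path and exists for testing only.
system_name() {
    local file="${1:-/etc/FZJ/systemname}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

case "$(system_name)" in
    juwels) : ;;  # JUWELS specific settings go here
    jureca) : ;;  # JURECA specific settings go here
    *)      : ;;  # settings for any other machine go here
esac
```

The case statement scales more easily than a chain of if/elif tests when more systems have to be distinguished.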

Transferring files with scp, rsync, etc.

Since outgoing SSH connections are not allowed, file transfers to and from JUWELS which use SSH as the underlying transport have to be initiated from the other system. So instead of

juwels$ scp my_file local:

you have to initiate the copy from the local system:

local$ scp juwels.fz-juelich.de:my_file .

In some cases, it might not be possible to directly transfer files between another system and JUWELS. This might be because the other system also disallows outgoing SSH connections, or because the SSH client on the other system is too old and does not support the modern cryptographic algorithms required by JSC policy. As a workaround, the files have to be transferred via a third system which can make connections to both JUWELS and the other system. This can be automated with scp and its command line argument -3:

local$ scp -3 other.hpc.example.com:my_file juwels.fz-juelich.de:

Note

An internet connection that provides fast speeds in both the upload and download direction is recommended for this approach.

Using Git on JUWELS

Since outgoing SSH connections are not allowed on JUWELS and SSH authentication is not possible, it can be quite challenging to clone a Git repository. Here are some alternatives and possible workarounds.

  • Use https instead of ssh for Git:

    juwels$ git clone https://github.com/libgit2/libgit2 ~/libgit2/
    

    Existing repositories can be changed by modifying the URL of the remote: git remote set-url origin https://github.com/libgit2/libgit2.

    Instead of authenticating with the username and password of the Git hosting service for an HTTPS-based checkout, we strongly recommend using Personal Access Tokens. They can be configured with limited permissions (only push and pull, only pull/no push, …) and don’t allow for accessing your full account on the Git hosting service. They can be configured through the website of the respective Git hosting service. With access tokens, Git Credential Helpers can be useful.

    Note

    For users of the JSC Gitlab, Access Tokens are even more strongly recommended (as the JSC Gitlab uses the same LDAP credentials as JuDoor).

  • Use SSHFS:

    • On your local machine, install SSHFS (it might be there already)

    • Mount a directory of JUWELS to your local machine; you can access files and directories as if they were local, but they actually still live on JUWELS and are only copied on demand

      local$ sshfs juwels:~/ssh-fs/ ~/juwels-mount/
      
    • Clone the repository into the mounted directory on your local machine, which will automatically upload each file and directory to JUWELS

      local$ git clone git@github.com:libgit2/libgit2.git ~/juwels-mount/
      
    • Attention: This is slow! Have a look at the mounting options mentioned in the man page to potentially speed-up the process.

  • Use a bare Git repository on JUWELS as a proxy:

    • Create bare repository on JUWELS:

      juwels$ git init --bare ~/.my-bare-repo.git
      
    • Locally, clone the original Git repository, add a new remote (the bare repository), and push everything:

      local$ git clone https://github.com/octocat/Hello-World && cd Hello-World
      local$ git remote add mirror-juwels juwels:~/.my-bare-repo.git/
      local$ git push mirror-juwels master
      
    • On JUWELS, now create a proper (non-bare) repository based on the bare repository, add some changes as an example, and push to the bare repository:

      juwels$ git clone ~/.my-bare-repo.git Hello-World && cd Hello-World
      juwels$ touch a-new-file && git add a-new-file && git commit -m "A new file"
      juwels$ git push origin master
      
    • Finally, get the changes from JUWELS back to your local copy and push them to the original origin:

      local$ git pull mirror-juwels master
      local$ git push origin master
      
  • Check out the Git repository locally, use rsync to two-way-sync to/from JUWELS

  • Note on Submodules: Git repositories can have other Git repositories as external dependencies, so-called submodules. Since the address of a submodule is hard-coded in .gitmodules, it might be an incompatible ssh:// URL. In most cases, the ssh:// part can be replaced with https:// (as shown above) within the .gitmodules file. Changes to this file need to be propagated to the .git configuration directory by calling git submodule sync.
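The URL replacement described in the note on submodules can be scripted. The following is a minimal sketch; the function name gitmodules_to_https is hypothetical, and the sketch assumes that the hosting service serves the repository under the same host and path via HTTPS. Only URLs of the form ssh://[user@]host/path are rewritten; scp-style addresses such as git@host:path and URLs with an explicit SSH port need manual editing.

```shell
#!/bin/bash
# Rewrite "url = ssh://[user@]host/path" entries in a .gitmodules file to
# "url = https://host/path". The file is replaced only if sed succeeds.
gitmodules_to_https() {
    local file="${1:-.gitmodules}"
    sed -E 's#^([[:space:]]*url[[:space:]]*=[[:space:]]*)ssh://([^@/]+@)?#\1https://#' \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Run inside the top-level directory of the repository, `gitmodules_to_https && git submodule sync` rewrites the URLs and then propagates them to the .git configuration directory.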