GPFS File Systems in the Jülich Environment
All user-accessible file systems on the supercomputer systems (e.g. JUWELS, JURECA), the community cluster systems (e.g. JUAMS, JUZEA-1), and the Data Access System (JUDAC) are provided via Multi-Cluster GPFS from the HPC file server JUST.
The storage locations assigned to each user in the system environment are exposed through shell environment variables (see table). The user’s directory in each file system is shared across all systems to which the user has been granted access. It is therefore recommended to organize data in subdirectories specific to each system architecture.
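The recommendation above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the helper name `arch_subdir`, the lowercase subdirectory naming, and the example path are assumptions made for this sketch. Only the use of an environment variable such as `$SCRATCH` comes from this page.

```python
import os
from pathlib import Path


def arch_subdir(variable, system, env=None):
    """Return the per-system subdirectory below a storage variable.

    `variable` is the name of an environment variable such as "SCRATCH".
    `system` is a label for the machine; the lowercase directory name is
    only a convention chosen for this sketch.
    """
    env = os.environ if env is None else env
    base = env.get(variable)
    if not base:
        # The variables are normally set at login; fail clearly if one is missing.
        raise RuntimeError(f"environment variable {variable} is not set")
    return Path(base) / system.lower()


if __name__ == "__main__":
    # Hypothetical value, passed explicitly so the sketch runs anywhere.
    demo_env = {"SCRATCH": "/tmp/example_scratch"}
    print(arch_subdir("SCRATCH", "JUWELS", env=demo_env))
```

Passing `env` explicitly is only a convenience for testing; in a real session the function would read the variables from `os.environ`.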
The following file systems are available (Login Node or Compute Nodes):
| File System | Usable Space | Blocksize | Description | Backup | HPC system Access |
|---|---|---|---|---|---|
|  | 49 TB | 1 MB | Full path to the user’s home directory inside GPFS | ISP (to tape) | Login + Compute |
|  | 9.1 PB | 8 MB | Full path to the compute project’s standard scratch directory inside GPFS | no | Login + Compute |
|  | 4.1 PB | 4 MB | Full path to the compute project’s standard directory inside GPFS | ISP (to tape) | Login + Compute |
|  | 27 PB | 8 MB | Full path to the data project’s standard directory inside GPFS | snapshot, ISP (to tape) | Login + Compute |
|  | 17 PB | 4 MB | Full path to the data project’s directory inside GPFS (deprecated) | snapshot, ISP (to tape) | Login + special Compute (JUWELS, JURECA-DC, JUSUF) |
|  | 2.3 PB | 2 MB | Full path to the data project’s archive directory inside GPFS | ISP (to tape) | Login only |
All variables are set during the login process by /etc/profile. It is highly recommended to always access files via these variables.
- Details about the different file systems can be found in
- Details on naming conventions and access right rules for FZJ file systems are given in
- File system resources will be controlled by quota policy for each group/project. For more information see
- An example of how to use largedata ($DATA) within a batch job can be found in "How to access largedata on a limited number of computes within your jobs?"
Best practice notes
- Avoid a lot of small files: Numerous small files should be bundled into tar archives to avoid long access times caused by deficiencies in the file processing of the underlying operating system.
- Avoid renaming directories: In all file systems that offer a backup (that is, all except $SCRATCH), directories within the data path should be renamed with care, because all data below the renamed directory must be backed up again. If a large amount of data is affected, this delays the backup of genuinely new data in the entire file system and/or costs precious system resources such as CPU time and storage capacity.
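The advice on small files above can be sketched with Python's standard `tarfile` module. This is a minimal illustration under stated assumptions: the function name `pack_small_files` and the demo directory layout are made up for the example, and the archive is written uncompressed.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_small_files(source_dir, archive_path):
    """Pack every regular file below source_dir into one tar archive.

    Returns the number of files added. The original files are not removed.
    """
    source_dir = Path(source_dir)
    archive_path = Path(archive_path)
    count = 0
    # Mode "w" writes an uncompressed archive; "w:gz" would compress it.
    with tarfile.open(archive_path, "w") as archive:
        for path in sorted(source_dir.rglob("*")):
            if not path.is_file():
                continue
            # Skip the archive itself if it was placed inside source_dir.
            if path.resolve() == archive_path.resolve():
                continue
            # Store paths relative to source_dir so the archive is relocatable.
            archive.add(path, arcname=path.relative_to(source_dir).as_posix())
            count += 1
    return count


if __name__ == "__main__":
    # Demo on a throwaway directory with hypothetical file names.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "results"
        src.mkdir()
        for i in range(3):
            (src / f"part{i}.txt").write_text("data")
        print(pack_small_files(src, Path(tmp) / "results.tar"))
```

Individual members can later be read from the archive with `tarfile.open(...).extractfile(name)` without unpacking everything back into many small files.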