JUSUF user documentation
Contents:
JUSUF Cluster Partition
Access
Prerequisites
SSH Login
Restrictions
OpenSSH
OpenSSH Installation
OpenSSH Key Generation
Key Upload, Key Restriction
Logging in to JUSUF
Key Agent
OpenSSH Persistent Configuration
X Forwarding
Troubleshooting
Further Reading
PuTTY
PuTTY Installation
PuTTY Key Generation
PuTTY Persistent Configuration
Login Nodes
Example
Alternative Login Methods
Configuration
Hardware Configuration
Software Overview
Environment
Shell
Active project
Available file systems
File systems for compute projects
Home directory ($HOME)
Project directory ($PROJECT)
Working directory ($SCRATCH)
File systems for data projects
Data directory ($FASTDATA)
Data directory ($DATA)
Archive directory ($ARCHIVE)
Machine identification file
Transferring files with scp, rsync, etc.
Using Git on JUSUF
Software Modules
Basic module usage
Available compilers
MPI runtimes
GPUs and modules
Finding software packages
Stages
Stages Changelog
Scientific software at JSC
Requesting new software
Installing your own software with EasyBuild
Building Software
Compiled Languages
Compilers
MPI Compiler Wrappers
CUDA
Build Systems
Exotic Languages
Interpreted Languages
Python
Julia
Batch system
Slurm Partitions
Hardware Overview
Available Partitions
Allocations, Jobs and Job Steps
Writing a Batch Script
Job Script Examples
Requesting Generic Resources and Features
Job Steps
Dependency Chains
Interactive Sessions
Hold and Release Batch Jobs
Slurm commands
Summary of sbatch and srun Options
Frequency Scaling Performance Reliability
Processor Affinity
Slurm options
Terminology
--cpu-bind
Implicit types
Explicit types
--distribution
First part (node_level)
Second part (socket_level)
Third part (core_level)
Fourth part
--hint
Affinity examples
Default processor affinity
Further examples
Affinity visualisation
Differences to vanilla Slurm (19.05)
MPMD: Multiple Program Multiple Data Execution Model
GPU Computing
JUSUF GPU Nodes
GPU Visibility/Affinity
Nvidia Profiling Tools and Clock Speed
Job Script Examples
GPFS File Systems in the Jülich Environment
Best practice notes
Data Transfer to and from JUSUF
Heterogeneous and Cross-Module Jobs
Heterogeneous Jobs
Specifying Individual Job Options
Running Job Components Side by Side
Loading Software in a Heterogeneous Environment
Uniform Architecture and Dependencies
Non-Uniform Architectures and Mutually Exclusive Dependencies
Accounting
Accounting Mode
Command description
User tool jutil
Usage
Available subcommands
Available actions
Available options
Allowed user interfaces
Container Runtime on JUSUF
What Containers Provide
Getting Access
Apptainer on JUSUF
Backwards compatibility with Singularity
Apptainer Images
Launching Containers via Slurm
Container Build System
Building Container Images via CLI
Container Build System REST API
UNICORE in Production
JUSUF Cloud Partition
Quick Introduction
Usage Model
Virtual Machine Types
Storage Layers
Access to JUSUF Cloud
Configuration
Hardware Configuration
Software Overview
Accessing The Cloud
Prerequisites
OpenStack API Endpoints
WebUI
Fenix AAI
JUDOOR
Command Line Interface (CLI)
Fenix AAI
JUDOOR
First Steps
Accessing VMs
Network Setup
Create and manage networks
Create a network
WebUI
Command Line Interface
Create a router
WebUI
Command Line Interface
Generic OpenStack documentation
Security Groups
WebUI
Command Line Interface
Accessing Virtual GPUs
Prerequisites
Install Nvidia driver for vGPU support
Using NVMes
Prerequisites
Access/Mount the NVMe
Accessing the DATA file system
Known Issues on JUSUF
Open Issues
Recently Resolved and Closed Issues
FAQ
General FAQ
How to generate and upload ssh keys?
My job failed with “Transport retry count exceeded”
My job failed/was killed for no apparent reason
FAQ about Data Management
How to access largedata on a limited number of compute nodes within your jobs?
What file system to use for different data?
What data quotas exist and how to list usage?
How to modify the user’s environment.
How to make the currently enabled budget visible?
How can I recall migrated data?
How can I see which data is migrated?
How to restore files?
$HOME - Users personal data
$PROJECT - Compute project repository
$FASTDATA - Data project repository (bandwidth optimized)
$DATA - Data project repository (large capacity)
$ARCHIVE - The Archive data repository
How to share files by using ACLs?
Linux commands to manage ACLs
Which files have an access control list?
How to avoid multiple SSH connections on data transfer?
How to ensure correct group ID for your project data?
Information about $DATA incident from January 2021
Maintenance
Support
JUSUF Cloud Partition
Contents:
Quick Introduction
Usage Model
Virtual Machine Types
Storage Layers
Access to JUSUF Cloud
Configuration
Hardware Configuration
Software Overview
Accessing The Cloud
Prerequisites
OpenStack API Endpoints
WebUI
Command Line Interface (CLI)
First Steps
Accessing VMs
Network Setup
Security Groups
Accessing Virtual GPUs
Prerequisites
Install Nvidia driver for vGPU support
Using NVMes
Prerequisites
Access/Mount the NVMe
Accessing the DATA file system