Configuration
Hardware Configuration
- 144 standard compute nodes
  - 2× AMD EPYC 7742, 2× 64 cores, 2.25 GHz
  - 256 GB (16× 16 GB) DDR4, 3200 MHz
  - InfiniBand HDR100 (ConnectX-6)
  - local disk for operating system (1× 240 GB SSD)
  - 1 TB NVMe
- 61 accelerated compute nodes
  - 2× AMD EPYC 7742, 2× 64 cores, 2.25 GHz
  - 256 GB (16× 16 GB) DDR4, 3200 MHz
  - InfiniBand HDR100 (ConnectX-6)
  - local disk for operating system (1× 240 GB SSD)
  - 1 TB NVMe
  - 1× NVIDIA V100 GPU with 16 GB HBM2e
- 4 login nodes
  - 2× AMD EPYC 7742, 2× 64 cores, 2.25 GHz
  - 256 GB (16× 16 GB) DDR4, 3200 MHz
  - InfiniBand HDR100 (ConnectX-6)
  - 100 Gigabit Ethernet external connection
  - local disk for operating system (2× 480 GB SSD in RAID 1)
- 26,240 CPU cores
- 944.6 (CPU) + 427 (GPU) TFLOP/s peak performance (see the worked example below)
- Mellanox InfiniBand HDR full fat-tree network with HDR100 speed on the nodes and full HDR at the inter-switch level
- 100 Gbit/s network connection per login node and 40 Gbit/s network connection per compute node to JUST for storage access
- The compute nodes can be moved flexibly between the cluster and the cloud module.
Note
Cloud nodes do NOT have InfiniBand. GPU nodes available to the cloud can host 4 vGPUs, all allocated on a single physical NVIDIA V100.
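The aggregate core count and peak-performance figures follow directly from the per-node specifications. The sketch below reproduces them under two assumptions that are not stated above: each EPYC 7742 core retires 16 double-precision FLOP per cycle, and each V100 contributes roughly 7 TFLOP/s of double-precision peak.

```c
/* Back-of-the-envelope check of the aggregate figures above.
 * Assumed (not in the source): 16 DP FLOP/cycle/core for the EPYC 7742,
 * ~7 TFLOP/s DP peak per NVIDIA V100.
 */
#include <stdio.h>

int main(void) {
    int standard_nodes = 144, gpu_nodes = 61;
    int cores_per_node = 2 * 64;          /* 2x AMD EPYC 7742 */
    double clock_ghz = 2.25;
    double flop_per_cycle = 16.0;         /* assumed DP FLOP/cycle/core */
    double v100_tflops = 7.0;             /* assumed DP peak per V100 */

    int total_cores = (standard_nodes + gpu_nodes) * cores_per_node;
    double cpu_peak = total_cores * clock_ghz * flop_per_cycle / 1e3;
    double gpu_peak = gpu_nodes * v100_tflops;

    printf("CPU cores: %d\n", total_cores);            /* 26240 */
    printf("CPU peak:  %.1f TFLOP/s\n", cpu_peak);     /* 944.6 */
    printf("GPU peak:  %.1f TFLOP/s\n", gpu_peak);     /* 427.0 */
    return 0;
}
```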
Software Overview
- Rocky Linux 8 distribution
- ParaStation Cluster Management
- ParTec ParaStation MPI (Message Passing Interface) implementation
- Intel MPI
- Open MPI
- IBM Spectrum Scale (GPFS) parallel file system
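Since several MPI implementations are provided (ParaStation MPI, Intel MPI, Open MPI), a minimal MPI program that can be built against any of them is sketched below. The `mpicc` wrapper and `mpiexec` launcher names are assumptions; the exact modules and launch commands depend on the site's environment setup.

```c
/* Minimal MPI example; builds against ParaStation MPI, Intel MPI, or Open MPI.
 * Compile (wrapper name assumed):  mpicc -O2 hello_mpi.c -o hello_mpi
 * Run     (launcher name assumed): mpiexec -n 4 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
```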