Configuration

Hardware Configuration of the JUWELS Cluster Module

  • 2271 standard compute nodes
    • 2× Intel Xeon Platinum 8168 CPU, 2× 24 cores, 2.7 GHz

    • 96 (12× 8) GB DDR4, 2666 MHz

    • InfiniBand EDR (ConnectX-4)

    • Intel Hyperthreading Technology (Simultaneous Multithreading)

    • Diskless

  • 240 large memory compute nodes
    • 2× Intel Xeon Platinum 8168 CPU, 2× 24 cores, 2.7 GHz

    • 192 (12× 16) GB DDR4, 2666 MHz

    • InfiniBand EDR (ConnectX-4)

    • Intel Hyperthreading Technology (Simultaneous Multithreading)

    • Diskless

  • 56 accelerated compute nodes
    • 2× Intel Xeon Gold 6148 CPU, 2× 20 cores, 2.4 GHz

    • 192 (12× 16) GB DDR4, 2666 MHz

    • 2× InfiniBand EDR (ConnectX-4)

    • Intel Hyperthreading Technology (Simultaneous Multithreading)

    • 4× NVIDIA V100 GPU, 16 GB HBM2

    • Diskless

  • 12 login nodes
    • 2× Intel Xeon Gold 6148 CPU, 2× 20 cores, 2.4 GHz

    • 768 (12× 64) GB DDR4, 2666 MHz

    • InfiniBand EDR (ConnectX-5)

    • Intel Hyperthreading Technology (Simultaneous Multithreading)

    • 100 GbE External connection

    • Local disks (120 GB available in /tmp)

  • 4 visualization nodes
    • 2× Intel Xeon Gold 6148 CPU, 2× 20 cores, 2.4 GHz

    • 768 (12× 64) GB DDR4, 2666 MHz

    • InfiniBand EDR (ConnectX-5)

    • Intel Hyperthreading Technology (Simultaneous Multithreading)

    • 100 GbE External connection

    • 2× 1 TB HDD (RAID 1)

    • 1× NVIDIA Pascal P100

    • Local disks (120 GB available in /tmp)

  • 122,768 CPU cores

  • 10.6 (CPU) + 1.7 (GPU) Petaflop per second peak performance (see the back-of-the-envelope check after this list)

  • Mellanox InfiniBand EDR fat-tree network with 2:1 pruning at leaf level and top-level HDR switches
    • 5 TB/s connection to Booster

  • 250 GB/s network connection to JUST for storage access
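
As a back-of-the-envelope check, the aggregate figures above follow directly from the node counts. The short sketch below recomputes them; the per-device peak rates it assumes (32 FP64 FLOP per cycle per core for the AVX-512 Xeons at nominal base clock, 7.8 TFLOP/s FP64 per V100) are standard vendor figures, not values stated in this list.

    /* Back-of-the-envelope check of the JUWELS Cluster aggregates.
     * Assumptions (not from the list above): 32 FP64 FLOP/cycle/core
     * (AVX-512, two FMA units), nominal base clocks, 7.8 TFLOP/s FP64
     * per V100. Actual AVX-512 clocks differ from the base clock. */
    #include <stdio.h>

    int main(void) {
        const int std_nodes = 2271, large_nodes = 240, gpu_nodes = 56;
        const int skl_cores = 48, gold_cores = 40;      /* cores per node   */
        const double skl_ghz = 2.7, gold_ghz = 2.4;     /* base clock (GHz) */
        const double flop_per_cycle = 32.0;             /* AVX-512, 2x FMA  */
        const double v100_tf = 7.8;                     /* FP64 TFLOP/s     */

        long cores = (long)(std_nodes + large_nodes) * skl_cores
                   + (long)gpu_nodes * gold_cores;
        double cpu_pf = ((std_nodes + large_nodes) * skl_cores * skl_ghz
                        + gpu_nodes * gold_cores * gold_ghz)
                        * flop_per_cycle / 1e6;         /* GFLOP/s -> PF */
        double gpu_pf = gpu_nodes * 4 * v100_tf / 1e3;  /* TFLOP/s -> PF */

        printf("cores: %ld\n", cores);                  /* 122768        */
        printf("CPU %.1f PF + GPU %.1f PF peak\n", cpu_pf, gpu_pf);
        return 0;
    }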

Hardware Configuration of the JUWELS Booster Module

  • 936 compute nodes
    • 2× AMD EPYC Rome 7402 CPU, 2× 24 cores, 2.8 GHz

    • Simultaneous Multithreading

    • 512 GB DDR4, 3200 MHz

    • 4× NVIDIA A100 GPU, 40 GB HBM2e

    • 4× InfiniBand HDR (ConnectX-6)

    • Diskless

  • 4 login nodes
    • 2× AMD EPYC Rome 7402 CPU, 2× 24 cores, 2.8 GHz

    • Simultaneous Multithreading

    • 512 GB DDR4, 3200 MHz

    • InfiniBand HDR (ConnectX-6)

    • 100 GbE External connection

    • Local disks (120 GB available in /tmp)

  • 3,744 GPUs

  • 73 Petaflop per second peak performance (see the back-of-the-envelope check after this list)

  • Mellanox InfiniBand HDR DragonFly+ topology with 20 cells
    • 5 TB/s connection to Cluster

  • 700 GB/s network connection to JUST for storage access
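
The same kind of back-of-the-envelope check works for the Booster: 936 nodes with 4 GPUs each give 3,744 GPUs, and multiplying by the nominal FP64 Tensor Core peak of an A100 (19.5 TFLOP/s, a vendor figure not stated in this list) reproduces the 73 PF; the host CPUs are ignored here.

    /* Back-of-the-envelope check of the JUWELS Booster aggregates.
     * Assumption (not from the list above): 19.5 TFLOP/s FP64 Tensor
     * Core peak per A100; the CPU contribution is ignored. */
    #include <stdio.h>

    int main(void) {
        const int nodes = 936, gpus_per_node = 4;
        const double a100_fp64_tc_tf = 19.5;          /* TFLOP/s per GPU */

        int gpus = nodes * gpus_per_node;             /* 3744            */
        double peak_pf = gpus * a100_fp64_tc_tf / 1e3;

        printf("GPUs: %d, peak: %.0f PF\n", gpus, peak_pf);
        return 0;
    }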

Software Overview

  • Rocky Linux 8 distribution

  • ParaStation Modulo

  • Slurm batch system with ParaStation resource management

  • Cluster software stack
    • Intel Fortran and C/C++ compilers
      • Support for the OpenMP programming model for intra-node parallelization

    • Intel Math Kernel Library

    • ParTec ParaStation MPI (Message Passing Interface) implementation (see the MPI + OpenMP sketch at the end of this section)

    • Intel MPI

    • Open MPI

  • Booster software stack
    • CUDA-aware ParTec ParaStation MPI (see the CUDA-aware MPI sketch at the end of this section)

    • Open MPI

    • NVIDIA HPC SDK

  • IBM Spectrum Scale (GPFS) parallel file system
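
To illustrate how the compilers, the OpenMP support and the MPI implementations of the Cluster software stack fit together, here is a minimal hybrid MPI + OpenMP program. It is a generic sketch, not specific to ParaStation MPI, and should build with any of the listed MPI stacks (for example with a GCC-backed mpicc and -fopenmp, or with the Intel compilers and -qopenmp).

    /* Minimal hybrid MPI + OpenMP example (generic sketch). Each MPI rank
     * reports its host name and the number of OpenMP threads it runs. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int provided, rank, size, hostlen;
        char host[MPI_MAX_PROCESSOR_NAME];

        /* Request thread support, since OpenMP threads run inside each rank. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &hostlen);

        #pragma omp parallel
        {
            #pragma omp master
            printf("rank %d of %d on %s: %d OpenMP threads\n",
                   rank, size, host, omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }

Under the Slurm batch system with ParaStation resource management such a binary is typically launched with srun; the node, task and thread options (for example --ntasks-per-node and --cpus-per-task) as well as partition and account settings depend on the site configuration.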
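
The practical point of the CUDA-aware ParaStation MPI (and of a CUDA-aware Open MPI) on the Booster is that buffers residing in GPU memory can be handed to MPI calls directly, without staging them through host memory. The sketch below assumes a CUDA-aware MPI build, exactly two ranks and one visible GPU per rank; buffer size and device selection are illustrative only.

    /* CUDA-aware MPI sketch: a device pointer is passed straight to MPI.
     * Assumes a CUDA-aware MPI build and exactly two ranks, one GPU each. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1 << 20;                 /* 1 Mi doubles per buffer */
        double *d_buf;
        cudaSetDevice(0);                      /* GPU binding is often handled by the scheduler */
        cudaMalloc((void **)&d_buf, n * sizeof(double));
        cudaMemset(d_buf, 0, n * sizeof(double));

        int peer = 1 - rank;                   /* rank 0 <-> rank 1 */
        /* The device pointer goes directly into MPI; a CUDA-aware library
         * moves the data via GPUDirect RDMA or internal staging. */
        MPI_Sendrecv_replace(d_buf, n, MPI_DOUBLE, peer, 0, peer, 0,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        if (rank == 0) printf("exchanged %d doubles between GPU buffers\n", n);
        cudaFree(d_buf);
        MPI_Finalize();
        return 0;
    }

Such a program is typically compiled with the MPI compiler wrapper plus the CUDA runtime library (linking against -lcudart), with the include and library paths supplied by the CUDA installation.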