Changelog

Current state

Installed software

Software                   Version
Rocky Linux                9.6
Kernel Version             5.14.0-570.32.1.el9_6
NVIDIA GPU Driver          580.65.06
OFED                       25.04-OFED.25.04.0.6.0.1
Slurm                      24.11.6-1.20250807git03d01a9
ParaStation Management     6.4.1
GPFS                       5.2.3-2
Apptainer                  1.4.1-1
PMIx                       5.0.8
Default Software Stage     2025

Changelog entries

2025-09-22 Update UCX

Update type: SW Modules

  • UCX has been changed to 1.18.1 from 1.17.0

2025-09-09 Software update

Update type: OS Packages and SW Modules

OS Packages
  • Rocky Linux has been updated to 9.6 (from 9.5)

  • Kernel Version has been updated to 5.14.0-570.32.1.el9_6 (from 5.14.0-503.40.1.el9_5)

  • NVIDIA GPU Driver has been updated to 580.65.06 (from 570.133.20)

  • Slurm has been updated to 24.11.6-1.20250807git03d01a9 (from 24.11.5-1.20250602git2ed9014)

  • ParaStation Management has been updated to 6.4.1 (from 6.3.0)

  • GPFS has been updated to 5.2.3-2 (from 5.2.2-1.12)

  • Apptainer has been updated to 1.4.1-1 (from 1.3.6-1)

UCX-settings
  • UCX_CUDA_COPY_DMABUF=no has been removed from the UCX-settings/[RC,UD,DC]-CUDA modules, since it is no longer necessary to prevent crashes and it actually causes a performance regression with the latest OFED and NVIDIA driver (see the example below)
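
If an application should still need the previous behaviour, the variable can be set explicitly in the job environment before launching; a minimal sketch (the executable name is a placeholder):

export UCX_CUDA_COPY_DMABUF=no   # opt back into the old behaviour for this job only
srun ./my_app                    # ./my_app is a placeholder for your executable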

2025-07-24 Software update

Update type: OS Packages

OS Packages
  • ParaStation Management has been updated to 6.3.0 (from 6.2.3)

2025-06-24 Software update

Update type: OS Packages and Firmware

Firmware
  • ConnectX-6 HCAs have been updated to firmware version 20.43.2566

OS Packages
  • Kernel Version has been updated to 5.14.0-503.40.1.el9_5 (from 5.14.0-503.38.1.el9_5)

  • OFED has been updated to 25.04-OFED.25.04.0.6.0.1 (from 25.01-OFED.25.01.0.6.0.1)

  • Slurm has been updated to 24.11.5-1.20250602git2ed9014 (from 23.11.10-1.20240920git20c5755)

  • GPFS has been updated to 5.2.2-1.12 (from 5.2.2-1)

  • ParaStation Management has been updated to 6.2.3 (from 6.1.1)

  • PMIx has been updated to 5.0.8 (from 5.0.6)

2025-04-29 Software update

Update type: OS Packages

OS Packages
  • Kernel Version has been updated to 5.14.0-503.38.1.el9_5 (from 5.14.0-503.23.1.el9_5)

  • NVIDIA GPU Driver has been updated to 570.133.20 (from 570.86.15)

2025-03-20 Software update

Update type: OS Packages, SLURM configuration

OS Packages
  • Rocky Linux has been updated to 9.5 (from 9.4)

  • Kernel Version has been updated to 5.14.0-503.23.1.el9_5 (from 5.14.0-427.33.1.el9_4)

  • NVIDIA GPU Driver has been updated to 570.86.15 (from 560.35.03)

  • OFED has been updated to 25.01-OFED.25.01.0.6.0.1 (from 24.07-OFED.24.07.0.6.1.1)

  • GPFS has been updated to 5.2.2-1 (from 5.1.9-4)

  • PMIx has been updated to 5.0.6 (from 4.2.9)

SLURM Configuration
  • Cgroup constraints have been enabled for (GPU) devices; job steps can access only the requested GPUs (see the example below)
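
A quick way to verify the constraint is to list the GPUs visible inside a job step; a sketch, assuming a job that requested two GPUs:

srun --gres=gpu:2 nvidia-smi -L   # should list exactly the 2 requested GPUs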

Update type: SW Modules

  • OpenMPI has been recompiled to incorporate this patch

2025-02-27 MemoryMax

Update type: Login nodes

  • MemoryMax has been set to 25% on individual user slices on login nodes
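
To inspect the limit applied to your own slice on a login node, the systemd property can be queried directly (a minimal check; $UID expands to your numeric user ID in bash):

systemctl show -p MemoryMax user-$UID.slice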

2025-02-05 Change MPI-settings for OpenMPI

Update type: SW Modules

  • As of 2025, romio321 is not working, so the selection of romio321 has been disabled in the MPI-settings modules, giving OpenMPI the freedom to choose and prioritize its I/O component; currently ompio is selected (see the check below)
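
To see which I/O components are available, or to pin the selection explicitly for a run, the standard OpenMPI mechanisms can be used; a sketch:

ompi_info | grep "MCA io"    # list the available MPI-IO components (ompio, romio321, ...)
export OMPI_MCA_io=ompio     # optionally force ompio explicitly for this job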

2025-01-15 Default UCX-settings module

Update type: SW Modules

  • RC-CUDA has been made the default module for UCX-settings in the 2025 stage. Until now it was UD by mistake.
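
If a different transport is preferred for a particular run, one of the other flavours mentioned in this changelog can still be loaded explicitly; a sketch:

module load UCX-settings/UD-CUDA   # override the RC-CUDA default for this session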

2024-12-18 Software update

Update type: OS Packages

OS Packages
  • ParaStation Management has been updated to 5.1.63 (from 5.1.62)

2024-12-11 Software update

Update type: OS Packages

OS Packages
  • Apptainer has been updated to 1.3.6-1 (from 1.3.2-1)

2024-10-30 Software update

Update type: OS Packages

OS Packages
  • Rocky Linux has been updated to 9.4 (from 8.10)

  • Kernel Version has been updated to 5.14.0-427.33.1.el9_4 (from 4.18.0-553.el8_10)

  • NVIDIA GPU Driver has been updated to 560.35.03 (from 550.54.15)

  • OFED has been updated to 24.07-OFED.24.07.0.6.1.1 (from 24.04-OFED.24.04.0.6.6.1)

  • Slurm has been updated to 23.11.10-1.20240920git20c5755 (from 23.02.7-1.20240328git405c820)

  • ParaStation Management has been updated to 5.1.62 (from 5.1.61)

  • GPFS has been updated to 5.1.9-4 (from 5.1.9-3)

2024-08-08 Software update

Update type: OS Packages, Network

OS Packages
  • Slurm has been updated to 23.02.7-1.20240328git405c820 (from 22.05.11-1.20231215gitc756517)

  • ParaStation Management has been updated to 5.1.61 (from 5.1.59)

  • PMIx has been updated to 4.2.9 (from 4.2.7)

Network
  • The Skyway firmware has been updated to 8.2.2302

2024-06-14 Subnet Manager Update (Damian Alvarez)

Update type: Network

  • Subnet Manager updated to mlnx_ib_mgmt-5.19.1

2024-06-14 Software update

Update type: OS Packages

OS Packages
  • Rocky Linux has been updated to 8.10 (from 8.9)

  • Kernel Version has been updated to 4.18.0-553.el8_10 (from 4.18.0-513.18.1.el8_9)

  • NVIDIA GPU Driver has been updated to 550.54.15 (from 535.154.05)

  • OFED has been updated to 24.04-OFED.24.04.0.6.6.1 (from 23.10-OFED.23.10.1.1.9.1)

  • GPFS has been updated to 5.1.9-3 (from 5.1.9-1)

  • Apptainer has been updated to 1.3.2-1 (from 1.2.3-1)

2024-01-16 Software update (Damian Alvarez)

Update type: OS Packages, Batch system, SW Modules

OS Packages:
  • General update to Rocky 8.9

  • SLURM has been updated to 22.05.11-1.20231215gitc756517 (from 22.05.10-2.20231203gitae058ea)

  • psmgmt has been updated to 5.1.59-1 (from 5.1.58-1).

  • Kernel 4.18.0-513.11.1.el8_9 (from 4.18.0-477.27.1.el8_8.x86_64)

  • NVIDIA OFED 23.10-1.1.9.1 (from 23.07-0.5.1.2)

  • NVIDIA GPU drivers 535.129.03 (from 535.104.12)

  • GPFS 5.1.9-1 (from 5.1.8-2)

  • DDN IME 1.5.2-152129 (from 1.5.2-152128)

HCA FW
  • CX6 cards have been updated to 20.39.2048

Software stack
  • UCX-settings loads now RC by default in JUWELS Cluster. Before it was mistakenly loading UD

2023-12-14 Software update (Damian Alvarez)

Update type: OS Packages, Batch system, SW Modules

OS Packages:
  • SLURM has been updated to 22.05.10-2.20231203gitae058ea to address newly-discovered security issues

  • psmgmt has been updated to 5.1.58-1

Software stack
  • netCDF in the 2024 stage has been rebuilt to add support for extra compression libraries

  • GCC in the 2024 stage has been recompiled to patch some bugs that appeared in combination with PyTorch

2023-10-30 PMIx update (Sebastian Achilles)

Update type: OS Packages

Packages:
  • PMIx 4.2.6

Configuration:
  • All OpenMPI installations have been rebuilt to include a patch necessary for the new PMIx

2023-10-19 Software update (Damian Alvarez)

Update type: OS Packages, Batch system

Packages:
  • Kernel 4.18.0-477.27.1.el8_8.x86_64

  • NVIDIA OFED 23.07-0.5.1.2

  • NVIDIA GPU drivers 535.104.12

  • GPFS 5.1.8-2

  • Apptainer 1.2.3-1

  • DDN IME 1.5.2-152128

  • psmgmt-5.1.56-2

  • IB Switch firmware 27.2012.1010

Configuration:
  • ssh now rejects RSA keys

  • All OpenMPI installations now rely on a user-space-provided PMIx

2023-08-30 UCX-settings update (Damian Alvarez, JSC)

Update type: SW Modules

The UCX-settings/*CUDA modules also set UCX_RNDV_FRAG_MEM_TYPE=cuda. This enables the GPU to initiate transfers of CUDA managed buffers, which can yield a large speed-up when Unified Memory (cudaMallocManaged()) is used, as staging of data is avoided.
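
A minimal check that the setting is active after loading one of these modules (module name taken from the entries above):

module load UCX-settings/RC-CUDA
echo $UCX_RNDV_FRAG_MEM_TYPE   # expected output: cuda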

2023-08-10 General maintenance/update (Damian Alvarez, JSC)

Update type: OS Packages, General configuration, Storage, Network, Other

Compute nodes update

The compute nodes are updated to:

  • Rocky 8.8 (from 8.7)

  • MOFED 23.04-OFED.23.04.1.1.3.1 (from 5.8-OFED.5.8.2.0.3.1)

  • GPFS 5.1.8-1 (from 5.1.7-1.5)

  • NVIDIA driver 535.54.03 (from 525.105.17)

  • psmgmt 5.1.56-2 (from 5.1.56-1)

2023-08-03 General maintenance/update (Damian Alvarez, JSC)

Update type: OS Packages, General configuration, Storage, Network, Other

Login nodes update

The login nodes are updated to:

  • Rocky 8.8 (from 8.7)

  • MOFED 23.04-OFED.23.04.1.1.3.1 (from 5.8-OFED.5.8.2.0.3.1)

  • GPFS 5.1.8-1 (from 5.1.7-1.5)

  • NVIDIA driver 535.54.03 (from 525.105.17)

2023-08-01 TS update, psmgmt update (Damian Alvarez, JSC)

Update type: OS Packages, Batch system, Other

  • The nodes in JUWELS Booster have been updated to psmgmt-5.1.56-1.

  • Racks [21,29,31-39] in JUWELS Booster have been updated to technical state 068.03

2023-07-31 TS Update (Damian Alvarez, JSC)

Update type: Other

Racks [22-28,30] in JUWELS Booster have been updated to technical state 068.03

2023-07-27 TS update, psmgmt update (Damian Alvarez, JSC)

Update type: OS Packages, Batch system, Other

  • The nodes in JUWELS Cluster have been updated to psmgmt-5.1.56-1. The nodes in JUWELS Booster will be updated on 2023-08-01

  • Racks [11-20] in JUWELS Booster have been updated to technical state 068.03

2023-07-26 TS Update (Damian Alvarez, JSC)

Update type: Other

Racks [04-10] in JUWELS Booster have been updated to technical state 068.03

2023-07-25 TS Update (Damian Alvarez, JSC)

Update type: Other

Racks [01-03] in JUWELS Booster have been updated to technical state 068.03

2023-05-23 – 2023-06-19 Rolling update (Ahmed Fahmy, JSC)

Top island nodes have been updated to the following versions:

  • Kernel 4.18.0-425.19.2.el8_7 (from 4.18.0-425.13.1.el8_7)

  • OFED 5.8-2.0.3.1 (from 5.8-1.1.2.1)

  • GPFS 5.1.7-1.5 (from 5.1.7-0)

  • NVIDIA driver 525.105.17 (from 525.85.12)

  • Apptainer 1.1.8-1 (from 1.1.5-1)

  • LXC 5.0.0-1 (from 3.0.4-2)

  • Slurm 22.05.9-1 (from 22.05.8-1)

  • Slurm Plugins 2.1 (from 2.0)

2023-05-25 – 2023-05-26 Rolling update (Damian Alvarez, JSC)

Update type: OS Packages, Storage

Compute nodes have been updated to the following versions:

  • Kernel 4.18.0-425.19.2.el8_7 (from 4.18.0-425.13.1.el8_7)

  • OFED 5.8-2.0.3.1 (from 5.8-1.1.2.1)

  • GPFS 5.1.7-1.5 (from 5.1.7-0)

  • NVIDIA driver 525.105.17 (from 525.85.12)

  • Apptainer 1.1.8-1 (from 1.1.5-1)

2023-05-23 Emergency maintenance/update (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Storage

During a storage outage the jwlogin[01-06,10-11,22] and jwvis[00-03] nodes have been updated to the following versions:

  • Kernel 4.18.0-425.19.2.el8_7 (from 4.18.0-425.13.1.el8_7)

  • OFED 5.8-2.0.3.1 (from 5.8-1.1.2.1)

  • GPFS 5.1.7-1.5 (from 5.1.7-0)

  • NVIDIA driver 525.105.17 (from 525.85.12)

  • Apptainer 1.1.8-1 (from 1.1.5-1)

2023-03-09 Emergency maintenance/update (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Storage

GPFS software upgrade

GPFS has been updated everywhere to:

  • GPFS 5.1.7-0 (from 5.1.6-1)

2023-02-28 General maintenance/update (Damian Alvarez, JSC)

Update type: Maintenance, SW Modules, Batch system, OS Packages, Firmware

Stage Update:

The default software stack has been changed to 2023. The remaining software stages are nevertheless reachable.

Slurm Update:

Slurm has been updated to version 22.05.

Software Updates:
  • OFED 5.8-1.1.2.1

  • GPFS 5.1.6-1 (from 5.1.4-1)

  • IME 1.5.2-152111 (from 1.5.2-152065)

  • NVIDIA driver 525.85.12 (from 515.65.07-1)

  • Apptainer 1.1.6-1 (from 1.1.3-1)

  • psmgmt 5.1.53-1 (from 5.1.52-5)

Firmware Updates:
  • HDR Infiniband switches firmware 27.2010.5042

  • EDR Infiniband switches firmware 15.2010.5042

  • HDR Infiniband HCA firmware 20.36.1010

2022-12-09 Emergency maintenance/update (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Storage

Compute nodes software downgrade

The compute nodes have been downgraded to:

  • GPFS 5.1.4-1 (from 5.1.5-1.10)

2022-12-05 Emergency maintenance/update (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Network, Other

Compute nodes software update

The compute nodes are updated to:

  • MOFED 5.8-1.1.2.1 (from 5.8-1.0.1.1)

InfiniBand Firmware updates

The following components in the InfiniBand network are updated:

  • Unmanaged Quantum based switches are updated to 27.2010.4102 (from 27.2010.3118)

  • Managed Quantum based switches are updated to 27.2010.4034 (from 27.2010.3118)

  • Switch-IB 2 based switches are updated to 15.2010.4102 (from 15.2010.3118)

2022-11-29 General maintenance/update (Damian Alvarez, JSC)

Update type: Maintenance, Announcement, OS Packages, General configuration, Batch system, Storage, Network, Other

Compute nodes software update

The compute nodes are updated to:

  • Rocky 8.7 (from 8.6)

  • MOFED 5.8-1.0.1.1 (from 5.7-1.0.2)

  • GPFS 5.1.5-1.10 (from 5.1.4-1)

  • NVIDIA driver 515.65.07-1 (from 515.65.01-1)

  • Apptainer 1.1.3-1 (from 1.0.3-1)

  • psmgmt 5.1.52-5 (from 5.1.50-4)

SHARP enablement

SHARP has been enabled in a subset of nodes on JUWELS Booster. For more information please contact HPS.

Skyway configuration

The skyway gateways have been configured in HA pairs, with 4 extra skyways being taken into production. As a side effect, extra bandwidth between JUWELS Booster and the JUST storage is now available.

2022-10-18 Cooling maintenance (Damian Alvarez, JSC)

Update type: Maintenance, Batch system, Storage, Network

New SLURM plugins available
  • cpufreq and gpufreq plugins are now available in JUWELS

New firmware version for Skyways
  • Updated to 8.1.3000

2022-10-12 psslurm change during unplanned downtime (Damian Alvarez, JSC)

Update type: Batch system, Other

The ENABLE_FPE_EXCEPTION option in psslurm.conf has been disabled in response to applications crashing due to underflow floating-point exceptions being sent/forwarded by psid.

2022-09-07 Small update during unplanned downtime (Damian Alvarez, JSC)

Update type: Maintenance, Batch system, Network, Other

Compute nodes software update

The compute nodes are updated to:

  • psmgmt 5.1.50-5 (from 5.1.50-4). This corrects a bug in the PMIx server that affected MPI_Comm_split_type in OpenMPI, and therefore Horovod too

InfiniBand Firmware updates

The following components in the compute infrastructure are updated:

  • ConnectX-4 HCAs are downgraded to 12.28.2006 (from 12.32.1010), following a recommendation by NVIDIA

  • Skyway InfiniBand-Ethernet gateways are updated to 8.1.2000 (from 8.0.2300)

2022-08-30 General maintenance/update (Damian Alvarez, JSC)

Update type: Maintenance, Announcement, OS Packages, General configuration, Batch system, Storage, Network, Other

Compute nodes software update

The compute nodes are updated to:

  • Rocky 8.6 (from 8.5)

  • MOFED 5.7-1.0.2 (from 5.5-1.0.3)

  • GPFS 5.1.4-1 (from 5.1.3-1)

  • NVIDIA driver 515.65.01-1 (from 510.47.03-1)

  • Apptainer 1.0.3-1 (from 1.0.1-1)

  • psmgmt 5.1.50-4 (from 5.1.49-4)

InfiniBand Firmware updates

The following components in the compute infrastructure are updated:

  • ConnectX-6 HCAs are updated to 20.34.1002 (from 20.31.2006)

  • ConnectX-5 HCAs are updated to 16.34.1002

  • ConnectX-4 HCAs are updated to 12.32.1010 (from 12.30.1004)

  • Quantum based switches are updated to 27.2010.3118 (from 27.2010.2110)

  • Switch-IB 2 based switches are updated to 15.2010.3118 (from 15.2008.3328)

Slurm configuration update

SLURM now has the topology plugin active. This enables SLURM to make better-informed decisions with respect to node allocation. It also enables users to use --switches=count[@time] in sbatch and salloc commands, where count is the maximum number of leaf switches used for a job, and time is the maximum time the job will wait for that switch count before being scheduled anyway.

In the JUWELS Cluster the count option maps one-to-one to leaf switches.

In the JUWELS Booster the count option refers to racks rather than leaf switches, given the striping of the 4 links over different switches within a rack.
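
As an illustration, a hedged sketch of a submission that asks for at most one leaf switch (Cluster) or rack (Booster) and is willing to wait up to 12 hours for it (job.sh is a placeholder batch script):

sbatch --switches=1@12:00:00 job.sh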

GPFS setup on login nodes

The GPFS setup on login nodes has been changed. Access to storage is now done over 100GbE instead of InfiniBand. That allows the login nodes to stay available when major work is being done in the InfiniBand fabric or in case of instabilities. Some nodes were kept in the old setup for evaluation purposes.

2022-05-18 Python clean up (Damian Alvarez, JSC)

Update type: OS Packages

Python 2 and 3.8 have been removed from the system.

2022-05-03 Global maintenance with general updates (Damian Alvarez, JSC)

Update type: Maintenance, Announcement, OS Packages, General configuration, Storage, Network, Other

General update

List of changes:

  • OFED updated to 5.5-1.0.3

  • NVIDIA driver updated to 510.47.03

  • Kernel updated to 4.18.0-348.23.1

  • Slurm updated to 21.08

  • Migrated to apptainer 1.0.1-1

GPFS parameter change
  • GPFS parameters have been changed to optimize metadata performance

2022-04-29 XH2000 IB Switch Update (Damian Alvarez, JSC)

Update type: Network

  • The firmware in the InfiniBand switches in XH2000 has been updated to 27.2010.2110 from 27.2008.3336

2022-04-12 IME Update (Damian Alvarez, JSC)

Update type: OS Packages, Storage

  • IME libraries have been updated from 1.5.1.1-151123 to 1.5.1.1-151130. This fixes a use case where IME is used directly from Python scripts.

2022-03-08 Change in user installations (Damian Alvarez, JSC)

Update type: Announcement, SW Modules

Change in user installations
  • The module structure has been changed so that $MODULEPATH is not expanded depending on the existence of the $PROJECT variable. The variable used now is $USERINSTALLATIONS, so project software is no longer activated automatically when using jutil (see the example below).
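
To keep using a shared project installation, the variable can be set explicitly before loading modules; a sketch (the path is a placeholder, adjust it to your project's software directory):

export USERINSTALLATIONS=/p/project/myproject/software   # placeholder path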

2022-02-15 Stage update (Damian Alvarez, JSC)

Update type: Maintenance, Announcement, SW Modules, Storage, Network, Other

Stage update

The default software stack has been changed to 2022. The remaining software stages are nevertheless reachable.

Fabric components replaced
  • The juwelsg02:SX6036G gateway has been re-added to the fabric after a replacement

  • The jwb-25-L2-02 switch has been re-added to the fabric after a replacement

New HPST (IME) mount point
  • The mount point of HPST changed to /p/cscratch/fs

IB configuration
  • The order of the routing algorithms in the routing chain has been changed. updn is now the first one.

  • ARP settings have been tweaked to favour responding to ARP requests via the correct IPoIB interface

  • SR-IOV has been adapted in the corresponding service nodes to have different node and port GUID

2021-12-17 Rocky update (Damian Alvarez, JSC)

Update type: Maintenance, Announcement, OS Packages, General configuration, Batch system, Storage, Network, Other

Software updates

  • The system has been updated to Rocky Linux 8.5

  • OFED has been updated to 5.4-3.1.0

  • The nvidia driver has been updated to 470.82.01

  • GPFS has been updated to 5.1.2-1

  • Singularity has been updated to 3.8.5-1

Firmware/BIOS updates

  • The HCA FW has been updated

    • 12.30.1004 (EDR nodes)

    • 20.31.2006 (HDR nodes)

  • The BIOS in the Cluster login nodes has been updated

  • The Technical State on the Cluster has been updated to 45.02

Storage updates

  • HPST (DDN-IME) is now accessible from Cluster and Booster nodes

General configuration updates

  • The sssd cache time has been reduced, so LDAP updates are refreshed faster

  • The priority of the different queues has been updated, to prioritize jobs that need nodes with large memory

Switch exchanges

  • jwc04isw218 has been replaced

  • jwb-27-L2-04 has been replaced

  • jwb-30-L1-02 has been replaced

Other changes

  • The cooling liquid in the Booster racks has been exchanged

2021-10-12 Maintenance (Damian Alvarez, JSC)

Update type: Maintenance, General configuration, Batch system, Storage, Network, Other

OpenSM configuration

OpenSM is now configured to dump the SA file to a shared filesystem, to improve failover times

Switch replacement

jwb-26-L2-01, jwb-27-L1-03 and jwb-39-L1-05 have been replaced

Update HCA FW in a variety of admin nodes

All the admin nodes have had their HCA FW version synced

largedata available in a subset of compute nodes

This filesystem is now available on 10 nodes on the cluster, and 10 on the booster. To request it you can use --constraint=largedata in your sbatch/salloc command (see the example below)
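
A concrete request could look like the following sketch (node count and remaining options are placeholders):

salloc --constraint=largedata -N 1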

Update psconfig and pshealthcheck

These packages have been updated to psconfig-5.2.1-1 and pshealthcheck-5.2.3-1

Overlapping partitions for swmanage users

The following partitions overlap the devel partitions, but without the 2 hour time limit:

  • devel-sw

  • develgpus-sw

  • develbooster-sw

2021-09-14 Module update (Damian Alvarez, JSC)

Update type: Maintenance, SW Modules, Network

New compilers and MPIs

The default compilers have been updated:

  • GCC/9.3.0 -> GCC/10.3.0

  • Intel/2020.2.254-GCC-9.3.0 -> Intel/2021.2.0-GCC-10.3.0

  • NVHPC/20.7-GCC-9.3.0 -> NVHPC/21.5-GCC-10.3.0

With these versions of the compilers the latest available MPIs have also been installed

jwb-16-L2-01 has been replaced

IME-FUSE client config update

The IME client and its configuration have been updated

2021-08-10 CentOS 8.4 update (Damian Alvarez, JSC)

Update type: Maintenance, Announcement, OS Packages, General configuration, Batch system, Network, Other

Software update

The system has been updated to

  • CentOS 8.4

  • OFED 5.4

  • gdrcopy 2.3

  • NVIDIA driver 470.57.02

  • psmgmt 5.1.43-0

Switch replacement

jwb-11-L1-05 and jwb-31-L2-01 have been replaced

MOTD announcement

It has been announced that during the next maintenance the default compilers will change to:

  • GCC 10.3 (from GCC 9.3)

  • NVHPC 21.5 (from NVHPC 20.11)

  • Intel 2021.2 (from Intel 2020.2)

2021-07-19 Update and clean up IB fabric (Damian Alvarez, JSC)

Update type: Maintenance, Batch system, Network, Other

Switch replacement

jwc03isw208 and jwc05isw118 have been replaced

Skyway cable mismatches

Fixed the cable mismatches in the skyways

InactiveLimit=0 in slurm.conf

Set to default

Fix PSID in jwb-02-L2-01

This switch had the wrong PSID and therefore the wrong FW.

FW update in all switches

All the switches have been updated to the latest version

2021-06-29 Skyway replacement, SLURM updates (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Batch system, Storage, Network

Update psmgmt
  • Updated to psmgmt-5.1.42-1

SLURM update
  • Minor update within 20.02.7-1

    • Fixes Spank plugin environment variables

    • Adds an additional check in the submission filter, to submit to the booster queue by default when submitting from Booster nodes

Skyway replacements
  • All the skyway units have been replaced with the GA HW version

    • This includes updating the software to the latest version

2021-06-08 GPFS and SLURM updates (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Batch system, Storage, Network

GPFS update
  • Updated to 5.1.1-1

Update psmgmt
  • Updated to psmgmt-5.1.41-0

SLURM update
  • Updated to 20.02.7-1

    • This gets rid of the GTK2 dependencies

Switch replacements
  • jwb-17-L2-03 has been replaced

  • jwb-13-L2-03 has been replaced

2021-05-11 SLURM update (Damian Alvarez, JSC)

Update type: Maintenance, Announcement, SW Modules, Batch system

SLURM update

  • Slurm has been patched to mitigate CVE-2021-31215

UCX as default for ParaStationMPI in the cluster

  • ParaStationMPI uses now UCX as default also in the cluster module

2021-04-16 Technical State update (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages

SLURM change

The old FQDNs in the JUWELS Cluster have been removed from the SLURM configuration

TS Upgrade - TS 44.01

The following components have been updated:

  • PMC

  • EMC

  • WMC

  • TMC

  • HMC

  • BMC

Kernel update

The kernel has been updated to 4.18.0-240.22.1.el8_3.x86_64 in all nodes (top island and compute nodes)

2021-03-25 Acceptance tests (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Network, Other

OpenSM testing

A new OpenSM release 5.8.2 has been tested for failover. This version delays the re-registration of clients after the failover takes place, accelerating the process that way. The tests indicate good performance in the fabric and between the cluster and JUST, but no improvement between the booster and JUST over the Skyways.

The version has been reverted to 5.7.3 at the end of the maintenance due to extra problems connecting to the XCST

Updated psmgmt to 5.1.38-3

This new version fixes the X forwarding bug that was preventing Slurm from performing it correctly

2021-03-09 Migration of cluster ISMAs (Damian Alvarez Mallon, JSC)

Update type: General configuration

Update of cluster ISMAs

They have been migrated to CentOS 8.3 (both baremetals). The PCS cluster per pair has also been deployed

Update of master nodes

They have been updated to CentOS 8.3 (both baremetals).

jwsm[00-01] recabling

They have been recabled on the booster admin switches

Remove CUDA_VISIBLE_DEVICES from environment on the juwels gpu nodes

With the update of psmgmt the workaround in the CUDA module is no longer necessary. It has been removed

2021-02-23 UCX update, ISMA migration (Damian Alvarez, JSC)

Update type: Maintenance, SW Modules, General configuration, Network

psmgmt update

psmgmt has been updated to 5.1.38-2. This fixes a protocol incompatibility problem with slurmctld and a segmentation fault when using heterogeneous jobs.

The cluster ISMAs have been migrated to CentOS 8
  • DNS have been updated to point to new containers

  • Imaging and configuration has been moved to new containers

  • PMSM has been moved out to a separate container due to its CentOS 7 requirement

OpenSM configuration

OpenSM now uses 2 ports per node.

User modules updates
  • GCC now supports GPU offload

  • UCX has been changed to 1.9.0 from 1.8.1 for both ParaStationMPI and OpenMPI

  • pscom has been updated to 5.4.7-1 (from 5.4.6-1). This fixes an issue where an error on pscom was not properly propagated to psmpi, leaving the job running without progressing

Increased size of /dev/shm

/dev/shm has been increased to 85% of the memory size

2021-02-09 CentOS 8.3 update (Damian Alvarez Mallon, JSC)

Update type: Maintenance, OS Packages, Storage, Network, Other

InfiniBand switches
  • jwc00isw216 has been replaced

  • jwc04isw114 has been replaced

OpenSM configuration
  • Increased the max number of SMPs on the wire to 32 (from 8).

  • Set the number of threads for routing calculation to 0, except for updn with LID tracking, where it is kept at 1

  • Extra HCAs added to the subnet managers to enable multiport MAD pushing

Software updates
  • Updated user exposed nodes to CentOS 8.3

  • Updated to OFED 5.1-2580

  • Updated to NVIDIA 460.32.03

  • Updated to Singularity 3.7.1

2021-01-28 FW updates (Damian Alvarez, JSC)

Update type: Maintenance, OS Packages, Storage, Network, Other

TS update on Booster nodes

Update of BIOS, CPLD and BMC on the nodes. Example node:

NAME       BOARD  COMPONENT   VERSION
pm3-bmc84  CER-G  BIOS        BIOS_RME090.18.25.001
                  CPLD        1.2
                  CPLD_CER    1.7
                  CPLD_CWG    0.1.2
                  SW0_CWG     1.2
                  SW1_CWG     1.2
                  FPGA_RDSTN  2.7
                  MC          60.39.00.0000

New PCIe switch FW

Version 1.2 fixes the bidirectional BW issue.

Enable assert on NSD checksum error

Enabled on the booster GPFS configuration, to protect JUST from the checksum errors caused by the skyways, which result in disks being taken down and general JUST availability problems

Setup new pscluster containers as part of cluster CentOS8 migration

The isma setup in cell 01 has been migrated to CentOS8

TOP-Lvl Switches: temporary cables

The cabling between cluster and booster has been limited to 5 cables to switches top[43-47]. This is temporary until the 200 links are correctly connected.

IB Switch FW update

Going from 27.2008.1904 to 27.2008.2102 on the HDR switches, and to 15.2008.2102 in EDR switches

Set MTU of the IPoIB Interfaces on the booster to 4000

As part of the skyway stabilization efforts, the MTU has been set to 4000, since the normal 4092 MTU was creating corrupted packets

Modify GPFS cluster on JUWELS (Cluster)

The GC and image have been moved to the new local GPFS cluster, getting rid of the legacy setup.

Update IME config

IME has been updated to 1.4.1.slice-141029 and reconfigured following DDN recommendations

2021-01-12 Various updates (Damian Alvarez Mallon, JSC)

Update type: Maintenance, Batch system, Storage, Network, Other

New Skyway configuration

The Skyway gateways connecting the booster to storage have been reconfigured to support 8x4 HCAs instead of 1x4

Update of SLURM

SLURM has been updated to 20.02.6

Update of GPFS in service nodes and GC

To 5.1.0-1 in both parts of the system

Update of psmgmt

To 5.1.35. Changes the pinning strategy on GPU nodes, to assign GPUs and HCAs properly when using more processes than GPUs

OpenSM configuration changes

Optimizations suggested by Mellanox to OpenSM configuration

InfiniBand work

Recabling of various broken links on the cluster part of the system

Cell00 HYC replacement

pm-hmc1 was faulty and has been replaced

jwslurm[00-01] renaming to jwslurm[01-02]

The newly installed baremetals are now hosting the jw-slurm container

Switch entries in DNS

Switch names and aliases have been added to the DNS. Necessary for IBMS

Modify GPFS cluster on JUWELS (Cluster)

The GPFS cluster managers have been temporarily moved to inactive logins.

2020-12-08 Maintenance for booster acceptance tests (Damian Alvarez, JSC)

Update type: Maintenance, Storage, Network

jwc07isw118 replaced
  • The switch is back in production, and with it the nodes connected to it

New route to a JUST subnet in the cluster images (CPU and GPU)

The following subnet route has been added in the cluster part:

134.94.76.0/23. This should be routed via the gateway corresponding to the pkey interface of the node, and of course over the pkey interface. In other words, these routes, depending on the node:

134.94.76.0/23 via 10.11.168.1 dev ib0.8007
134.94.76.0/23 via 10.11.160.1 dev ib0.8006
134.94.76.0/23 via 10.11.176.1 dev ib0.8008

New routes to a JUST subnet in the booster images

All the routes have been added in all the nodes. Depending on the group of nodes:

134.94.100.0/23 via 10.13.22.11 dev ib0
134.94.102.0/23 via 10.13.22.11 dev ib0
134.94.140.0/23 via 10.13.22.11 dev ib0
134.94.15.0/24 via 10.13.22.11 dev ib0
134.94.74.0/23 via 10.13.22.11 dev ib0
134.94.76.0/23 via 10.13.22.11 dev ib0
134.94.100.0/23 via 10.13.22.12 dev ib0
134.94.102.0/23 via 10.13.22.12 dev ib0
134.94.140.0/23 via 10.13.22.12 dev ib0
134.94.15.0/24 via 10.13.22.12 dev ib0
134.94.74.0/23 via 10.13.22.12 dev ib0
134.94.76.0/23 via 10.13.22.12 dev ib0
134.94.100.0/23 via 10.13.22.13 dev ib0
134.94.102.0/23 via 10.13.22.13 dev ib0
134.94.140.0/23 via 10.13.22.13 dev ib0
134.94.15.0/24 via 10.13.22.13 dev ib0
134.94.74.0/23 via 10.13.22.13 dev ib0
134.94.76.0/23 via 10.13.22.13 dev ib0
134.94.100.0/23 via 10.13.22.14 dev ib0
134.94.102.0/23 via 10.13.22.14 dev ib0
134.94.140.0/23 via 10.13.22.14 dev ib0
134.94.15.0/24 via 10.13.22.14 dev ib0
134.94.74.0/23 via 10.13.22.14 dev ib0
134.94.76.0/23 via 10.13.22.14 dev ib0

As always on the booster, no pkeys, just the main ib interface

Update cluster and booster to psmgmt 5.1.34

Update cluster and booster to 5.1.34

IME software + config update

The IME config on the cluster has received some performance tuning for the Skylake nodes. In addition, a new IME version (1.4.1.slice-141026) has been installed

2020-11-10 SLURM cluster-booster unification (Damian Alvarez Mallon, JSC)

Update type: Maintenance, Announcement, SW Modules, General configuration, Batch system, Network

SLURM merge

Both cluster and booster slurm instances have been merged. It is now possible to submit jobs to both sides of the system

Cell 5

The cell is back in production

InfiniBand network
  • The FW has been updated in all compute HCAs and switches.

  • Cluster and Booster have been re-merged

  • The fabric has also been cleaned up

IME update

New version has been installed

OpenMPI failure in CentOS 8

In some circumstances, when doing MPI-IO, one could see this failure:

[jwc09n006.adm09.juwels.fzj.de:08897] mca_base_component_repository_open: unable to open mca_fs_gpfs: libevent_core-2.0.so.5: cannot open shared object file: No such file or directory (ignored)
[jwc09n006.adm09.juwels.fzj.de:08882] mca_base_component_repository_open: unable to open mca_fs_gpfs: libevent_core-2.0.so.5: cannot open shared object file: No such file or directory (ignored)
[jwc09n006.adm09.juwels.fzj.de:08907] mca_base_component_repository_open: unable to open mca_fs_gpfs: libevent_core-2.0.so.5: cannot open shared object file: No such file or directory (ignored)

That is a byproduct of compiling in CentOS 7 at the beginning of the stage deployment. It has been fixed during the maintenance by recompiling OpenMPI.

Remove ParaStationMPI GPFS support on ROMIO

Some users reported problems when using HDF5 on the new stage (on the booster). The issue is reliably resolved when setting ROMIO_FSTYPE_FORCE=ufs:. As relying on that variable also disables IME, ParaStationMPI has been recompiled without GPFS support on ROMIO.
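
For reference, the workaround mentioned above can still be applied per job by exporting the variable before launching (keeping in mind that, as described, it also disables IME):

export ROMIO_FSTYPE_FORCE=ufs: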

Update cluster nodes to psmgmt 5.1.32-0

To homogenise with the booster, psmgmt 5.1.32-0 has been installed.

2020-11-02 Cluster-Booster InfiniBand merge, CentOS 8 migration and software stack update (Damian Alvarez Mallon, JSC)

Update type: Maintenance, SW Modules, OS Packages, General configuration, Network, Other

Change to 55V on the PSUs
  • Before it was set to 54V

  • Meant to address recent throttling events

InfiniBand FW updates
  • On HCAs and switches

InfiniBand merge
  • Both fabrics have been merged

  • Chain routing setup to have updn for IO, ftree for cluster part, and dfp for the Booster

Update IPoIB addresses
  • For the cluster/booster merge, new IP addresses for the IPoIB devices on the cluster nodes have been assigned

  • PSP_NETWORK has been updated for it

  • New justime IPoIB IPs

Migrate compute nodes to CentOS 8 images
  • Compute nodes have been upgraded to CentOS 8

Move to 2020 stage
  • The 2020 stage has been made default

10GbE card in juwels11
  • For the ceph network

Move software mountpoint
  • From /gpfs/software to /p/software/juwels

Change DNS RR on login nodes to migrate to CentOS 8
  • New login nodes based on CentOS 8 (jwlogin[04-10] and jwvis[02-03])

2020-08-25 Regular maintenance (Damian Alvarez Mallon, JSC)

Update type: Maintenance, OS Packages, GPUs

Cell HW

HYCs in cells 6 and 7 have been replaced. The TMC in cell 0 has been replaced.

MAD control options

The maximum number of in-flight MAD datagrams is now limited to 1, set as a drop-in file in /system/openibd.service.d/ticket4005.conf

New nvidia driver

The driver version 450.56.01 has been installed, for CUDA 11 compatibility.

Reenable Singularity

Singularity has been deployed locally on the logins and compute nodes via (RPM-based) installation.

10 GbE cards for Ceph access

The following nodes have now extra 10 GbE cards:

  • jwm[00,01]

  • jwsm[00,01]

  • jwlogin[00-03]

  • jwvis[00,01]

2020-07-13 Network migration (Damian Alvarez)

Update type: Maintenance, Network, Other

Network migration

The admin network has been migrated to enable the integration with the Booster in the near future

Cell 09 switch backplane

The switch backplane (BOD/S) in cell 09 has been replaced

Ceph network

The ISMA, monitoring and SLURM nodes have been equipped with 10 GbE cards to access in the future the Ceph network

2020-06-23 HW maintenance (Damian Alvarez)

Update type: Maintenance, Network, Other

Replacement of HYC in cell 4
Replacement of switches
  • jwc00isw222

  • jwc02isw208

  • jwc03isw210

  • jwc05isw214

  • jwc07isw[216,218]

  • jwc04isw204

Update of pscluster containers
  • To CentOS 7.8

2020-06-04 Changes after security incident (Damian Alvarez)

Update type: Maintenance, Announcement, SW Modules, OS Packages, General configuration, Batch system, Storage, Network, Other

Security changes

User visible and incomplete list of changes:

  • Revoked ssh keys

  • Revoked ssh host keys

  • Strong recommendation to use from= clauses in authorized_keys (see the example below)
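
A hedged example of such a from= clause in ~/.ssh/authorized_keys (the host pattern and key material are placeholders):

from="*.example.org,192.0.2.*" ssh-ed25519 AAAAC3NzaPlaceholderKey user@workstation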

CentOS update

Update from CentOS 7.7 to CentOS 7.8

Phase rebalancing

The electric phases have been permuted to balance all 3 phases evenly

jwlogXX

Enabled SR-IOV setup

IB Firmware Upgrade

The firmware of the following components has been updated:

  • L1/L2/L3 switches

  • HCA Update for all nodes

New psmgmt 5.1.30

This includes the pinning changes

Rollout slurm role from hps-config

This also implies a few changes on the compute nodes.

New default modules

Updated defaults:

  • Default PGI module: 19.3 -> 19.10

  • Default Intel module: 2019.3 -> 2019.5

  • Default ParaStationMPI: 5.2.2-1 -> 5.4

  • Default IntelMPI: 2018.5 -> 2019.6

XDG_RUNTIME_DIR not existing in compute nodes

$XDG_RUNTIME_DIR is set on login to /run/usr/$ID. This directory is used by a few programs and libraries (like Qt), and created by pam on a normal system. However, on compute nodes, the directory did not exist until this update.

2020-04-28 Phase verification maintenance (Damian Alvarez)

Update type: Maintenance, General configuration

  • Phase load tests were performed

  • Roles from hps-config deployed on top island:

    • postfix

    • mellanox

2020-03-31 TS update (Damian Alvarez)

Update type: Maintenance

  • Finished update of TS, left incomplete in the previous maintenance

  • Deployed new OpenSM role (from hps-config)

2020-03-17 Technical State Update (Damian Alvarez)

Update type: Maintenance, OS Packages

New TS
  • Updated FW across the system

Add IME servers to DNS
  • IME servers have been added to the DNS

New packages
  • Minor OS update, including new minor kernel update

2020-02-11 MPI settings modules (Damian Alvarez)

Update type: SW Modules

  • MPI modules now enable the loading of mpi-settings for easier tuning of MPI parameters

2020-02-04 New PGI compiler and Intel MPI version (Damian Alvarez)

Update type: SW Modules

  • PGI 19.10 installed (but not default)

  • IntelMPI 2019.6 installed (but not default)

2020-01-28 Update to CentOS, OFED and bypass installation on cooling loop (Damian Alvarez)

Update type: Maintenance, SW Modules, OS Packages, Batch system, Other

CUDA MPS support on SLURM

JUWELS has now support for CUDA MPS through SLURM. Example:

salloc -p develgpus --gres=gpu:4 -t 40 --cuda-mps zsh

Update to CentOS 7.7 on CPU and GPU nodes
  • The OS on the compute nodes has been updated to CentOS 7.7

  • OFED has been updated to 4.7-3

  • GPFS has been updated to 5.0.4-1

Update to CentOS on the top island
  • The OS on the admin nodes and containers has been updated to CentOS 7.7

  • OFED has been updated to 4.7-3

  • GPFS has been updated to 5.0.4-1

Update psmgmt to 5.1.28

User relevant changes:

  • Add OMPI_* variables to client environment in pspmix

  • Improve Slurm message reply code of psslurm

  • Revert “Enhancement: set SLURM_NTASKS_PER_NODE if it was not set by sbatch”

Default UCX

The default UCX for ParaStationMPI has been changed to 1.6.1

Cooling infrastructure

The primary cooling loop now has a bypass to allow the Sequana cells to control the valves of the primary loop that are on their racks.

2020-01-17 New MVAPICH2-GDR version (Damian Alvarez)

Update type: SW Modules

  • MVAPICH2-GDR 2.3.3 has been installed and made default in production

2020-01-14 Connection to HPST (Damian Alvarez)

Update type: Maintenance, OS Packages, Storage, Network

Connection with HPST
Install HPST client RPMs in the compute images

The following packages have been installed

  • ime-client

  • ime-net-cci

  • fuse

  • ime-common

  • ime-ulockmgr

  • libcci

  • libisal

  • python-babel

  • python-jinja2

  • python-markupsafe

2019-12-10 SLURM and IB fabric updates (Damian Alvarez)

Update type: Maintenance, Batch system, Network

JUWELS IB fabric
  • The FW of all the switches has been updated

SLURM update
  • SLURM is now installed as RPM, and has been updated to version 19.05.4

  • The maximum number of nodes in batch has been increased to 1024

Install fix to resolve increased IB errors when sideband is activated
  • A fix has been installed on the Sequana switches to enable sideband functionality without increasing the IB symbol error counters

2019-11-18 HDR switches, VR 2.2 update and others (Damian Alvarez)

Update type: Maintenance, SW Modules, Network, Other

JUWELS IB fabric
  • The top level EDR switches have been replaced by HDR switches

  • 4 defective L1 and L2 switches have been replaced

  • IBACM has been disabled

Update VR version to 2.2
  • VR 2.2 has been installed on the compute nodes, to fix PVCCIN issues

Supermicro firmware
  • The service nodes BIOS and BMC have been updated to address PCI reordering problems

GPFS client configuration
  • /var/mmfs/mmsysmon/mmsysmonitor.conf has been updated with clitimeout = 32, maxretries = 5, csmspeed = 10 and maxcsmretries = 1 to avoid flooding on the quorum nodes:

2019-11-07_20:50:18.443+0100: [N] The server side TLS handshake with node 10.11.160.76 was cancelled: connection reset by peer (return code 420).
2019-11-07_20:50:18.443+0100: [E] sdrServ: Communication error on socket 3066 (10.11.160.76) @handleRpcReq/AuthenticateIncoming, [err 146] Internal server error
2019-11-07_20:50:18.444+0100: [N] The server side TLS handshake with node 10.11.171.75 was cancelled: connection reset by peer (return code 420).
2019-11-07_20:50:18.444+0100: [E] sdrServ: Communication error on socket 3064 (10.11.171.75) @handleRpcReq/AuthenticateIncoming, [err 146] Internal server error

Update LXC
  • LXC on admin nodes has been updated to 3.0.4 from 1.0.X

juwelsm01 and SELinux
  • SELinux has been enabled on juwelsm01. It was disabled by mistake.

Flexible module naming scheme
  • The user modules in production have been adapted to work with a flexible module naming scheme. Minor updates of compilers and MPIs are possible without full toolchain duplication now.

2019-11-07 IB network updates (Damian Alvarez)

Update type: Maintenance

JUWELS IB fabric update
  • The firmware of the switches and gateways has been updated to the latest version.

  • IBACM has been disabled

Supermicro firmware
  • The BMC and BIOS on admin nodes have been upgraded. jwslurm[00-01], jwsm[00-01], jwvis[00-01] are still pending

psmgmt
  • psmgmt has been updated to 5.1.26. This includes various fixes affecting user jobs, PMIx (for OpenMPI) and PMI (for Intel MPI), and psid crashes.

Update LXC
  • LXC has been updated on jwlogin10 (from lxc-1.0.11-2.el7.x86_64 to lxc-3.0.4-2.el7.x86_64) as a stability test

Updated nvidia driver on GPU partition
  • The nvidia driver has been updated from 418.40.04 to 418.87.00

Updated OFED on the login nodes and admin nodes to 4.6
  • The whole cluster is now running on the same OFED version

OS update on computes and logins
  • Minor kernel update

  • Minor update of packages

gdrcopy on the gpu nodes
  • The gdrdrv kernel module has been installed, and the gdrcopy service enabled

2019-10-24 Max jobs in queue (Damian Alvarez)

Update type: Batch system

  • The maximum number of jobs in the queues has been increased to 20000 (from 10000)

2019-10-22 Change in IPoIB qlen (Damian Alvarez)

Update type: Network

  • The QLEN of ib0 has been increased to 4096 from 256. The previous value was causing RDP: MYsendto to XXX.XXX.XXX.XXX(0): No buffer space available errors.

2019-10-17 Changes in nvidia and MVAPICH2 modules - OTRS #1031954 (Damian Alvarez)

Update type: SW Modules

  • The nvidia module now has a link libnvidia-ml.so -> libnvidia-ml.so.1 to allow applications to link against it, instead of using stubs

  • The MVAPICH2 compiler wrappers now point to $EBROOTNVIDIA/lib64 instead of $EBROOTCUDA/lib64/stubs

2019-10-15 Updates in login nodes and large partition (Damian Alvarez)

Update type: OS Packages, Batch system

  • The large partition now excludes the devel nodes and has the same nodes as the batch partition

  • OFED 4.6 libraries have been installed in the login containers

2019-10-10 IPoIB update (Damian Alvarez)

Update type: Maintenance, SW Modules, Network

  • OFED update: A new patched driver by Mellanox has been installed, which fixes the IPoIB issues that the system had from the beginning. This update has also been applied on the GPU nodes. The OFED version is now 4.6.

  • Admin nodes: The update to CentOS 7.6 on the admin nodes has been completed (including master nodes, slurm and subnet manager nodes). GPFS has also been updated to 5.0.3 in the whole partition.

  • IntelMPI: The default Intel MPI module in the current stage (2019a) is now 2018.5.288, instead of 2019.3.199

  • VR firmware: The VR firmware has been updated to 2.1 also in Cell 9, following the cells that were upgraded in the previous maintenance.

2019-09-30 InfiniBand Update (Damian Alvarez)

Update type: Maintenance, General configuration, Network, Other

Updates in admin nodes infrastructure

Updated log baremetals

Update OFED on the compute nodes to 4.6

OFED has been updated to version 4.6, including a custom patch by Mellanox to address the IPoIB issues present in JUWELS. Scalability has been increased significantly. The general roll-out still requires pending changes in the environment.

/etc/locale.conf in compute images

Locale updated to LANG="en_US.UTF-8"

2019-09-25 (Damian Alvarez)

Update type: OS Packages, General configuration

  • Installation of nvidia-libXNVCtrl in juwelvis[00-03]

  • Set of LANG=en_US.UTF-8 via /etc/locale.conf in login nodes

2019-09-24 VR update (Damian Alvarez)

Update type: Maintenance, General configuration

Update VR version to 2.1 to address throttling events

The voltage regulator firmware has been upgraded to 2.1 to address most of the CPU throttling events we have seen in JUWELS.

Add ib.juwels.fzj.de to /etc/resolv.conf in compute image

Now applications and tools can resolve the hostname from nodes in other cells. This is necessary for some functionality provided by TotalView

OS update in most of the admin nodes

This includes baremetals and containers, but does not include all of them

Increased size of /dev/shm

/dev/shm on the compute nodes is now 85% of the total memory capacity

2019-09-11 Beginning of the changelog (Damian Alvarez)

Update type: Announcement

Initial state of the changelog