Configuration for Jülich Storage Cluster (JUST)

[Figure: The Jülich Storage Cluster (JUST). Copyright: FZ Jülich]
The configuration of the Jülich Storage Cluster (JUST) is continuously evolving and expanding to integrate newly available storage technology and to meet the ever-growing capacity and I/O bandwidth demands of the data-intensive simulation and learning applications on the supercomputers. Currently the 5th generation of JUST consists of 26 Lenovo DSS (Distributed Storage Solution) systems. The software layer of the storage cluster is based on IBM Spectrum Scale (GPFS). JUST and JUST-DATA together provide a gross capacity of more than 130 PB.
The lowest storage layer, as well as the backup of user data, is kept on tape technology.
For details see the table below.
JUST numbers

| | JUST-DSS | JUST-DATA | JUST-HPST | JUST-TSM | JUST-COM | JUST Total |
|---|---|---|---|---|---|---|
| Capacity | 75 PB gross, ca. 50 PB net | 94.6 PB gross, ca. 75 PB net | 2.2 PB gross, ca. 1.8 PB net | 11.6 PB gross, ca. 8 PB net | 4 PB gross, ca. 3 PB net | 187.4 PB gross, 137.8 PB net |
| Racks | 16 | 16 | 4 | 3 | 1 | 40 |
| Servers | 44 + 5 Mngt + 2 CES | 14 + 1 Mngt + 24 CES (8x3) | 110 | 6 + 9 ISP | 2 + 1 Mngt + 3 CES | 221 |
| Disk Enclosures | 90 | 108 | 0 | 9 | 4 | 211 |
| Disks (*) | 7516 + 44 SSD | 7936 | 1,100 NVMe | 692 + 6 SSD | 334 + 2 SSD | 14,750 |
JUST Tiered Storage

[Figure: JUST tiered storage. Copyright: FZ Jülich]
JUST provides different types of storage repositories to fit the various use cases of user data and workflows.
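Throughout the hardware listing below the repositories are referred to by the variables $HOME, $SCRATCH, $FASTDATA, $DATA and $ARCHIVE. The following is only a minimal sketch of how a job script or application could resolve these repositories, assuming they are exposed as environment variables in the user's session; the fallback behaviour is purely illustrative:

```python
import os

# Repository names used in this section; the comments reflect the use cases
# described on this page, not a complete policy.
TIERS = {
    "home":     "HOME",      # small, permanent user data
    "scratch":  "SCRATCH",   # large temporary job data (Large Capacity Storage Tier)
    "fastdata": "FASTDATA",  # large project data with high bandwidth demands
    "data":     "DATA",      # extended-capacity project data (JUST-DATA / XCST)
    "archive":  "ARCHIVE",   # long-term data, migrated to tape by HSM
}

def repository_path(tier: str) -> str:
    """Resolve a repository root from the environment, if the variable is set."""
    var = TIERS[tier]
    path = os.environ.get(var)
    if path is None:
        raise RuntimeError(f"${var} is not set in this environment")
    return path

if __name__ == "__main__":
    for name in TIERS:
        try:
            print(f"{name:8s} -> {repository_path(name)}")
        except RuntimeError as err:
            print(f"{name:8s} -> {err}")
```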
JUST Architecture

[Figure: JUST architecture. Copyright: FZ Jülich]
JUST Hardware Characteristics
JUST-LCST: Cluster JUSTGSS
- 1 x DSS-G 26 (10 TB)
- each 2 x Lenovo x3650 M5 Systems (x-Series)
each 2 x Intel Xeon Processors E5-2690, 14 cores, 2.66 GHz, 384 GB Memory
each 3 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, DSSG (Spectrum Scale + GPFS Native RAID)
- each 6 x DSS-Storage (JBODs)
each 2 x drawers with 42 slots
each 84 x 10 TB NL-SAS Disks
1 DSS-Storage with 2 x 400 GB SSD (GPFS-GNR Configuration and Logging)
each 502 NL-SAS Disks and 2 SSDs
each 5 PB gross, 3.6 PB net (8+3P)
JUST User Data and Metadata ($ARCHIVE): Disk Cache of the Archive layer (see the 8+3P net-capacity sketch at the end of this listing)
- 18 x DSS-G 24 (10 TB)
- each 2 x Lenovo x3650 M5 Systems (x-Series)
each 2 x Intel Xeon Processors E5-2690, 14 cores, 2.66 GHz, 384 GB Memory
each 3 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, DSSG (Spectrum Scale + GPFS Native RAID)
- each 4 x DSS-Storage (JBODs)
each 2 x drawers with 42 slots
each 84 x 10 TB NL-SAS Disks
1 DSS-Storage with 2 x 400 GB SSD (GPFS-GNR Configuration and Logging)
each 334 NL-SAS Disks and 2 SSDs
each 3.3 PB gross, 2.4 PB net (8+3P)
JUST User Data and Metadata ($SCRATCH and $FASTDATA): Large Capacity Storage Tier
- 3 x DSS-G 24 (10TB)
- each 2 x Lenovo x3650 M5 Systems (x-Series)
each 2 x Intel Xeon Processors E5-2690, 14 cores, 2.66 GHz, 384 GB Memory
each 3 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, DSSG (Spectrum Scale + GPFS Native RAID)
- each 4 x GSS-Storage (JBODs)
each 2 x drawers with 42 slots
each 84 x 10 TB NL-SAS Disks
1 GSS-Storage with 2 x 200 GB SSD (GPFS-GNR Configuration and Logging)
each 334 NL-SAS Disks and 2 SSDs
each 3.3 PB gross, 2.4 PB net (8+3P)
JUST User Data and Metadata ($HOME)
- 2 x Management Server (ThinkSystem SR650)
each 2 x Intel Skylake Processors Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 2 x Mellanox ConnectX-4 Dual-Port 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, xCAT 2.13
- 1 x Monitoring Server
each 2 x Intel Skylake Processors Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 2 x Mellanox ConnectX-4 Dual-Port 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Prometheus + Alert Manager + Grafana
- 5 x GPFS Management Server (ThinkSystem SR650)
each 2 x Intel Skylake Processors Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 2 x Mellanox ConnectX-4 Dual-Port 100 Gigabit Ethernet Adapter
Software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
- 2 x GPFS-CES (Cluster Export Service) (IBM Power6 520 System)
each 2 x Intel Skylake Processors Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 2 x Mellanox ConnectX-4 Dual-Port 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
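The DSS building blocks above quote both gross and net capacities for an 8+3P GPFS Native RAID layout, i.e. 8 data strips plus 3 parity strips per stripe. The following is only a back-of-the-envelope sketch of how the net figures relate to the gross figures; real file systems lose a little more to spare space and metadata, which is why the values above are rounded:

```python
# Back-of-the-envelope net capacity for an 8+3P GNR (GPFS Native RAID) layout:
# each stripe holds 8 data strips and 3 parity strips, so at most 8/11 of the
# raw capacity stores user data (spare pdisks and metadata reduce this further).
DATA_STRIPS, PARITY_STRIPS = 8, 3

def net_capacity_pb(gross_pb: float) -> float:
    return gross_pb * DATA_STRIPS / (DATA_STRIPS + PARITY_STRIPS)

for name, gross_pb in [("DSS-G 26 ($ARCHIVE)", 5.0),
                       ("DSS-G 24 ($SCRATCH/$FASTDATA/$HOME)", 3.3)]:
    print(f"{name}: {gross_pb} PB gross -> ~{net_capacity_pb(gross_pb):.1f} PB net")
# Prints roughly 3.6 PB and 2.4 PB, matching the figures quoted above.
```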
JUST-DATA: Extended Capacity Storage Tier (XCST) for community data sharing
- 4 x GPFS building block (Phase 1 - since Q2 2018)
- each 2 x Lenovo ThinkSystem SR650 NSD Server
each 2 x Intel Xeon Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 4 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
- each 4 x DS6200 storage system
each 3 x D3284 Enclosures
each 252 x 10 TB NL-SAS Disks
each 10 PB gross
User Data and Metadata ($DATA)
- 1 x GPFS building block (Phase 2 - since Q1 2019)
- each 2 x Intel Xeon Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 4 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
- each 4 x DS6200 storage system
each 3 x D3284 Enclosures
each 252 x 12 TB NL-SAS Disks
each 12 PB gross
User Data and Metadata ($DATA)
- 1 x GPFS building block (Phase 3 - since Q3 2019)
- each 2 x Intel Xeon Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 4 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
- each 4 x DE6000 storage system
each 4 x DE600 Enclosures
each 260 x 12 TB NL-SAS Disks
each 12 PB gross
User Data and Metadata ($DATA)
- 1 x GPFS building block (Phase 4 - since Q3 2020)
- each 2 x Intel Xeon Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 4 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
- each 4 x DE6000 storage system
each 4 x DE600 Enclosures
each 260 x 14 TB NL-SAS Disks
each 14 PB gross
User Data and Metadata ($DATA)
- 1 x GPFS building block (Phase 5 - since Q4 2021)
- each 2 x Intel Xeon Gold 6226R, 16 cores, 2.9 GHz, 384 GB Memory
each 4 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port SFP + 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
- each 4 x DE6000 storage system
each 4 x DE600 Enclosures
each 260 x 14 TB NL-SAS Disks
each 14 PB gross
User Data and Metadata ($DATA)
- 1 x GPFS Management Server (ThinkSystem SR650)
each 2 x Intel Xeon Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 2 x Mellanox ConnectX-5 Dual-Port 100 Gigabit Ethernet Adapter
Software: RedHat Enterprise Linux, Spectrum Scale (GPFS)
- 8 x IBM Power S822 for GPFS Cluster Export Service (CES)
each 2 x Power8 Processor, 12 cores, 3.026 GHz, 512 GB Memory
each 3 x Dual-Port 100 Gigabit Ethernet
each 3 x LPARs to run virtual nodes
software: RedHat Enterprise Linux, Spectrum Scale (GPFS), NFS (Ganesha)
JUST-COM: Storage Cluster for Communities (HDFCloud VMs)
- 1 x DSS-G 24 (12 TB)
- each 2 x Lenovo SR650 (x-Series)
each 2 x Intel Xeon Gold 6240, 18 cores, 2.6 GHz, 384 GB Memory
each 3 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port ConnectX-5 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, DSSG (Spectrum Scale + GPFS Native RAID)
- each 4 x DSS-Storage (JBODs)
each 2 x drawers with 42 slots
each 84 x 12 TB NL-SAS Disks
1 DSS-Storage with 2 x 400 GB SSD (GPFS-GNR Configuration and Logging)
each 334 NL-SAS Disks and 2 SSDs
each 4 PB gross, 3 PB net (8+3P)
Block device for HDFCloud VMs
- 3 x Lenovo ThinkSystem SR630 for GPFS Cluster Export Service (CES)
each 2 x Intel Xeon Gold 6240, 18 cores, 2.6 GHz, 384 GB Memory
each 2 x Mellanox Dual-Port ConnectX-5 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Scale (GPFS), NFS (Ganesha)
JUST-TSM: Server and Storage
- 8 x ISP-Server (IBM Power System S822)
each 2 x Power8 Processor, 10 cores, 3.42 GHz, 256 GB Memory
each 4 x 16Gbps Dual-Port FC Adapter
each 2 x Mellanox Dual-Port 100 Gigabit Ethernet Adapter
software: AIX 7.2, Spectrum Protect (ISP) Server + Client, Spectrum Scale (GPFS)
- 1 x ISP-Server (ThinkSystem SR650)
each 2 x Intel Xeon Gold 6142, 16 cores, 2.6 GHz, 384 GB Memory
each 2 x 16Gbps Dual-Port FC Adapter
each 2 x Mellanox ConnectX-5 Dual-Port 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, Spectrum Protect (ISP) Server + Client, Spectrum Scale (GPFS)
- 1 x NIM Server (IBM Power System S821)
1 x Power8 Processor, 4 cores, 3.0 GHz, 32 GB Memory
1 x Quad-Port 10/1 Gigabit Ethernet Adapter
software: AIX 7.1, NIM
- 1 x Hardware Management Console (IBM Power System 7)
Power Systems Management
software: Linux, HMC 8.7.0
- 2 x DSS-G 240 (16 TB)
- each 2 x Lenovo ThinkSystem SR650 Model 7X06
each 2 x Intel Xeon Gold 6240, 18 cores, 2.6 GHz, 384 GB Memory
each 4 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port ConnectX-6 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, DSSG (Spectrum Scale + GPFS Native RAID)
- each 4 x DSS-Storage (JBODs)
each 2 x drawers with 42 slots
each 84 x 16 TB NL-SAS Disks
1 DSS-Storage with 2 x 800 GB SSD (GPFS-GNR Configuration and Logging)
each 334 NL-SAS Disks and 2 SSDs
each 5.3 PB gross, 3.9 PB net (8+3P)
ISP storage disk pools and ISP logs
- 1 x DSS-G 201 (3.84 TB)
- each 2 x Lenovo SR650 (x-Series)
each 2 x Intel Xeon Gold 6240, 18 cores, 2.6 GHz, 384 GB Memory
each 4 x Quad-Port SAS 12Gb HBA
each 2 x Mellanox Dual-Port ConnectX-5 100 Gigabit Ethernet Adapter
software: RedHat Enterprise Linux, DSSG (Spectrum Scale + GPFS Native RAID)
- each 1 x DSS-Storage (JBOD)
each 1 x drawer with 24 slots
each 24 x 3.84 TB SAS Solid State Disks (SSD)
1 DSS-Storage with 2 x 400 GB SSD (GPFS-GNR Configuration and Logging)
DB storage device for ISP instances
JUST-HPST: Server and Storage
- 110 x IME-140 servers from DDN (DataDirect Networks)
each 2 x Intel Xeon Silver 4108, 16 cores, 1.8 GHz, 92 GB Memory
each 2 x Mellanox ConnectX-6 Dual-Port (3x IB, 1x 100GE)
each 10 x 2 TB Intel NVMe
each 1 x 1 TB NVMe (Commit Log)
software: CentOS, Spectrum Scale (GPFS), Infinite Memory Engine (IME)
cache layer on top of $SCRATCH file system
one global namespace in three slices (JUWELS: 54 servers, JURECA DC: 44 servers, JUSUF: 10 servers)
2.2 PB gross global capacity
2 TB/s total bandwidth
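The totals above can be broken down per server as a simple plausibility check; this is plain arithmetic on the stated figures, and the per-server bandwidth is only the implied average, not a device specification:

```python
# Plain arithmetic on the JUST-HPST totals listed above.
servers = 110
data_nvme_per_server, nvme_capacity_tb = 10, 2   # 10 x 2 TB data NVMe per IME-140
global_capacity_pb = 2.2
global_bandwidth_tb_s = 2.0

capacity_per_server_tb = global_capacity_pb * 1000 / servers
print(f"capacity per server: {capacity_per_server_tb:.0f} TB "
      f"(= {data_nvme_per_server} x {nvme_capacity_tb} TB NVMe)")
print(f"implied average bandwidth per server: "
      f"{global_bandwidth_tb_s * 1000 / servers:.0f} GB/s")
```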
JUST Tape Libraries (Automated Cartridge Systems)
Tape libraries are the most cost-efficient technology in terms of TCO and capacity, but they come with the drawback of very high latency. Their purpose is to store cold data that is read very rarely or perhaps never.
We use them in our storage hierarchy for three central services:
- Backup and restore of data
- Long-term archival of data
- Migration of active (online) data to less expensive storage media
User data in the GPFS archive file systems of the HPC systems is automatically migrated to tape by the HSM (Hierarchical Storage Manager) component of ISP (IBM Spectrum Protect). The selection criteria for migration are the age and the size of a file. The data is recalled automatically and transparently for the user when it is accessed.
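As an illustration of what an age/size based selection means in practice, the sketch below lists files such a policy might consider for migration; the thresholds are purely illustrative (the actual policy lives in the ISP-HSM configuration), and the commented read at the end only demonstrates that accessing a migrated file is enough to trigger a transparent recall:

```python
import os
import time

# Illustrative thresholds only; the real migration policy is defined in the
# ISP-HSM configuration, not in user code.
MIN_AGE_DAYS = 90
MIN_SIZE_BYTES = 100 * 1024 * 1024  # 100 MiB

def migration_candidates(root: str):
    """Yield files that an age/size based policy could consider for tape migration."""
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be stat'ed
            age_days = (now - st.st_atime) / 86400
            if age_days > MIN_AGE_DAYS and st.st_size > MIN_SIZE_BYTES:
                yield path

archive_root = os.environ.get("ARCHIVE", ".")
for candidate in migration_candidates(archive_root):
    print("policy could migrate:", candidate)

# Reading a file that has already been migrated is sufficient: ISP-HSM recalls
# the data from tape transparently before the read returns, e.g.
#   with open(migrated_file, "rb") as f:
#       data = f.read()
```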
Project-related data in the dCache system is automatically migrated to tape via the ISP-HSM API interface of the pool servers. The data is recalled automatically and transparently for the user when it is accessed.
Users of the HPC systems, workstations, and PCs have access to their data (backup, archive, migration) in the libraries 24 hours a day, 7 days a week.
Total Numbers

| Metric | Value |
|---|---|
| Actual capacity | ~ 347 PB |
| Tapes | ~ 37,400 |
| Tape drives | 110 |
| Libraries | 4 (at 2 different locations at JSC) |
Hardware Characteristics of the Tape Libraries Complex
- 1 STK Streamline SL8500
- Actual capacity: ~ 50 PB
5800 cartridges T10000T2, each 8 - 8.5 TB with T10000D
Tape Slots: 6600
- Tape drives: 20
20 x T10000D
- Transfer rate:
T10000D: up to 240 MB/sec
- 1 STK Streamline SL8500
- Actual capacity: ~ 85 PB
6800 cartridges T10000T2, each 8 - 8.5 TB with T10000D
2000 cartridges LTO7M8, each 9 TB (LTO8)
800 cartridges LTO8, each 12 TB
Tape Slots: 10000
- Tape drives: 38
20 x T10000D
18 x LTO8
- Transfer rate:
T10000D: up to 240 MB/sec
LTO8: up to 300 MB/sec
- 1 TS 4500
- Actual capacity: ~ 174 PB
19372 cartridges LTO7M8, each 9 TB with LTO8
Tape Slots: 21386
- Tape drives: 20
20 x LTO8
- Transfer rate:
LTO8: up to 300 MB/sec
- 1 TS 4500
- Actual capacity: ~ 38 PB
1670 cartridges LTO8, each 12 TB
1000 cartridges LTO9, each 18 TB
Tape Slots: 15,844 (licensed)
- Tape drives: 32
32 x LTO9
- Transfer rate:
LTO9: up to 400 MB/sec
JUST History and Roadmap (for disk based part)

In 2007 JUST started with classical storage building blocks consisting of IBM Power5 servers running AIX and storage controllers with FC and SATA disks (IBM DS4800, DS4700, and DCS9550), providing 1 PB of gross capacity with a total bandwidth of 6-7 GB/s.
The next milestones came in 2009: in March the servers were replaced by Power6 systems, followed in December by a migration to a new generation of storage controllers and disks (IBM DS5300). The capacity grew to 5 PB gross and the bandwidth to about 33 GB/s.
In 2012 additional IBM x-Series servers running Linux and IBM DS3512 and DCS3700 storage controllers with SAS and NL-SAS disks were installed, and all data apart from the fast scratch file system was migrated to the new technology. The freed Power6 servers and storage were added to the scratch file system, pushing the bandwidth to 66 GB/s and increasing the overall capacity to 10 PB.
In January 2013 the installation and testing of about 9 PB gross of GSS-24 systems running the pre-GA GSS 1.0 version (with the new GPFS Native RAID feature) started. In mid-September 2013 a new, generally available fast scratch file system was introduced. At the same time a new special file system dedicated to selected large projects with big data demands was made available. The overall JUST storage capacity was 13 PB and a bandwidth of 160 GB/s could be achieved.
In June 2014 an additional 2.8 PB (gross) of GSS storage was installed and used for the migration of the classical $HOME file systems to GNR-based file systems. The JUST storage capacity grew to about 16 PB (gross).
In December 2014 it was decided to transfer the remaining classical storage components to GSS-24 systems by reusing the storage infrastructure combined with new x-Series servers. This was done step by step and finished in March 2015. In the end the freed storage was added to the fast scratch and big data file systems, increasing the bandwidth to about 200 GB/s. At that time JUST consisted of 31 GPFS Storage Server (GSS) systems with a capacity of 16 PB gross.
In June 2015 a global I/O reconfiguration took place to support the new HPC system JURECA. In all storage servers the 2 x 30 Gbit Ethernet channels were split into 3 x 20 Gbit Ethernet channels distributed over three I/O switches, which also required recabling. In mid-2015 an additional 4 PB (gross) was installed in the form of two capacity-optimized GSS-26 storage servers. They were partially used for the migration of the HPC archive file systems. The storage freed thereby was added to the fast scratch and big data file systems, which increased their capacity by 25% and the I/O bandwidth to 220 GB/s. The overall capacity was 20 PB gross.
In April 2018 the 5th generation of JUST entered production. The old GSS hardware was replaced by new Lenovo Distributed Storage Solution (DSS) systems. The software setup is the same as in JUST4: the parallel file system is based on Spectrum Scale (GPFS) in combination with IBM's GPFS Native RAID (GNR) technology. This new installation provides 75 PB of gross capacity.
Two months later the storage cluster JUST-DATA entered production, providing a large disk-based capacity (40 PB gross) at a moderate bandwidth of 20 GB/s. To match the growing data requirements, 12-28 PB will be added each year. In January 2019 we installed an additional 12 PB, followed by another 12 PB in September 2019.
In Q3/2020 JUST-HPST entered production, initially limited to selected projects for testing. At the same time JUST-DATA was extended by 14 PB.
The initial object store service started in Q4/2021 and is available on JUDAC.
In October 2021 the last phase of JUST-DATA was installed, adding a further 14 PB of capacity.