It’s spring 2026 – what’s missing at NHR4CES@RWTH? The CLAIX-2025 HPC system isn’t yet in production! As part of the proven 1-cluster concept, CLAIX-2025 is intended to complement the existing CLAIX-2023 system at the Tier-2 level. We spoke with our HPC experts at the IT Center RWTH Aachen to find out why the commissioning has been delayed and what the current status is. Sascha Bücken, Group Leader “Server, Storage, HPC”; Tim Cramer, Group Leader “HPC CORE”; and Christian Terboven, Chief HPC Officer, answer the question on the minds of researchers at RWTH Aachen and across Germany:
When will CLAIX-2025 go live?
Christian Terboven: Very briefly: We currently plan to have significant parts of the system up and running by the end of April 2026.
And now to the details behind the delay. When was CLAIX-2025 originally scheduled to go live?
Sascha Bücken: CLAIX-2025 was originally scheduled to go live this year. CLAIX-2025 consists of two segments: the classic HPC segment, which was scheduled to launch in March 2026, and the machine learning (ML) segment, which is set to be accepted in June 2026. If we look at this original, contractually agreed schedule, we’re actually right on track! However, in the early phase of the CLAIX-2025 build, we thought that we could commission the system earlier, before the end of 2025. We communicated this to our users and, up until the fourth quarter of 2025, we still assumed that this ambitious goal could be achieved.
Why didn’t this accelerated schedule work out?
Christian Terboven: CLAIX-2025 is one of the world’s first HPC systems to be delivered with the Cornelis CN5000 HPC interconnect. While this is the successor technology to Omni-Path, it has been significantly revised. As a result, our environment also had to be adapted in many areas and, in some cases, developed from scratch. The network’s stability also had to be established during the weeks of test operation before we could begin running the benchmarks and user applications required for acceptance.
Tim Cramer: In addition, there were unexpected delivery delays for the hot water cooling system beforehand, specifically with the import of components into the EU. The system is operated with hot water cooling, which requires a so-called CDU (Cooling Distribution Unit). This CDU could not be delivered until March 2026.
Where are we currently in the commissioning process?
Sascha Bücken: Currently, the promised performance values are being verified through our own benchmarks. For the classic HPC nodes, these values have now been achieved and confirmed. For the combination of the new CN5000 network and the GPU components of the ML segment, individual tests still pose a challenge: the required driver and software components must be optimally coordinated to achieve the promised performance in this area as well.
What steps remain before commissioning?
Christian Terboven: Once the performance values in the machine learning domain have been achieved, the measurements must still be overlaid with the energy values recorded during the tests. Based on this, the total cost of ownership and thus the performance relative to energy consumption will be evaluated. Additionally, the large CDU must be commissioned to replace the temporary CDUs.
And that should be the case by the end of April?
Sascha Bücken: The first “acceptance with defects” is planned shortly after all benchmark values have been confirmed. This will allow us to grant users access to essential parts of the system, thereby enabling partial production operation. With the replacement of the CDU, the system will reach its full performance capacity, as only then can all components be sufficiently cooled. This step is currently expected around mid-to-late April 2026.
What does this currently mean for users?
Tim Cramer: Unfortunately, this leads to longer wait times—in both the HPC and ML segments—than our users are accustomed to. Since we’ve seen that our users need more computing capacity, we’ve tried to launch CLAIX-2025 as quickly as possible, and all suppliers have cooperated to the best of their ability.
CLAIX-2023 is currently overbooked. We’re working on this: for example, through changes to Slurm, we’ve been working to ensure greater fairness in wait times. Of course, this doesn’t solve the underlying lack of resources. We can only ask for your patience at this time! Once CLAIX-2025 goes live, the Tier-2 HPC segment will triple in size, and we will be able to offer users the computing time they need!
What can users expect?
Christian Terboven: CLAIX-2025 will provide significant additional CPU computing power to strengthen the field of traditional HPC simulations. Furthermore, a targeted expansion in the area of machine learning represents an important step forward. CLAIX-2025 was designed not only as a response to rising demand but also as a strategic investment in the data-driven scientific work of tomorrow.
We will keep you updated on further developments regarding the commissioning of CLAIX-2025. If you have any questions or issues, the IT ServiceDesk team is here to help.