“LUMI is officially here!” proclaimed the headline of a blog post written by Pekka Manninen, director of science and technology for CSC, Finland’s state-owned IT center. The EuroHPC-organized supercomputer’s most powerful partition – which ranked third on the most recent Top500 list with 309.1 Linpack petaflops – first appeared in public rankings last June. Now, Manninen says, the system is “officially ready to serve European scientists in its full capacity!”
The HPE-built LUMI system (its datacenter pictured in the header courtesy of CSC) has six main partitions. Three are compute partitions: the GPU-powered LUMI-G is by far the most powerful, with 2,560 nodes running on the now-famous combination of AMD Epyc “Milan” CPUs and AMD Instinct MI250X GPUs; the CPU-powered LUMI-C has 1,536 Milan-based nodes, including a handful of large-memory nodes; and the data analytics-focused LUMI-D has just 16 nodes with Milan CPUs and between 2TB and 4TB of memory per node. The remaining three are storage partitions: LUMI-P (main storage), LUMI-F (flash storage) and LUMI-O (object storage).
LUMI-C hit the scene in November 2021 and has stayed constant in size across subsequent Top500 lists. LUMI-F and LUMI-P, Manninen said, have been in production since the start of 2022. LUMI-G – understandably – has taken more time to install and become fully operational; CSC completed the pilot phase for LUMI-G just a couple of weeks ago. LUMI-O, the object storage partition, is entering its pilot phase now, but remains available for general use during that phase (with possible interruptions); LUMI-D, the data analytics partition, is operational but, for now, has limited availability. During testing, the system was successfully benchmarked on key application software such as Gromacs (molecular dynamics) and ICON (climate science), alongside more standard benchmarks like MLPerf.
In short: LUMI is up and running. However, as Manninen explained, getting there was not without its hiccups.
“Setting up such a huge system as LUMI has not been smooth sailing all the time,” he wrote. “We have suffered from many ‘black swans,’ such as the global shortage of microelectronics, and the global Covid-19 pandemic was something that none of us could predict when the project started in 2019.” What’s more, he added, LUMI featured all-new components: “Cutting edge oftentimes turns into bleeding edge, and LUMI was not an exemption.”
While the system is now accepted and has entered general availability, there remains – as one would expect – plenty of work to do.
“[It] will take more time and effort before we can allow sensitive data to be processed on the system,” Manninen wrote. “In addition, the container cloud platform envisioned to support persistent services such as data mover utilities, web interfaces to LUMI-O datasets, job submission portals etc., will not be available for a while. There will be workarounds available for covering most of its use cases.”
The system itself will also expand beyond its current state. Manninen says that both LUMI-C and LUMI-G will “grow by [a] non-negligible amount of further capacity … this spring”; LUMI-F will grow by two petabytes, and more bandwidth will be added to the interconnects between cabinets. LUMI will also be “a part” of the recently announced LUMI-Q quantum computer, slated to be sited in Czechia in the coming years.
To learn more, read the CSC blog post.