EuroHPC Summit: Tackling Exascale, Energy, Industry & Sovereignty

By Oliver Peckham

March 24, 2023

As the 2023 EuroHPC Summit opened in Gothenburg on Monday, Herbert Zeisel – chair of EuroHPC’s Governing Board – commented that the undertaking had “left its teenage years behind.” Indeed, a sense of general maturation has settled over the European Union’s supercomputing play, which has spent the four and a half years since its debut in an impressive HPC growth spurt. Now, with six supercomputers under its belt (including two in the Top500’s top five) and more on the way, EuroHPC is facing challenges befitting a supercomputing leader as it prepares to enter the exascale era.

A brief recap

The EuroHPC Joint Undertaking (JU) took over the organization of the European Union’s supercomputing efforts from the Horizon 2020 program in September 2018. In June 2019, the new JU – backed by 28 countries – announced sites for its first eight supercomputers. Detailed over the subsequent months, this initial list included five petascale systems and three larger “pre-exascale” systems. Now, less than four years later, four of the five petascale systems and two of the three pre-exascale systems are online. Those pre-exascale systems, of course, include Finland’s LUMI supercomputer (third on the Top500) and the newest of EuroHPC’s systems, Italy’s Leonardo supercomputer (fourth on the Top500). Still in the works: Portugal’s petascale Deucalion system and Spain’s pre-exascale MareNostrum 5 system – more on those later.

Last June, the JU – which is now backed by 33 countries, with more expected to join – announced plans and sites for five new systems: four “mid-range” systems (petascale or pre-exascale) and its first exascale system, JUPITER, which will be hosted by Germany’s Jülich Supercomputing Centre (FZJ). The JU also, in October 2022, selected six host sites for its first quantum computers.

Where do the systems stand?

The EuroHPC Summit didn’t offer tremendous updates or shock announcements. In general, it was a thoughtful exploration of how the JU could move itself forward in an environmentally, economically and politically sustainable manner. With that said, there were quiet roadmap updates for a variety of the JU’s undertakings.

Leonardo: While Leonardo did debut on the last Top500 list, the system hadn’t, at 174.7 Linpack petaflops, quite reached its full power yet. At the summit, it was confirmed that EuroHPC expects the system to hit 240 Linpack petaflops for the next Top500 list.

Deucalion: The lone straggler of EuroHPC’s first five petascale systems, Deucalion had been slated for delivery in late 2021. The system’s schedule has slipped a few times, but it is now being “finalized,” with acceptance estimated for “early autumn” of this year. The delays are, perhaps, not too surprising given the system’s oddball architecture: the Fujitsu-manufactured supercomputer combines AMD CPUs, Fujitsu’s Arm CPUs and Nvidia GPUs.

Tapes are installed for MareNostrum 5. Image courtesy of BSC.

MareNostrum 5: The long-beleaguered third pre-exascale system is finally being installed in the new Barcelona Supercomputing Center (BSC) headquarters. Anders Jensen (pictured in the header), executive director of EuroHPC, confirmed that the system will be inaugurated in 2023 and that it will have a peak performance in excess of 300 petaflops, which lines up with the prior estimate we heard (314 peak petaflops, 205 Linpack petaflops).

Evangelos Floros, head of infrastructure for EuroHPC, elaborated that MareNostrum 5’s network, storage and management nodes have been installed and that BSC is targeting acceptance of the general-purpose partition in June of this year. That partition will be powered by Intel’s Sapphire Rapids CPUs and, Floros said, will constitute one of the largest CPU-only partitions in the world at 90 racks, 6,480 CPUs and 36 Linpack petaflops. The much larger main accelerated partition (Nvidia Hopper GPUs and Sapphire Rapids CPUs, 163 Linpack petaflops) is expected – like Deucalion – to debut in “early autumn” of this year. No word on the two partitions that account for the six straggler petaflops, one of which is up in the air after Intel axed its Rialto Bridge GPU plans earlier this month – the other, based on Nvidia’s Grace Superchips, is probably a safer bet.
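The partition figures quoted above do roughly add up, which is where the “six straggler petaflops” comes from. A quick back-of-the-envelope check (the breakdown below is our arithmetic against the reported numbers, not an official accounting):

```python
# Reported Linpack contributions for MareNostrum 5, in petaflops.
partitions = {
    "general-purpose (Sapphire Rapids)": 36,
    "main accelerated (Hopper + Sapphire Rapids)": 163,
}
total_linpack_estimate = 205  # prior estimate cited above

# Whatever the two named partitions don't cover falls to the two
# smaller, still-unconfirmed partitions.
straggler_pf = total_linpack_estimate - sum(partitions.values())
print(straggler_pf)  # -> 6
```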

JUPITER: EuroHPC’s first exascale supercomputer was targeted for installation in 2023 when it was first announced; much like the U.S. exascale timelines, EuroHPC’s exascale timeline has slipped. Floros explained that the candidate vendors had just been narrowed down and that EuroHPC is aiming to sign the contract for JUPITER by Q4 of this year, with installation beginning in Q1 2024. The JU wants to have at least one “big partition” ready for acceptance by Q4 2024.

A couple other soft details emerged on JUPITER. EuroHPC is aiming for sustained 1 exaflops performance in its primary GPU-accelerated partition; elsewhere, FZJ director Thomas Lippert cited a general target of 1.3 peak exaflops for the system – 20× the peak of its predecessor, JUWELS. JUPITER will be hosted in a containerized datacenter – Floros said that “no concrete building” will be built for hosting JUPITER. The system will be accompanied by over an exabyte of storage. Floros also said that it’s still possible that European technologies, like Europe-built CPUs, could be “part of a potential solution” for “one of the modules” of JUPITER. Thomas Skordas, deputy director-general for communications networks, content and technology for the European Commission, said that the integration of European processors in JUPITER would likely take place next year.
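Taking Lippert’s figures at face value, the “20×” claim implies a JUWELS peak of roughly 65 petaflops, which is in the ballpark of that system’s published numbers. A rough sanity check, nothing more:

```python
jupiter_peak_exaflops = 1.3   # Lippert's cited target for JUPITER
scaling_factor = 20           # "20x the peak of its predecessor, JUWELS"

# Implied predecessor peak, converted from exaflops to petaflops.
juwels_implied_peak_pf = jupiter_peak_exaflops * 1000 / scaling_factor
print(juwels_implied_peak_pf)  # -> 65.0
```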

JUPITER’s structure and possibilities. Image courtesy of FZJ.

The four new mid-range systems: This is a murky one. Four mid-range systems were announced alongside JUPITER: Daedalus (Greece), Levente (Hungary), CASPIr (Ireland) and EHPCPL (Poland). However, of those four, Jensen only discussed Daedalus and made reference to “four [systems] on the way” – presumably, those four are Deucalion, MareNostrum 5, JUPITER and Daedalus. Elsewhere, Floros made a reference to two mid-range systems on the way in the 2023-2025 window, each with >20-30 petaflops of performance: one in Greece (Daedalus) and one at CYFRONET (EHPCPL in Poland). As far as we saw, the Hungarian and Irish systems were not discussed, though they were included on a map shared in one of the presentations.

Vis-à-vis Daedalus: the hosting agreement has been signed and the JU is looking to move forward with procurement. To our knowledge, the performance target (>20-30 petaflops) is new information, as well.

The second exascale system and additional mid-range systems: EuroHPC has long planned an initial set of two exascale machines, with the ambition of powering a substantial portion of the second using homegrown technologies. Those plans and ambitions remain, with Floros saying that the aim is to have the second exascale system operational by 2025 and to make that system “complementary” to the first system and more reliant on European technology. Daniel Opalka, head of research and innovation for EuroHPC, said that they hope to include general-purpose processors from the European Processor Initiative (EPI) in that system.

Jensen said that disclosures on that system and additional mid-range systems could be expected in the coming months: “[We] hope to be able to announce both mid-ranges and the second European exascale system in the not-too-distant future.”

The post-exascale era: Even ahead of its first exascale system, EuroHPC is targeting a post-exascale era beginning in 2026. Details on the JU’s vision for this era were scant, save for repeated references to ever-increasing amounts of homegrown technology powering post-exascale EuroHPC systems, with “sovereign EU HPC” around 2029. Sergi Girona, operations director for BSC, at one point discussed MareNostrum 6 as a post-exascale supercomputer targeted for installation and production in the 2029-2030 range, citing the need to have a long-term achievable vision. Perhaps the most telling quote on the post-exascale era, though, came from Lippert: “Please: let’s get exascale right.”

An estimated timeline of EuroHPC’s ongoing efforts and plans. Image courtesy of Evangelos Floros.

Priorities and challenges

The pushes toward exascale capability and sovereignty should be clear from the above, but the summit also saw a variety of other issues raised. Zeisel outlined a series of challenges as the JU moves toward updating its strategic plan: faster development (“We need more speed”), user representation, considering the communities served by PRACE, establishing an experimental system and more. Two themes in particular pervaded the talks: industry and energy.

Industry: Skordas discussed plans to expand industrial access to supercomputers through EuroHPC, including a call for expression of interest in hosting and operating an industrial-grade supercomputer. That computer, he said, will be specifically designed for industrial requirements like secure access, protected data and increased usability. “Here, there is an increasing interest within the private sector to have dedicated supercomputing infrastructure for industry needs,” Skordas said, “including specialized capacities for large AI models.”

The summit included a plenary on the “appetite for industrial-grade supercomputers in the EU.” During the talks and discussions, the participants made clear that the appetite for such machines was real, but that their success depended significantly on price and security. (NCSA’s Brendan McGinty was on hand to discuss the U.S. perspective, referencing the data security-targeted Nightingale system that we just covered.)

Energy: Two of Zeisel’s priorities focused on energy and climate: first, reducing EuroHPC’s energy costs; second, reducing its carbon emissions. Throughout the summit, panelists and speakers were extremely vocal about the need to manage energy use by forthcoming systems, citing the staggering energy budgets for systems like Frontier and Fugaku and casting dire projections of future HPC energy use based on current trends.

The LUMI datacenter. Image courtesy of CSC.

Consensus, it seemed, was building around the importance of location. “Technical solutions, such as processors … are important,” Zeisel said. “But as the LUMI example shows to us, comprehensive system-level solutions are essential and may be the key for that.” LUMI, sited in northern Finland, is colocated with fully renewable hydropower and warms nearby houses with its waste heat. Due to its far-north location, it also requires less cooling.

Pekka Lehtovuori, director of operations for CSC – LUMI’s host – naturally agreed. “The most important [choice] is to choose to use green energy,” Lehtovuori said. “Everything after that is a plus, but can’t compensate [for] the effect of the original choices.” In his presentation, Lehtovuori argued that location mattered “many orders of magnitude more than system- and operational-level optimization.”

Tor Björn Minde, a director at the Research Institutes of Sweden (RISE), concurred – “Location is very important” – and illustrated how the same 10MW datacenter in different parts of Europe produced extraordinarily different carbon emissions. “The Finns did the right thing,” Minde said.
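Minde’s point follows directly from the arithmetic: a datacenter’s operational emissions scale linearly with the carbon intensity of the grid that feeds it. A minimal sketch of that comparison for a 10MW facility running around the clock (the grid-intensity values below are illustrative assumptions, not figures from his talk):

```python
POWER_MW = 10
HOURS_PER_YEAR = 8760

def annual_co2_tonnes(grid_intensity_g_per_kwh: float) -> float:
    """Annual CO2 (tonnes) for a datacenter drawing POWER_MW continuously."""
    energy_kwh = POWER_MW * 1000 * HOURS_PER_YEAR  # MW -> kW, then kWh/year
    return energy_kwh * grid_intensity_g_per_kwh / 1e6  # grams -> tonnes

# Illustrative grid carbon intensities in gCO2/kWh (assumed values).
for grid, intensity in {"hydro-heavy Nordic grid": 20,
                        "coal-heavy grid": 700}.items():
    print(f"{grid}: {annual_co2_tonnes(intensity):,.0f} t CO2/yr")
```

Under these assumed intensities, the identical machine emits roughly 1,750 versus 61,000 tonnes of CO2 per year depending purely on where it is plugged in.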

Of course, other strategies were proposed since, as Lehtovuori noted, the “political choices” behind supercomputer siting still produce large systems that cannot sit idle for long periods and must be operated where they are installed. To that end, various speakers discussed code optimization, load shaping in response to grid prices, incentivizing users by prioritizing energy-to-solution and utilizing in-memory computing to minimize data movement. Interestingly, Floros noted that the JUPITER procurement includes options for vendors to leverage energy efficiency to move expected operational savings into the acquisition costs.

EuroHPC as a supercomputing leader

Despite the lack of major announcements, the EuroHPC Summit had an air of triumph about it. It was a well-earned victory lap: apart from the impressive system launches over the past few years, Jensen noted that the JU had doubled its staff last year and, to date, had awarded over 1.5 billion core-hours on its systems. For the first time, calls to establish Europe as a “world power in HPC” sounded a little strange – after all, hasn’t it become one?
