Return to normalcy is too strong, but the latest portrait of the HPC market presented by Hyperion Research yesterday is a positive one. Total 2022 HPC revenue (on-premises and cloud) will likely wind up around $38.5 billion, roughly 10% over 2021's $34.8 billion, and 2023 is forecast to land in the $42.7 billion range. Post-pandemic HPC is bouncing back, if perhaps a bit unevenly.
Lingering supply chain issues, both pandemic-related and geopolitical, continue to cloud the outlook. There are also worries of a broader economic slowdown. Stir all of that together and these remain difficult days to be in the forecasting business. On the plus side, demand for AI technology (chips, systems, and software) remains strong. Use of cloud-based HPC is surging. Sales at the top end of the market remain strong. And storage, a perennially robust HPC segment, is getting an added boost from the data demands of AI and IoT.
The three charts below are a good rough summary of Hyperion’s combined review-and-look-ahead for HPC.
In his introductory overview, Hyperion CEO Earl Joseph said, “The first half of 2022 is looking a little soft. Last year (2021) grew fairly well, at 9%, but right now supply issues are holding back the first half of 2022. We still expect the second half to come on much stronger. The upper end of the market, the supercomputers, continues to grow a bit better [than the rest of the server market]; it grew 5% in the first half of the year. There are [also] a number of high-growth areas – AI, machine learning, and deep learning are growing at close to 30% a year; using clouds to run HPC workloads is growing between 17 and 19% a year; and GPUs and storage have very high growth rates too.”
It’s fair to say there’s still uncertainty hanging in the air. Joseph told HPCwire that Hyperion had been predicting 10-11% growth for on-premises servers in 2022 (over 2021) but recently moderated that to around 7% for the year. He also singled out what he called “two big constraints” on the market: the need to modernize software and the lack of technical experts in the industry. The latter is part of HPC’s ongoing workforce challenges.
The update was presented virtually as a series of pre-recorded presentations on various market segments (overview, exascale, storage, AI, HPC in the cloud, and more). Presented this way, it should actually be easier to zero in on the segments of most interest. Hyperion has posted links on its website to videos of the segment presentations and to the associated slide decks.
SERVERS – BIG, MEDIUM, AND SMALL
The big-system HPC boom – spurred by the global chase for exascale – continues. In recent years, Hyperion has subdivided the supercomputer segment into chunks (slide below). Exascale/leadership-class systems, the top of the supercomputer group, are projected to show a 16.5% CAGR over the 2020-2026 period but to flatten in 2025. HPC servers overall are expected to grow 6.9%.
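For reference, CAGR here is the standard compound annual growth rate. The minimal Python sketch below shows how the projected rates compound over the 2020-2026 window; the 2020 base figure is purely illustrative, not a Hyperion number.

```python
# Illustrative CAGR compounding using the projected rates cited above.
# NOTE: the 2020 base revenue is a made-up placeholder, not a Hyperion figure.

def project(base: float, cagr: float, years: int) -> float:
    """Compound a base value forward at a constant annual growth rate."""
    return base * (1 + cagr) ** years

base_2020 = 10.0  # hypothetical segment revenue in 2020, $B
for label, rate in (("exascale/leadership class", 0.165), ("all HPC servers", 0.069)):
    value_2026 = project(base_2020, rate, 6)
    print(f"{label}: ${base_2020:.1f}B in 2020 -> ${value_2026:.1f}B in 2026 at {rate:.1%} CAGR")
```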
“We are expecting a slight downturn in the exascale systems around 2025. This is not due to a reduction in the number of exascale machines, it’s due to the price tag coming down,” said Joseph. “Fugaku came in at about a billion dollars, which was pre-exascale. The first three U.S. exascale systems [are] around $600 million, [and] we expect that to migrate down to more like $350 million in the future.”
Workgroup server sales continue to be modest. Joseph noted, “The workgroup sector gets hit the most and takes the longest to recover. And we’re also in a situation where a number of workgroup buyers are looking to the cloud and they found the cloud offers some very substantial opportunities for them.” Divisional HPC servers represent the strongest performer with a projected 9.4% CAGR.
HPE (34.2% market share) and Dell (21.8%) continue to dominate the HPC server market, although Joseph cited strength among Chinese providers and Penguin (now part of SGH). Hyperion also took a moment during its update to review how it classifies HPC servers (slide below) into broad data-intensive and compute-intensive classes with sub-categories. Analyst Thomas Sorensen, in his global AI segment talk, noted a server “is considered HPC-enabled AI when the machine dedicates 50% or more of its cycles to AI applications.”
EXASCALE’S GLOBAL PICTURE
No HPC update would be complete this year without a section on the global exascale race. The U.S. stood up its first exascale system, Frontier, at Oak Ridge National Laboratory earlier this year (see HPCwire coverage, Exclusive Inside Look at First US Exascale Supercomputer). Frontier is an HPE/AMD system that delivers 1.102 Linpack exaflops of computing power in a 21.1-megawatt power envelope, an efficiency of 52.23 gigaflops per watt. It currently sits atop the Top500 list.
“It has a peak performance of about 1.68 exaflops and only used – I can’t believe I used the phrase only – 21 megawatts to run Linpack, which is actually a relatively impressive power performance metric,” said Bob Sorensen, Hyperion senior vice president of research, who reviewed global exascale activities.
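Those numbers are internally consistent: dividing the Linpack result by the reported power draw reproduces the efficiency figure cited above. A quick sanity check in Python, using only the values already quoted:

```python
# Sanity check of Frontier's reported power efficiency (figures cited above).
linpack_exaflops = 1.102   # sustained Linpack performance
power_megawatts = 21.1     # power envelope during the run

gigaflops = linpack_exaflops * 1e9   # 1 exaflops = 1e9 gigaflops
watts = power_megawatts * 1e6        # 1 megawatt = 1e6 watts

print(f"{gigaflops / watts:.2f} gigaflops per watt")  # prints ~52.23
```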
Next up for the U.S. exascale program, run by the Department of Energy, is Aurora, at Argonne National Laboratory. “It’s a 60-megawatt system and is a very aggressive design. We’re all anxiously waiting to see how that turns out.” Aurora is based on Intel CPUs and GPUs and has weathered much-discussed delays; it is now scheduled to be stood up in early 2023. It’s expected to hit 2 exaflops, and Intel is betting big on the outcome. The third planned U.S. exascale system is El Capitan, a DOE National Nuclear Security Administration machine based at Lawrence Livermore National Laboratory. It is scheduled to be deployed in 2023.
China’s exascale program has been veiled in secrecy in recent years, amidst worsening U.S.-China relations, including technology export restrictions. Sorensen tried to peel back some of the covers.
“Just to give you [a sense of] the official situation on what’s happening on Chinese exascale, this (slide below) is what was announced in 2018 in terms of Chinese activities in exascale. These were the prototype systems that were developed [and] kind of the goals that they had laid out. Sunway Pro OceanLight, Tianhe-3, and Sugon systems, all different architectures. That [was] in 2018. What’s happened since [are changes] that have in fact been driven more by political machinations than any kind of technological implications,” said Sorensen.
“The unofficial reality of what’s going on [in] China right now is that the Sunway Pro OceanLight system has been up and running since March 2021. The Tianhe-3 has been up and running since perhaps late last year. So, in theory, there may be two systems in China right now that have actually achieved exascale capability,” he said.
The bottom line, said Sorensen, “is that China has not made any official announcements. They have not made any entries [to the Top500 list] on the June 2021, November 2021, or June 2022 lists, and they may not do so this time around either. This is primarily, we believe, a political decision not to do anything that would further increase U.S.-China high-tech trade friction. What [U.S. export controls] have done is caused China to really drive their indigenous capability a little bit harder, and so no one wants to fan the flames there, I think, by bringing Chinese systems out on the Top500 list.” He said there is strong evidence that at least five existing Chinese advanced computer systems “could make” the top ten in the Top500.
Shifting the focus to Europe, Sorensen said, “The EU [is] going forward [and] there are definite plans for exascale systems in the 2021 to 2024 timeframe. There are two key points about Europe. One of the exascale systems is most likely going to have European technology, specifically a processor developed by the European Processor Initiative. Not only that, but there’ll be some additional procurements from the national programs, most likely in Germany, maybe in France as well.
“The other point I want to make is that Europe, perhaps much more than the United States, is very interested in running their HPC exascale environment in a way that dovetails nicely with their vision for quantum computing. It’s a much more integrated program. As you see here (slide below), basically their plan for the post-exascale environment is to think about hybrid classical-quantum systems as a way to achieve the potential high performance. I think that’s a really interesting aspect of what’s going on in the EU from their perspective.”
The staggering bigness of exascale machines can be mind-numbing. Sorensen presented an intriguing idea for the future.
“I’m interested in and looking at what’s happening next on post-exascale architectures. I talked about some of the large systems that are going into the U.S. and part of me believes that the trajectory of bigger, more expensive, more powerful, longer lead-time kinds of HPC that we’ve seen in the past may have reached an evolutionary endpoint,” he said.
“We here at Hyperion Research are looking at trends more towards smaller systems, much more workload-specific HPC architectures geared towards solving a specific class of workloads with their unique architectural requirements. So instead of buying one big machine that can serve perhaps a wide range of applications, you’ll be buying a series of smaller systems that are more efficient, that are better targeted, that are cheaper, and perhaps more energy efficient for the specific workloads at hand,” Sorensen said.
STORAGE SNAPSHOT
Storage continues to command a large share of the overall on-premises infrastructure budget and has the highest growth rate – an 8.6% CAGR through 2026. “Almost half of the sites surveyed [in Hyperion’s annual site survey] expect their storage budgets to increase in the next 12 months by more than 5%,” said Mark Nossokoff, Hyperion senior analyst. It seems as if the data deluge never slows.
Key demand drivers, said Nossokoff, include the increasing amount of data being created to feed the growing adoption of data-intensive AI and HPC workloads across both HPC and enterprise IoT datacenters. He also cited the development of larger models, more accurate AI training, and higher resolution for more precise analysis from traditional HPC modeling and simulation. These are not especially new demand drivers, with AI adoption in full swing across many IT segments.
Interestingly, HPC’s move to the cloud is also providing an even greater market growth opportunity. “Approximately one-third of user spending in the cloud is spent on storage,” said Nossokoff, “with two-thirds of that spent on persistent durable storage, and one-third on ephemeral storage.” Hyperion reports $1.7 billion spent on cloud storage for HPC in 2021.
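Taken at face value, those ratios are easy to turn into rough dollar figures. The back-of-envelope sketch below applies the quoted two-thirds/one-third split to the $1.7 billion figure and back-solves the implied total HPC cloud spend; the derived numbers are rough estimates, not Hyperion-published figures.

```python
# Back-of-envelope split of 2021 HPC cloud storage spending,
# applying the ratios quoted above. Derived values are rough estimates only.
cloud_storage_2021 = 1.7  # $B, reported HPC spending on cloud storage in 2021

persistent = cloud_storage_2021 * (2 / 3)      # persistent/durable storage
ephemeral = cloud_storage_2021 * (1 / 3)       # ephemeral storage
implied_total_cloud = cloud_storage_2021 * 3   # storage ~ one-third of total cloud spend

print(f"persistent ~${persistent:.2f}B, ephemeral ~${ephemeral:.2f}B")
print(f"implied total HPC cloud spend ~${implied_total_cloud:.1f}B")
```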
The big guns of on-premises storage are maintaining their positions, according to Hyperion data. Dell Technologies (preferred by 21% of sites surveyed) remains the most preferred storage vendor; IBM was second (14%). DDN, a consistent top performer in HPC, was the most preferred independent storage provider.
“There was also no discernible difference between largest systems and all systems relative to where sites procured their storage systems. Users showed a strong preference for acquiring their storage from their HPC system vendor at approximately two-and-a-half-to-three times that from independent storage vendors. This is also largely unchanged from last year,” said Nossokoff.
The Hyperion update covered far too much material to adequately capture in a short article. Among the many topics touched on were: networking technology and the ongoing tug-of-war between Ethernet and InfiniBand; a more granular look at HPC cloud use patterns; a snapshot of the nascent but suddenly boisterous quantum computing market; the HPC application market; insights from Hyperion’s latest multi-client study (MCS); emerging buyer preferences for CentOS alternatives; and the winners of this year’s HPC innovation awards.
Link to Hyperion presentations: https://hyperionresearch.com/hpc-market-update-webinar-pre-sc22/
BONUS SLIDES