ISC 2021 Keynote: Thomas Sterling on Urgent Computing, Big Machines, China Speculation

By John Russell

July 1, 2021

In a somewhat shortened version of his annual ISC keynote surveying the HPC landscape, Thomas Sterling lauded the community’s effort in bringing HPC to bear in the fight against the pandemic, welcomed the start of the exascale – if not yet exaflops – era with a quick tour of some big machines, speculated a little on what China may be planning, and paid tribute to new and ongoing efforts to bring fresh talent into HPC.

Sterling is a longtime HPC leader, professor at Indiana University, and one of the co-developers of the Beowulf cluster. Let’s jump in (with apologies for any garbling of quotes).

The pandemic affected everything.

“It has been a tragedy. There have been more than 200 million COVID cases worldwide, and almost 4 million deaths. And frankly, those numbers are probably conservative, and the actual numbers are much greater. We may never know. In the U.S., shockingly, more than half a million people, 600,000 people, have been killed by this virulent disease. And we’ve experienced over 34 million cases just in the U.S. alone, and our case rate per 1 million of the population is greater than 10 percent,” he said.

“One of the things that came out of this is an appreciation for what has been called urgent computing, the ability for high performance computing in general and the resources, both in terms of facility and talent, to be rapidly brought to bear to a problem, even a problem as challenging as that of COVID-19. Over the year across the international community, very quickly, HPC resources were freed up and made available to scientists. In addition, expert assistance and code development optimization were added to the scientific community to minimize the time of deployment of their code and their application to drug discovery, to exploration, and to analysis of new possible candidates of cures. In this sense, the high-performance computing community can be proud of the job [done] yet humbled by its own limitations in attacking this problem.”

Fugaku is an impressive machine

“Much of this slide I have used before. The core design is Arm done by Fujitsu and added to that is the use of significant vector extensions that have demonstrated, in their view, that a homogeneous machine can compete with accelerator machines, and that future designs will be more varied than a singular formula. Is the jury done and the verdict in [on this]? No. As rapid changes are taking place we’ll still see this constructive tension among those. But what we are finding is that the broader range of applications not just in high performance computing, per se, but in AI, in machine learning and in big data and analytics, all of these can be done on machines that are intended for extreme scale,” said Sterling.

“Now, I said extreme scale. Fugaku is not an exaflops Rmax machine, but it comes close. It’s in somewhere around 700*. I apologize to our friends, Satoshi Matsuoka who is standing there in front of his machine. But in the area of lower precision, for intelligence computing, it is indeed an exascale machine. So we are now in an era of exascale if not yet classic exaflops.”

The age of big machines

This era of exascale and exaflops is rapidly dawning around the globe and Sterling briefly reviewed several systems now or soon-to-be rolling out. Importantly, he emphasized, the blending of AI and HPC is happening fast and that fusion is greatly influencing HPC computer architecture.

About Frontier, which is expected to be the first U.S. exascale system stood up, he said:

“The Frontier machine has been announced as going to be the U.S.’s first exaflops and by exaflops, I mean an Rmax supercomputing somewhere around – we don’t have the measurements, of course – but the estimates are about one and a half exaflops Rmax. This will be operated in the Oak Ridge National Laboratory or the Oak Ridge Leadership Computing Facility in Tennessee, where the current Summit machine is, and this will be deployed towards the end of this year or the very beginning of the next year. It is being integrated by a Cray division of Hewlett Packard Enterprise and incorporates AMD chips, providing substantial performance and energy efficiency, although it’s predicted that the power consumption will be on the order of 30 megawatts but in a footprint [that’s] somewhat modest of just over 100 racks. The cost is $600 million. That’s a lot of money. [I’m] looking forward to this machine being operated and the science and the data analytics that can be performed with it.”

Sterling gave a brief tour of several of the forthcoming large systems, most of whose names are familiar to the HPC community. Despite being largely accelerator-based architectures, there is diversity among the approaches. He singled out the UK Met Office-Microsoft project to build the Met Office’s next system for weather forecasting in the cloud. That’s a first. He looked at the EuroHPC Joint Undertaking’s LUMI project, which will be a roughly half-exaflops system.

“[The system] will be in Finland but there are 10 different countries that are involved in the consortium that together will share this machine. You have the list (on the slide below) of such countries starting with Finland and going down to Switzerland. There are multiple partitions for different purposes. So, I think that this is a slightly different way of organizing machines, where distinct countries will be managing different partitions and have different responsibilities,” said Sterling.

About the UK Met-Microsoft project, he noted, “They’re saying that this will be the world’s largest [web-based] climate modeling supercomputer, and this will be deployed a year from now, that is, in the summer of 2022. Its floating-point performance will be 60 petaflops distributed among an organization of four quadrants, each 15 petaflops. There’ll be one and a half million CPUs of the AMD Epyc type, and eventually, I don’t know the year, there will be a midlife kicker, giving it a performance increase by a factor of three. So this will have a long life, indeed a life of about 10 years. What I find extraordinary is that this is a commitment of about one and a half billion dollars over a 10-year period. This is very serious, very significant dedication to a single domain of application.”

Here are a few of his slides on the coming systems.



China is the Dragon in the room

“Okay, so I talked about big machines. And there’s obviously one really big hole, and, you know, maybe what we should say is that’s the big dragon in the room. It’s China, of course. China has deployed over the last decade more than one Top500 machine. And over their evolution of machines they’ve taken a strong, organized and frankly, I’d call it a disciplined approach. In fact, it’s been a three-pronged strategy that they have moved forward. These include the National University of Defense Technology, the National Research Center of Parallel Computer Engineering and Technology (NRCPC), and third, Sugon, which for those old gray beards, such as myself, we remember as Dawning,” said Sterling.

“All three of these different organizations are pursuing and following different approaches and I don’t know who’s in the lead or when their next big machine will hit the floor, but recently there have been some hints that have been exposed for one of them. And this is the NRCPC Sunway custom architecture. Now, you’ll remember the Sunway TaihuLight. Well, I didn’t know this, but in fact, their plan all along with TaihuLight was designed to be scalable, truly scalable. It was delivering something over 100 petaflops when it was deployed and led the list of HPC systems there and their intent is to bring that up to exascale. Now I use the term exascale as opposed to exaflops for the same reasons I did before. Their peak performance will be floating point. Four exaflops for single precision, and one exaflops for double precision. That’s peak performance. It’s anticipated that their Linpack Rmax will be around 700 petaflops.

“You know, the Sunway architecture is really interesting, because of its use of an enormous number of very lightweight processing elements organized in conjunction with main processing elements to handle sequential execution. The expectation is that, as opposed to 28 nanometers for TaihuLight, this will be 14 nanometers, as SMIC, the semiconductor fabrication company, will provide this at just under one and a half gigahertz, which is about the same clock rate as TaihuLight. Why? Well, of course, to try to keep the power down. In doing this, they will have eight core groups**, as opposed to the four core groups you see in the lower black and white schematic (slide below), they will double the size of the words or multi-word lines from 256 bits to 512 bits. And they will increase the total size of the machine from somewhere around 40,000 nodes to 80,000 nodes. I don’t know when. But we can certainly wish our friends in China the best of luck as they push the edge of the envelope,” he said.
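The figures Sterling quoted lend themselves to a quick back-of-envelope check. The sketch below uses only numbers that appear in this article (the Fugaku Top500 values from the notes and the projected Sunway values from the keynote). The function names are illustrative, and the scaling estimate rests on a loud assumption that is not from the keynote: that peak performance grows linearly with core groups, vector width, and node count at a fixed clock rate.

```python
# Back-of-envelope arithmetic using only figures quoted in this article.
# The Sunway numbers are projections relayed in the keynote, not measurements.

def linpack_efficiency(rmax_pf: float, rpeak_pf: float) -> float:
    """Fraction of peak performance delivered on Linpack (Rmax / Rpeak)."""
    return rmax_pf / rpeak_pf


def peak_scaling_factor(core_groups, vector_bits, nodes) -> float:
    """Naive peak multiplier from (old, new) pairs.

    ASSUMPTION (ours, for illustration): peak scales linearly with each
    quantity and the clock rate stays the same.
    """
    factor = 1.0
    for old, new in (core_groups, vector_bits, nodes):
        factor *= new / old
    return factor


# Fugaku: Rmax 442 PF, Rpeak 537 PF (see NOTES).
fugaku_eff = linpack_efficiency(442, 537)

# Projected Sunway: ~700 PF Rmax against 1 exaflops (1,000 PF) double-precision peak.
sunway_eff = linpack_efficiency(700, 1000)

# Eight core groups as stated in the keynote; a chat observer suggested six.
scale_eight = peak_scaling_factor((4, 8), (256, 512), (40_000, 80_000))
scale_six = peak_scaling_factor((4, 6), (256, 512), (40_000, 80_000))

print(f"Fugaku Linpack efficiency:  {fugaku_eff:.1%}")
print(f"Sunway projected efficiency: {sunway_eff:.1%}")
print(f"Naive peak multiplier (8 core groups): {scale_eight:.0f}x")
print(f"Naive peak multiplier (6 core groups): {scale_six:.0f}x")
```

Under these assumptions the multiplier comes out to 8 with eight core groups and 6 with six. Applied to a TaihuLight baseline of "something over 100 petaflops," either value lands in the general neighborhood of the one-exaflops double-precision peak Sterling described, though this is a rough consistency check and not a performance prediction.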

QUICK HITS – MPI Still Strong; In Praise of STEM

“Within the next small number of months, exactly when I don’t know, MPI 4.0 will be released with a number of improvements that have been carefully considered, examined and debated, including, but not limited to, persistent collective operations for significant improvements in efficiency, and improvements in error handling. A number of others, as you can see, are either going to be in or are going to be considered for later extensions in 4.1. And if you thought that was it, no, there will be an MPI 5.0. The committee is open for new ideas. I don’t know how long this is going to go. But MPI 4.0 [is] coming to an internet place near you,” said Sterling.

Sterling gave nods to various efforts to support HPC students and STEM efforts generally. He noted the establishment of the new Moscow State University branch at the Russian National Physics and Mathematics Center, near Nizhny Novgorod. “I’ve been there, a lovely small city. The MSU Sarov branch is intended to frankly attract the best scientists and students and faculty. No, I haven’t gotten my invitation letter yet and it (MSU) will be directed by our good friend and respected colleague, Vladimir Voevodin shown here,” he said.

Sterling had praise for the Texas Advanced Computing Center, which helped South Africa train its student cluster team by bringing the students over to Austin and “really giving them sort of a turbocharged experience in this area. Dan Stanzione (TACC director) shown here (slide below) also managed to make possible the repurposing of one of their earlier machines and giving it a second life at CHPC in South Africa.”

He concluded with kudos for the STEM-Trek organization led by Elizabeth Leake:

“The final person here is one who frankly, we really need to acknowledge and that is Elizabeth Leake. Now many of you know Elizabeth, she is part of our community and always with a friendly smile. But she is much more than that. She is the founder of STEM-Trek, a nonprofit organization that is intended to – and let me read this – support scholarly travel, mentoring and advanced skills training for STEM scholars and students from underrepresented demographics in the use of 21st century cyberinfrastructure. I can’t read to you the long list of accomplishments, but through STEM-Trek, students are encouraged and engaged in high performance computing. She has singularly managed to acquire travel grants for students who otherwise, frankly, would never get to see conferences like ISC. You see a picture of her with students I met a couple of years ago. Elizabeth deserves very high praise for all of her contributions.”

NOTES

*  Fugaku’s Top500 Rmax is 442 petaflops and Rpeak is 537 petaflops.

** One observer noted in the ISC chat window during the keynote that Sunway would have six not eight core groups.
