For a change, said Thomas Sterling, long-time ISC keynoter and HPC pioneer, picking a theme for his 2022 talk wasn’t a challenge.
“What is the word you need that everyone will remember and agree to about a particular year? This was easy. It’s exaflops. I mean real exaflops, you know, the kind you can get your teeth into exaflops, not words like exascale or low-precision exaflops, or ‘we know what an exaflops is, right.’ No, Rmax [double precision performance],” said Sterling.
Hard to argue with that. Frontier’s groundbreaking 1.102 exaflops Linpack run to capture the latest Top500 crown has officially started the exascale era. Sterling was delighted to echo the news. This was his 19th ISC keynote, a feat in its own right. Sterling is professor of intelligent systems engineering and director of the AI Computing Systems Laboratory at Indiana University. He is perhaps best known as the “father of Beowulf” cluster architecture.
There was, as usual, a great deal more in Sterling’s talk (HPC Achievement and Impact 2022) – surging HPC in Europe, HPC’s identity crisis, worries-and-triumphs from the pandemic, and a brief spotlight on the Event Horizon Telescope’s computationally intense effort to image the black hole at the center of the Milky Way – but exascale computing took center stage. Presented here are a few of Sterling’s slides and comments (lightly edited). ISC has archived the video and will provide access to ISC attendees through the end of the year.
Frontier and the Exascale Tour de Force
Frontier’s performance, said Sterling, was a tour de force not only for the machine itself but also for the wider array of efforts by DOE and the broad HPC community to reach exascale computing.
“Frontier is not just a machine. Frontier is the result of a process. That process has been taking place at Oak Ridge National Lab under the broader [guidance] of the Department of Energy, but also [with] other major agencies and goals of the U.S. And it is, in fact, [the result of] Titan, and Summit, and the dedication and commitment by many to making heterogeneous computing work for application programmers. That has to do with applications and algorithms. It also has to do with system software and tools. And yes, it has to do with architecture, and technology too. This is the culmination of more than a decade of focused attention to detail and maturation of an approach to HPC. They have produced this exciting result, which will do nothing but move the field forward,” said Sterling.
Sterling cited the seminal role of the High Performance Computing & Communications (HPCC) program, established through the High-Performance Computing Act of 1991: “The HPCC program thirty years ago, which was [focused on] making really practical and effective HPC. So this was not just machines. This was also applications and systems software and tools.” He also praised the work of the Exascale Computing Project (ECP), which focused on software and worked hand-in-glove with the hardware efforts to ensure there was a viable software ecosystem for new exascale systems, of which Frontier is the first.
What is HPC (and What is AI)?
While much of Sterling’s talk was celebratory, his darker comments on HPC’s so-called identity crisis were at least as interesting. HPC’s identity confusion has been a growing topic. The decline of Moore’s law, the rise of AI technologies and their infusion into HPC, and the emergence of heterogeneous systems have left many wondering what it means to be doing high performance computing. Sterling saved these comments for the end and didn’t offer an answer.
“What is HPC? I leave that as a question. I don’t attempt to answer it. But I will convey the sense that [if] HPC is a leading-edge technology with the focus on computing and we asked the question: what is performance? That may seem like a stupid question, right? High performance, what’s to talk about?
“Is it throughput? Or is it time-to-solution?”
Scanning the crowd, he said, “Those of you who look like we could have met 20 years ago here. Some of you would agree that it’s time-to-solution. Now I value the concept and the growth that has occurred with capacity computing. But this is a very serious problem because if it can’t include time-to-solution, then we’ve already put up a barrier that we will never pass,” said Sterling.
“The other question,” he said, “is what is AI? I was very afraid to do this slide and the fact is I didn’t want to tick anybody off (slide below). Here’s the thing. First of all, artificial intelligence is a terrible term. Yes, it might have implied intelligence from an artifact. The term was coined in 1958 or ’59. But it could also mean artificial as in not real and, honestly, the way we’re using it is more the latter. Let me simply say that not all of AI is machine learning, or deep learning, and not all of machine learning or deep learning is machine intelligence. I ask you to think about that.
“I think a right term would be machine intelligence and I’ll tell you what I think it needs to do — and this isn’t the Turing test. I think Turing was joking when he did it. It’s testing a human being against a computer and you wonder why would a computer: a) go along with that? and b) behave as stupid as humans are? So, meaning no disrespect to the guy (Alan Turing) on the [50£] bill in the UK, machine intelligence is machine understanding. Call that the Sterling test. If the machine understands – and that’s more than Winograd blocks rule; great piece of work in 1972 – then I think we’ve achieved machine intelligence,” said Sterling.
Not sure how much that clarifies things but it gives pause for thought.
Resurgent European HPC
Throughout ISC22, the robust resurgence of HPC in Europe was a central theme. Sterling noted progress under the European High Performance Computing Joint Undertaking (EuroHPC JU) program. He also noted the longer-term growing accessibility of HPC resources in Europe and showed a slide from an earlier Top500 presentation.
“It isn’t the splashy slide, but it’s the important one. Internationally, the field of HPC has converged in the available performance, whether it’s in peak or it’s spread across many deployed systems,” said Sterling.
Sterling briefly talked about Europe’s growing efforts in quantum computing, particularly those focused on integrating quantum computing with classical resources. He also said he thought quantum computers would not become general-purpose systems but would be used as accelerators for select problem types. Talking about the European Processor Initiative (EPI) to develop a microprocessor that emphasizes not only performance and utility but also energy efficiency, he said, “I’ll simply say this is a major undertaking.”
Fugaku Today (and Yesterday); Zettascale Tomorrow?
While Frontier is the new king of HPC, Sterling heaped praise on Fugaku, now second on the Top500. He lauded Fugaku’s race to deployment, impressive performance, and role as part of the global mobilization of HPC resources to fight the Covid pandemic.
“The science is amazing that we (the world) were able to build vaccinations that have a very strong positive effect. This brings me back to Fugaku because Fugaku was more than [just] a delivered machine. When Covid-19 hit the world, the people at Riken (and MEXT) decided they had to step up and accelerate the deployment of Fugaku by many months, in my opinion, close to a year. Even more amazing. They killed the red tape. Now, this probably saved time when administration was removed and collaborations were built between scientists who would design the new drugs and understand the mechanisms [of] the disease and at the same time [with] the computational scientists who knew how to use the machines, but didn’t know that microbiology. Fugaku became the centerpiece within Japan of driving this momentum forward. It was an extraordinary accomplishment.
“I also want to mention that the Japanese are not resting on their laurels, they’re already talking about their next machine to be delivered in 2029. Satoshi Matsuoka [speaking at ISC 2022] was not prepared to commit to a zettaflops. I’m not sure he was actually joking, but they are already working towards higher density, what they want to have the equal and significant impact,” said Sterling.
Reaching for the Sky with Event Horizon Telescope
A signature element of Sterling’s keynotes is a tribute to a scientific achievement made possible by high performance computing. This year he talked about work by the Event Horizon Telescope [EHT] to image and analyze the black hole – Sagittarius A* (estimated at four million solar masses) – at the heart of the Milky Way. It took 10 exaflops of compute over two days to resolve the image.
Showing a photo of Sagittarius A*, he contrasted it with M87 – a photo of which he’d presented in a past keynote. M87 is one of the largest black holes so far identified. It is 1000 times larger than Sagittarius A* and ~55 million light years away from Earth. “[The EHT is] a virtual computer but also a virtual antenna. This radio telescope comprises a half a dozen or so radio telescopes around the world. One of its virtues is one is able to point this virtual antenna at one object and watch it continuously as our Earth turns,” said Sterling.
“Believe it or not, it’s easier to look at a supermassive, distant black hole than it is to look at the center of the Milky Way, which is a mere 26,000 light years away. Why is this? The answer is because we get to look down on the top of those other spiral galaxies, while we’re looking directly into the dust clouds of our own. This is much harder. It’s also much smaller. The lower picture [in the slide] kind of gives you the impression of how tiny our own black hole is at 4 million solar masses. [Capturing the image and analyzing it] was only made possible by the use of high performance computing. And by the way, they couldn’t even use the internet to move the data.”
As always, Sterling packed a lot into his talk. He noted that AMD chips power many of the new systems. He mentioned the trend of HPC systems having longer lifespans. AI’s appetite for compute cycles would dominate HPC in the future, he suggested. Like many, he cheered Cray’s strength after being purchased by HPE. He lauded the Advanced Graphical Intelligence Logical Computing Environment program (AGILE) – hated the name, loved the mission: “The AGILE program aims to develop revolutionizing computer architectures and associated integrated circuit designs for a new class of high-performance, high efficiency, scalable computers that meet the needs of large-scale data analytic problems.”
We’ll see what next year brings.
Thomas Sterling is a Full Professor of Intelligent Systems Engineering at Indiana University (IU) serving as Director of the AI Computing Systems Laboratory at IU’s Luddy School of Informatics, Computing, and Engineering. Since receiving his Ph.D. from MIT as a Hertz Fellow in 1984, Dr. Sterling has engaged in applied research in parallel computing system structures, semantics, and operation in industry, government labs, and academia. Professional affiliations have included Harris Corp., IDA Supercomputing Research Center, NASA (GSFC, JPL), University of Maryland, Caltech, and LSU. Dr. Sterling is best known as the “father of Beowulf” for his pioneering research in commodity/Linux cluster computing, for which he shared the Gordon Bell Prize in 1997. His current research is associated with innovative extreme scale computing through memory-centric non-von Neumann architecture concepts to accelerate dynamic graph processing for AI including ML. In 2018, he co-founded the new tech company, Simultac LLC, and serves as its President and Chief Scientist. Dr. Sterling was the recipient of the 2013 Vanguard Award and is a Fellow of the AAAS. He has been selected this year to be inducted into the Space Technologies Hall of Fame. He is the co-author of seven books and holds six patents. Most recently, he co-authored the introductory textbook, “High Performance Computing”, published by Morgan Kaufmann in 2018, which is going into its second edition.