Revisiting the 2008 Exascale Computing Study at SC18

By Scott Gibson

November 29, 2018

Jeffrey Vetter, Distinguished R&D Staff Member at Oak Ridge National Laboratory, led the SC18 Birds of a Feather session “Revisiting the 2008 ExaScale Computing Study and Venturing Predictions for 2028.”

A report published a decade ago conveyed the results of a study aimed at determining if it were possible to achieve 1000X the computational power of the then-emerging petascale systems at a system power of no more than 20 MW. On November 14 at the SC18 supercomputing conference in Dallas, some of the original contributors to the report participated in a Birds of a Feather session in which they reflected on the document, sharing what they deemed to be its hits and misses and making predictions for 2028.

Session leader, Jeffrey Vetter of Oak Ridge National Laboratory, said the 2008 report, titled “Exascale Computing Study: Technology Challenges in Achieving Exascale Systems,” has been cited more than 1,000 times and that many people look to it to understand what research agendas they should undertake and to consider what are the most salient challenges to be faced in high-performance computing.

The study was sponsored by the Defense Advanced Research Projects Agency (DARPA) Information Processing Techniques Office (IPTO) with Bill Harrod as program manager. The report represents the ideas of people from universities, industry, and research labs collected during periodic meetings conducted during the course of more than a year.

Harrod, who is now program manager for the Intelligence Advanced Research Projects Activity (IARPA), told the BoF audience that consideration of petascale system specifications as they existed at the time informed the study group members’ assumptions about exascale. Petascale systems operated at about 13 MW with several hundred cabinets. Thus, the anticipated parameters for exascale were 1018 operations/second at 20 MW and with fewer than 500 cabinets. The pivotal big-picture questions, Harrod said, were whether an exascale system was needed and could it be used for scientific discovery and other practical purposes.

Two other studies, on software and resiliency, respectively, followed the study upon which the 2008 report was based. The resounding, overarching comment concerning the findings of the three studies, Harrod said, was that co-design would be essential. He added that although the co-design concept was not revolutionary, it was determined to be critical for ensuring hardware design would correspond properly with the intended uses for the system, and it became an integral aspect of the US Department of Energy’s Exascale Computing Initiative (ECI) and Exascale Computing Project (ECP).

Peter Kogge of the University of Notre Dame led the Exascale Computing study and served as editor of the 2008 report. In his presentation for the BoF, he outlined four key challenges that surfaced from the study: energy and power, memory, concurrency, and resiliency. He also summarized the 2008 computing environment and what it was anticipated to look like by 2015, noting that the study team did not focus on application needs and the Roofline model. For matrix multiply like the High-Performance Linpack (HPL) benchmark, he said, having a large enough cache would supersede concerns about memory speed; and to reach a peak of 1 exaflops, the goal was to hit 20 pJ/flop.

The team assembled what Kogge referred to as an aggressive strawman with an architecture that was largely influenced by study contributor Bill Dally (then with Stanford University, now with Nvidia), who participated in the BoF. The architecture was characterized by multicore, no coherency, and shared global address space. Reaching the 1 exaflops peak meant 68 MW power usage from 583 racks. Relative to programming, about 1 billion threads needed to be maintained. A wire interconnect was assumed.

Kogge provided details from the report on the aggressive strawman system, which he said he considered to be “remarkably prescient” with respect to what ultimately materialized in the evolution toward exascale.

A 2015 paper for the International Supercomputing Conference (ISC) by Kogge titled “Updating Energy Model for Future Exascale Systems” examined an update of the models that the Exascale Computing study team had built to project performance for only the heavyweight (Xeon chips) sockets. The paper received a Gauss Award.

The study group’s final analysis showed that an exaflops could be reached by 2020, but with a peak of 180 MW to 430 MW.

The Study Contributors’ Assessments of Hits and Misses

Bill Harrod

At the inception of the DARPA studies, the target year for reaching exascale was 2015, but based on the results of the software study it was adjusted to 2018. Today, projections are focused on the 2021–2023 time frame. Harrod said that although the projections have evolved, the studies paved the way for DARPA’s Ubiquitous High-Performance Computing (UHPC) Exascale Projects and laid the foundation for DOE’s ECI and ECP. They have, he added, greatly enhanced the environment for exascale development.

In terms of hits and misses, the importance of co-design has played out at DOE and many other places, including the FastForward and PathForward programs, Harrod said. As a key miss of the study, he highlighted the fact that it did not foresee the impact of artificial intelligence (AI).

Peter Kogge

The study group’s approach in focusing on the heavyweight systems was dead-on through 2015, and the aggressive strawman they developed greatly resembles today’s GPU, Kogge said. In addition, he said the study group was right to point out that some form of memory stacking would be necessary, and that interconnects, at least locally within racks, would still largely be copper. Among the misses, he highlighted the heterogeneous systems and the SIMT threading model, which constitutes what is done with GPUs today.

Keren Bergman (Columbia University)

Bergman said that as someone whose background is in optical networks, she considered the close examination of the energy consumption of the interconnects in this study to be enlightening. With respect to the study’s hits, she opined that the deep discussions captured the growing challenge of data movement. However, in her view, one of the study’s sizable misses was the cost associated with manufacturability. She said substantial innovations would be required to integrate photonics into chips and remedy one of the last real bottlenecks.

Dean Klein (Micron/now retired)

Klein, who was vice president of memory system development at Micron at the time of the study and today in retirement mentors and motivates engineering students, highlighted as a hit the study group’s awareness that the energy of memory subsystems would drive compromises in the memory in systems, and as a miss the idea of NAND flash playing a role in supercomputing.

Bill Dally

The prescience of the study’s aggressive silicon strawman made it a hit, Dally said. Conversely, he viewed as shortcomings the paucity of capable networks due to funding, failure to anticipate AI, and an overly conservative approach in addressing software.

Exascale Study Contributors’ Predictions for 2028

The belief that complementary metal-oxide-semiconductor (CMOS) technology for constructing integrated circuits would remain predominant was a recurring notion, as the BoF contributors offered diverse predictions for 2028 based on the perspectives of their areas of expertise.

The contributors also responded to comments and questions from the audience.

Scott Gibson is a science writer and communications specialist with Oak Ridge National Laboratory.

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industry updates delivered to you every week!

ARM, Fujitsu Targeting Open-source Software for Power Efficiency in 2-nm Chip

July 19, 2024

Fujitsu and ARM are relying on open-source software to bring power efficiency to an air-cooled supercomputing chip that will ship in 2027. Monaka chip, which will be made using the 2-nanometer process, is based on the Read more…

SCALEing the CUDA Castle

July 18, 2024

In a previous article, HPCwire has reported on a way in which AMD can get across the CUDA moat that protects the Nvidia CUDA castle (at least for PyTorch AI projects.). Other tools have joined the CUDA castle siege. AMD Read more…

Quantum Watchers – Terrific Interview with Caltech’s John Preskill by CERN

July 17, 2024

In case you missed it, there's a fascinating interview with John Preskill, the prominent Caltech physicist and pioneering quantum computing researcher that was recently posted by CERN’s department of experimental physi Read more…

Aurora AI-Driven Atmosphere Model is 5,000x Faster Than Traditional Systems

July 16, 2024

While the onset of human-driven climate change brings with it many horrors, the increase in the frequency and strength of storms poses an enormous threat to communities across the globe. As climate change is warming ocea Read more…

Researchers Say Memory Bandwidth and NVLink Speeds in Hopper Not So Simple

July 15, 2024

Researchers measured the real-world bandwidth of Nvidia's Grace Hopper superchip, with the chip-to-chip interconnect results falling well short of theoretical claims. A paper published on July 10 by researchers in the U. Read more…

Belt-Tightening in Store for Most Federal FY25 Science Budets

July 15, 2024

If it’s summer, it’s federal budgeting time, not to mention an election year as well. There’s an excellent summary of the curent state of FY25 efforts reported in AIP’s policy FYI: Science Policy News. Belt-tight Read more…

SCALEing the CUDA Castle

July 18, 2024

In a previous article, HPCwire has reported on a way in which AMD can get across the CUDA moat that protects the Nvidia CUDA castle (at least for PyTorch AI pro Read more…

Aurora AI-Driven Atmosphere Model is 5,000x Faster Than Traditional Systems

July 16, 2024

While the onset of human-driven climate change brings with it many horrors, the increase in the frequency and strength of storms poses an enormous threat to com Read more…

Shutterstock 1886124835

Researchers Say Memory Bandwidth and NVLink Speeds in Hopper Not So Simple

July 15, 2024

Researchers measured the real-world bandwidth of Nvidia's Grace Hopper superchip, with the chip-to-chip interconnect results falling well short of theoretical c Read more…

Shutterstock 2203611339

NSF Issues Next Solicitation and More Detail on National Quantum Virtual Laboratory

July 10, 2024

After percolating for roughly a year, NSF has issued the next solicitation for the National Quantum Virtual Lab program — this one focused on design and imple Read more…

NCSA’s SEAS Team Keeps APACE of AlphaFold2

July 9, 2024

High-performance computing (HPC) can often be challenging for researchers to use because it requires expertise in working with large datasets, scaling the softw Read more…

Anders Jensen on Europe’s Plan for AI-optimized Supercomputers, Welcoming the UK, and More

July 8, 2024

The recent ISC24 conference in Hamburg showcased LUMI and other leadership-class supercomputers co-funded by the EuroHPC Joint Undertaking (JU), including three Read more…

Generative AI to Account for 1.5% of World’s Power Consumption by 2029

July 8, 2024

Generative AI will take on a larger chunk of the world's power consumption to keep up with the hefty hardware requirements to run applications. "AI chips repres Read more…

US Senators Propose $32 Billion in Annual AI Spending, but Critics Remain Unconvinced

July 5, 2024

Senate leader, Chuck Schumer, and three colleagues want the US government to spend at least $32 billion annually by 2026 for non-defense related AI systems.  T Read more…

Atos Outlines Plans to Get Acquired, and a Path Forward

May 21, 2024

Atos – via its subsidiary Eviden – is the second major supercomputer maker outside of HPE, while others have largely dropped out. The lack of integrators and Atos' financial turmoil have the HPC market worried. If Atos goes under, HPE will be the only major option for building large-scale systems. Read more…

Everyone Except Nvidia Forms Ultra Accelerator Link (UALink) Consortium

May 30, 2024

Consider the GPU. An island of SIMD greatness that makes light work of matrix math. Originally designed to rapidly paint dots on a computer monitor, it was then Read more…

Comparing NVIDIA A100 and NVIDIA L40S: Which GPU is Ideal for AI and Graphics-Intensive Workloads?

October 30, 2023

With long lead times for the NVIDIA H100 and A100 GPUs, many organizations are looking at the new NVIDIA L40S GPU, which it’s a new GPU optimized for AI and g Read more…

Shutterstock_1687123447

Nvidia Economics: Make $5-$7 for Every $1 Spent on GPUs

June 30, 2024

Nvidia is saying that companies could make $5 to $7 for every $1 invested in GPUs over a four-year period. Customers are investing billions in new Nvidia hardwa Read more…

Nvidia Shipped 3.76 Million Data-center GPUs in 2023, According to Study

June 10, 2024

Nvidia had an explosive 2023 in data-center GPU shipments, which totaled roughly 3.76 million units, according to a study conducted by semiconductor analyst fir Read more…

AMD Clears Up Messy GPU Roadmap, Upgrades Chips Annually

June 3, 2024

In the world of AI, there's a desperate search for an alternative to Nvidia's GPUs, and AMD is stepping up to the plate. AMD detailed its updated GPU roadmap, w Read more…

Some Reasons Why Aurora Didn’t Take First Place in the Top500 List

May 15, 2024

The makers of the Aurora supercomputer, which is housed at the Argonne National Laboratory, gave some reasons why the system didn't make the top spot on the Top Read more…

Intel’s Next-gen Falcon Shores Coming Out in Late 2025 

April 30, 2024

It's a long wait for customers hanging on for Intel's next-generation GPU, Falcon Shores, which will be released in late 2025.  "Then we have a rich, a very Read more…

Leading Solution Providers

Contributors

Google Announces Sixth-generation AI Chip, a TPU Called Trillium

May 17, 2024

On Tuesday May 14th, Google announced its sixth-generation TPU (tensor processing unit) called Trillium.  The chip, essentially a TPU v6, is the company's l Read more…

Nvidia H100: Are 550,000 GPUs Enough for This Year?

August 17, 2023

The GPU Squeeze continues to place a premium on Nvidia H100 GPUs. In a recent Financial Times article, Nvidia reports that it expects to ship 550,000 of its lat Read more…

IonQ Plots Path to Commercial (Quantum) Advantage

July 2, 2024

IonQ, the trapped ion quantum computing specialist, delivered a progress report last week firming up 2024/25 product goals and reviewing its technology roadmap. Read more…

Choosing the Right GPU for LLM Inference and Training

December 11, 2023

Accelerating the training and inference processes of deep learning models is crucial for unleashing their true potential and NVIDIA GPUs have emerged as a game- Read more…

The NASA Black Hole Plunge

May 7, 2024

We have all thought about it. No one has done it, but now, thanks to HPC, we see what it looks like. Hold on to your feet because NASA has released videos of wh Read more…

Q&A with Nvidia’s Chief of DGX Systems on the DGX-GB200 Rack-scale System

March 27, 2024

Pictures of Nvidia's new flagship mega-server, the DGX GB200, on the GTC show floor got favorable reactions on social media for the sheer amount of computing po Read more…

MLPerf Inference 4.0 Results Showcase GenAI; Nvidia Still Dominates

March 28, 2024

There were no startling surprises in the latest MLPerf Inference benchmark (4.0) results released yesterday. Two new workloads — Llama 2 and Stable Diffusion Read more…

NVLink: Faster Interconnects and Switches to Help Relieve Data Bottlenecks

March 25, 2024

Nvidia’s new Blackwell architecture may have stolen the show this week at the GPU Technology Conference in San Jose, California. But an emerging bottleneck at Read more…

  • arrow
  • Click Here for More Headlines
  • arrow
HPCwire