Berkeley Lab’s John Shalf Ponders the Future of HPC Architectures

By Kathy Kincade

June 27, 2019

Editor’s note: Ahead of John Shalf’s well-attended and well-received “high-bandwidth” keynote at ISC 2019, Shalf discussed the talk’s major themes in an interview with Berkeley Lab’s Kathy Kincade. 

What will scientific computing at scale look like in 2030? With the impending demise of Moore’s Law, there are still more questions than answers for users and manufacturers of HPC technologies as they try to figure out what their next best investments should be. As he prepared to head to ISC19 in Frankfurt, Germany, to give a keynote address on the topic, John Shalf – who leads the Computer Science Department in Lawrence Berkeley National Laboratory’s Computational Research Division – shared his thoughts on what the future holds for computing technologies and architectures in the era beyond exascale. ISC took place June 16-20; Shalf’s keynote was on Tuesday, June 18.

What was the focus of your keynote at ISC?

What the landscape of computing, in general, is going to look like after the end of Moore’s Law. We’ve come to depend on Moore’s Law and to really expect that every generation of chips will double the speed, performance, and efficiency of the previous generation. Exascale will be the last iteration of Moore’s Law before the bottom drops out – and the question then is, how do we continue? Is exascale the last of its kind, or are we going to embark on a first-of-its-kind machine for the future of computing?

How long have you been thinking/talking about what’s next for HPC after Moore’s Law?

Where we are now is really the second shoe dropping. I got involved in the Exascale Computing Initiative discussions back in 2008, but actually, my interest in this predates exascale. Back in 2005, David Patterson’s group at UC Berkeley was talking about it in the Parallel Computing Laboratory, and we spent two years there in discussion and debate about the end of Dennard’s scaling. Ultimately we published “The Landscape of Parallel Computing Research: A View from Berkeley,” which was the prediction that parallel computing would become ubiquitous on account of clock frequencies no longer scaling at exponential rates. This was followed closely by the DARPA 2008 Exascale report that set the stage for the Exascale Initiative for HPC. So the end of Dennard’s scaling was the first shoe to drop, but we always knew that the second shoe would drop fairly soon after the first. And the second shoe dropping means we can’t shrink transistors at all anymore, and that is the real end of Moore’s Law. Exascale is addressing the mass parallelism from the first shoe dropping, and I’ve been concerned about the second shoe dropping during the entire 10-year ramp-up to the Exascale Computing Initiative and subsequent Project, as were many others who were involved in writing the View from Berkeley report and the DARPA 2008 report.

How is the slowing of Moore’s Law already affecting HPC technologies and the industry itself?

We are already seeing procurement cycles stretching out so that the replacement of machines is happening at a slower pace than it has historically. Erich Strohmaier at Berkeley Lab has been tracking the replacement rate on the TOP500 very closely, and he has seen a noticeable slowdown in system replacement rates. I’ve also heard from our colleagues in industry that this is a troubling development that will affect their business model in the future. But we are also seeing these effects in the mega datacenter space, at companies such as Google, Facebook, and Amazon. Google has actually taken to designing its own chips, such as the Tensor Processing Unit (TPU), specialized for particular parts of its workflow. We will probably see even more specialization in the future, but how this applies to HPC is less clear at this point – and that’s what I would like to get people thinking about during my keynote.

Is the lithography industry experiencing a parallel paradigm shift?

Yes, the lithography industry is also being affected, and something’s going to need to change in the economics for that industry. What we have seen in the past decade is that we’ve gone from nearly a dozen leading-edge fabs down to two. GlobalFoundries recently dropped out as a leading-edge fab, and Intel has had a huge amount of trouble getting its 10nm fab line off the ground. So clearly there are huge tectonic shifts happening in the lithography market as we speak, and how that will resolve itself ultimately remains unclear.

Do we have to start imagining an entirely new computing technology development and production process?

I think the way in which we select and procure systems is going to have to be revisited. While using user application codes to run benchmarks to assess the performance and usability of emerging systems is a great way for us to select systems today that use general purpose processors, it doesn’t seem to be a very good approach for selecting systems that might have specialized features for science. In the future, we need to be more closely involved in the design of the machines with our suppliers to deliver machines that are truly effective for scientific workloads. This is as much about sustainable economic models as it is a change in the design process. The most conventional or even the most technologically elegant solution might not survive, but the one that makes a lot of money will. But our current economic model is breaking.

Looking ahead, I see three paths going forward. The first is specialization and better packaging – specialization meaning designing a machine for a targeted class of applications. This has already been demonstrated in the successful case of the Google TPU, for example. So that is the most immediate path forward.

Another potential path forward is a new transistor technology to replace CMOS, one that is much more energy efficient and scalable. However, we know from past experience that it takes about 10 years to get from a lab demo to a production product. There are promising candidates, but no clear replacements demonstrated in the lab, which means we are already 10 years too late for that approach to be adopted by the time Moore’s Law fails. We need to dramatically accelerate the discovery process in that area through a much more comprehensive materials-to-systems co-design process.

The third approach is to explore alternative models of computation such as quantum, neuromorphic, and other related approaches. These are all fantastic, but they are really expanding computing into areas where digital computing performs very poorly. They aren’t necessarily replacement technologies for digital general purpose computing; they are merely expanding into areas where digital isn’t very effective to start with. So I think these are worthy investments, but they aren’t the replacement technology. They will have a place, but how broadly applicable they will be is still being explored.

What about the development of new chip materials – what role might they play in the future of HPC architectures?

New materials are definitely part of the CMOS replacement. It’s not just new materials; fundamental breakthroughs in solid-state physics will be required to create a suitable CMOS replacement. The fundamental principle of operation for existing transistor technology cannot be substantially improved beyond what we see today. So to truly realize a CMOS replacement will require a new physical principle for switching, whether electrical, optical, or magnetic switching. A fundamentally new physical principle will need to be discovered and that, in turn, will require new materials and new material interfaces to realize effective and manufacturable solutions.

Are there any positives when you look at what is happening in this field right now?

Yes, definitely there are positives. We believe the co-design process is going to require not just software and hardware people to collaborate; it is going to require this collaboration to go all the way down into the materials and materials physics level. And for the national laboratories, this is a great opportunity for us to work closely with our colleagues in the materials science divisions of our respective laboratories. I work at a national laboratory because I’m excited by cross-disciplinary collaboration, and clearly, that is the only way we are going to make forward progress in this area. The recent ASCR Extreme Heterogeneity and DOE Microelectronics BRNs show strong interest by DOE in this deep co-design and collaborative research that is really needed in this space. So to that extent, it is kind of an exciting time.

When you think about the future of HPC and supercomputing architectures and technologies, what do you imagine they will look like 10 years from now?

I think we’re going to have smaller machines that are more effective for the workflows they target. For three decades we have become used to ever-growing, larger and larger machines, but that doesn’t seem to be the winning approach for creating effective science in the post-exascale and post-Moore era.


About the Author

Kathy Kincade is a science & technology writer and editor with the Berkeley Lab Computing Sciences Communications Group.


Article courtesy Berkeley Lab; Feature image credit: ISC High Performance.

