Supercomputing has advanced on a pretty consistent timeline, with thousandfold performance increases arriving roughly every decade. To illustrate, Sandia National Laboratories’ ASCI Red became the first teraflop supercomputer in 1996, and Los Alamos National Laboratory’s Roadrunner broke the petaflop barrier in 2008. If this trend continues, we should see an exascale machine by the end of 2020, and most experts agree with this timeframe.
Peter Kogge, however, remains skeptical. Kogge, an IEEE Fellow and professor of computer science and engineering at the University of Notre Dame, recently shared his thoughts on the subject in a Scientific Computing article. Kogge predicts an end to the “spectacular progress” supercomputing has enjoyed in the past.
He argues that the “power wall” will make the speed increases we’ve come to expect from Moore’s Law unsustainable: clock rates can no longer keep climbing without chips drawing and dissipating impractical amounts of power. Chips will still get faster, but not as quickly.
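For a rough sense of why the power wall bites, here is a minimal sketch of the standard dynamic-power relation for CMOS logic. The numbers are hypothetical, chosen only to show the scaling, and are not from Kogge’s analysis:

```python
# Dynamic power in CMOS logic scales roughly as P ~ C * V^2 * f.
# Illustrative, hypothetical numbers: pushing the clock from 2 GHz to
# 4 GHz at the same voltage roughly doubles the power drawn.
def dynamic_power(capacitance_farads, voltage_volts, frequency_hz):
    return capacitance_farads * voltage_volts ** 2 * frequency_hz

base = dynamic_power(1e-9, 1.0, 2e9)   # hypothetical chip at 2 GHz
fast = dynamic_power(1e-9, 1.0, 4e9)   # same hypothetical chip at 4 GHz
print(fast / base)                     # -> 2.0
```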
In 2007, Kogge and a group of experts came together at the behest of DARPA to produce a 278-page report [PDF] examining the feasibility of building an exaflops-class supercomputer by 2015. The agency asked the group to identify the key challenges, as well as the engineering technologies that would be necessary to build such a machine.
Kogge reports on the sobering conclusions: “The practical exaflops-class supercomputer DARPA was hoping for just wasn’t going to be attainable by 2015. In fact, it might not be possible anytime in the foreseeable future. Think of it this way: The party isn’t exactly over, but the police have arrived, and the music has been turned way down.”
The biggest obstacle to this next level of computing prowess? Power. Kogge uses the Blue Waters supercomputer as an example: Blue Waters is expected to draw 15 megawatts to deliver 10 petaflops. If you were to create an exascale machine by scaling Blue Waters 100-fold, it would take 1.5 gigawatts to run it. That’s more than 0.1 percent of the total US power grid, Kogge notes.
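To make that scaling concrete, here is a back-of-the-envelope version of the arithmetic. It is only a sketch: the roughly 1,000 GW figure for total US generating capacity is an approximation used for illustration, not a number from the report:

```python
# Back-of-the-envelope exascale power estimate, using the Blue Waters
# figures above: 15 MW for 10 petaflops, scaled linearly by 100x.
blue_waters_power_mw = 15        # megawatts
blue_waters_perf_pflops = 10     # petaflops
target_perf_pflops = 1000        # 1 exaflops = 1,000 petaflops

scale = target_perf_pflops / blue_waters_perf_pflops     # 100x
exascale_power_gw = blue_waters_power_mw * scale / 1000  # 1.5 GW

# Total US generating capacity of roughly 1,000 GW is an assumption
# made here for illustration, consistent with "more than 0.1 percent."
us_capacity_gw = 1000
print(f"{exascale_power_gw:.1f} GW, "
      f"{100 * exascale_power_gw / us_capacity_gw:.2f}% of US capacity")
```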
The panel behind the DARPA report concluded that an exaflops-level supercomputer could not be built merely by tweaking current computing technology. Only a complete redesign could achieve the necessary power savings.
The power obstacle is just the first of many “seemingly insurmountable obstacles.” There are also concerns about memory, long-term storage, and system resiliency, not to mention the software problem of getting code to run efficiently across so many cores. And to make matters worse, Kogge explains that many of the proposed solutions would require additional hardware, further increasing the power demand.
Kogge is not one to point out all the problems without offering solutions. He writes that “success in assembling such a machine will demand a coordinated cross-disciplinary effort carried out over a decade or more, during which time device engineers and computer designers will have to work together to find the right combination of processing circuitry, memory structures, and communications conduits — something that can beat what are normally voracious power requirements down to manageable levels.”
He himself is working on new memory technologies that cut the energy cost of fetching data by placing computation and data closer together, so that copies of the data don’t have to be shuttled around repeatedly.
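The principle is easiest to see in software terms. The sketch below is only an analogy for minimizing data movement, not Kogge’s hardware approach: the first version shuttles full-size copies of an array around, while the second performs the same reduction in a single pass over the data.

```python
import numpy as np

data = np.random.rand(10_000_000)

# "Move the data" style: each step materializes a full-size copy or temporary.
def sum_of_squares_with_copies(x):
    copied = x.copy()          # one extra pass of memory traffic
    squared = copied * copied  # another full-size temporary
    return squared.sum()

# "Compute where the data is" style: one pass, no full-size temporaries.
def sum_of_squares_one_pass(x):
    return np.dot(x, x)        # the reduction happens as the data is read

assert np.isclose(sum_of_squares_with_copies(data),
                  sum_of_squares_one_pass(data))
```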
The report’s findings shed light on exascale’s pain points, but in doing so they also illuminate the path to progress.
And even more importantly, Kogge believes, “government funding agencies now realize the difficulties involved and are working hard to jump-start this kind of research.”