Getting to Exascale
As the exascale barrier draws ever closer, experts around the world are turning their attention to enabling this major advance. The Harvard School of Engineering and Applied Sciences (SEAS) offers a deep dive into the subject. The institution’s summer 2014 issue of “Topics” takes a hard look at the way that supercomputing is progressing.
In the feature article “Built for Speed: Designing for exascale computers,” Brian Hayes considers all of the remarkable science that will be enabled if only the computer is fast enough.
Hayes explains that the field of hemodynamics is poised for a breakthrough, where a surgeon would be able to perform a detailed simulation of blood flow in a patient’s arteries in order to pinpoint the best repair strategy. Currently, however, simulating just one second of blood flow takes about five hours on even the fastest supercomputer. To have a truly transformative effect on medicine, scientists and practitioners need computers that are one thousand times faster than the current crop.
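To put those figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above (about five hours of computing per simulated second, and a hoped-for 1000x speedup) and assumes, purely for illustration, that run time shrinks in direct proportion to machine speed. The function and parameter names are hypothetical and do not come from Hayes's article.

```python
SECONDS_PER_HOUR = 3600


def wall_clock_seconds(simulated_seconds: float,
                       hours_per_simulated_second: float = 5.0,
                       speedup: float = 1.0) -> float:
    """Estimate the wall-clock seconds needed to simulate blood flow.

    hours_per_simulated_second is the cost quoted in the article (about 5).
    speedup is how many times faster the machine is than today's fastest.
    Assumption: run time scales inversely with machine speed.
    """
    baseline = simulated_seconds * hours_per_simulated_second * SECONDS_PER_HOUR
    return baseline / speedup


if __name__ == "__main__":
    today = wall_clock_seconds(1.0)
    exascale = wall_clock_seconds(1.0, speedup=1000.0)
    print(f"Today: {today / SECONDS_PER_HOUR:.1f} hours per simulated second")
    print(f"With a 1000x speedup: {exascale:.0f} seconds per simulated second")
```

Under that idealized assumption, the five-hour run drops to roughly 18 seconds, which is the difference between an overnight batch job and something a clinician could wait for.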
Getting to this next stage in computing is high on the list of priorities at SEAS. Hayes writes that science and engineering groups in the school are contributing to software and hardware projects to support this goal, while researchers in domains such as climatology, materials science, molecular biology, and astrophysics are gearing up to use such powerful resources.
From here, Hayes details the numerous obstacles that make exascale a more onerous challenge than previous 1000x milestones. For a while, chipmakers relied on increasing clock rates to drive performance gains, but that era is over.
“The speed limit for modern computers is now set by power consumption,” writes Hayes. “If all other factors are held constant, the electricity needed to run a processor chip goes up as the cube of the clock rate: doubling the speed brings an eightfold increase in power demand.”
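The cube relationship Hayes describes can be checked with a few lines of Python. This is a minimal sketch of the rule exactly as quoted (power proportional to the cube of the clock rate, all other factors held constant), not a model of any real chip. The function names are hypothetical.

```python
def relative_power(clock_factor: float) -> float:
    """Power demand relative to baseline when the clock rate is
    multiplied by clock_factor, under the quoted cube rule."""
    return clock_factor ** 3


def clock_factor_for_power(power_factor: float) -> float:
    """Inverse of the cube rule: the clock-rate multiplier that a given
    power multiplier would allow."""
    return power_factor ** (1.0 / 3.0)


if __name__ == "__main__":
    for factor in (1.0, 1.5, 2.0, 4.0):
        print(f"{factor:.1f}x clock -> {relative_power(factor):.1f}x power")
    print(f"2x power budget -> {clock_factor_for_power(2.0):.2f}x clock")
```

Read in the other direction, the same rule says that doubling the power budget buys only about a 1.26x faster clock, which helps explain why raising clock rates stopped being an attractive route to more performance.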
Shrinking transistors and putting multiple cores on each chip (multicore) have helped boost the total number of operations per second since about 2005. However, there is a fundamental limit to how small feature sizes can get before reliability becomes untenable.
From an architecture perspective, systems have gone from custom-built hardware in the 1980s to vanilla off-the-shelf components through the 1990s and 2000s. Now there is a swing back to specialized technologies. The first petaflopper, Roadrunner, used a hybrid design with CPUs working in tandem with specialized Cell BE coprocessors. Now most of the top supercomputers are based on a heterogeneous architecture, using some combination of CPUs and accelerators/coprocessors.
The challenges are not just on the hardware side. Hanspeter Pfister, An Wang Professor of Computer Science and director of the Institute for Applied Computational Science (IACS), who was interviewed by Hayes, believes getting to exascale will require fundamentally new programming models. Pfister points out that the LINPACK benchmark is the only program that can rate and rank machines at full speed. Other software may harness only 10 percent of the system’s potential. There are also issues with operating systems, file systems, and the middleware that connects databases and networks.
Pfister is also quite skeptical of the future of programming tools like MPI and CUDA. “We can’t be thinking about a billion cores in CUDA,” he says. “And when the next protocol emerges, I know in my heart it’s not going to be MPI. We’re beyond the human capacity for allocating and optimizing resources.”
Some believe that the only tenable solution to extreme-scale computing is getting the hardware and software folks in the same room. This approach, called “co-design,” will help bridge the gap between what users want and what manufacturers can supply. The US Department of Energy has established three co-design centers to facilitate this kind of approach.
The US DOE originally intended to field an exascale machine sometime around 2018, but that timeline slipped, primarily because of a lack of political will to fund the effort. Since then 2020 has been bandied about as a target, but that may also be overly optimistic. One argument for getting to exascale sooner rather than later is the need to conduct virtual nuclear testing in support of stockpile stewardship. This program alone, according to one expert interviewed for the piece, is enough to ensure that exascale machines are built. Other applications, such as climate modeling, could also come to be regarded as critical for national security.