At the tail end of 2013, Congress passed a law directing the Department of Energy to develop exascale computing capability within the next decade in order to meet the objectives of the nuclear stockpile stewardship program. The directive is part of the 2014 National Defense Authorization Act, which President Obama signed into law on December 26, 2013.
“The Administrator for Nuclear Security shall develop and carry out a plan to develop exascale computing and incorporate such computing into the stockpile stewardship program under section 4201 of the Atomic Energy Defense Act (50 U.S.C. 2521) during the 10-year period beginning on the date of the enactment of this Act,” notes the relevant text from the law (H.R. 3304).
In November 2013, a month before the law was passed, DOE officials laid out a detailed 10-year roadmap for exascale computing at the Advanced Scientific Computing Advisory Committee meeting in Denver. In addition to weapons research and simulation, exascale computing is considered critical for processing increasingly large data sets in fields including genomics, climate modeling and high-energy physics.
One of the group’s principal voices, William J. Harrod, Division Director for Advanced Scientific Computing Research in the Office of Science, explained that the DOE’s mission to push the frontiers of science and technology will require extreme-scale computing with machines that are 500 to 1,000 times more capable than today’s computers, albeit with a similar size and power footprint.
The DOE won’t be able to meet its goals using a business-as-usual evolutionary approach, Harrod noted as part of his Exascale Update, “rather it will require major novel advances in computing technology: exascale computing.”
The DOE Office of Science envisions an exascale computing system that is productive and based on marketable technology. The execution strategy proposed by the Office of Science outlines the research, development, applications, facilities, and integration necessary for deploying an exascale system in the early 2020s. To realize this goal, the Office is fostering partnerships with government, the computer industry, DOE labs, academia, and the international research community.
As part of the meeting’s proceedings, the ASCAC exascale subcommittee released a study laying out the top 10 technical challenges facing exascale, reproduced below:
– Energy-efficient circuit, power and cooling technologies.
– High-performance interconnect technologies.
– Advanced memory technologies to dramatically improve capacity and bandwidth.
– Scalable system software that is power and resilience aware.
– Data management software that can handle the volume, velocity and diversity of data.
– Programming environments to express massive parallelism, data locality, and resilience.
– Reformulating science problems and refactoring solution algorithms for exascale.
– Ensuring correctness in the face of faults, reproducibility, and algorithm verification.
– Mathematical optimization and uncertainty quantification for discovery, design, and decision.
– Software engineering and supporting structures to enable scientific productivity.
At least some of these challenges are being addressed by projects like Fast Forward and Design Forward, which are funded by the DOE’s Office of Science and the National Nuclear Security Administration. Under these programs, two-year contracts totaling $62.4 million were awarded to AMD, IBM, Intel and NVIDIA in July 2012, with another $25.4 million in contracts going to the same companies plus Cray in the fall of 2013. Extreme-scale software projects are also underway, including 2012 X-Stack, Co-Design Centers, Mod/Sim, 2013 OS/R, and miscellaneous software such as Exascale MPI.