With so much on the menu at SC, including its exceptional program of technical papers, tutorials, research posters, and Birds-of-a-Feather (BOF) sessions, it’s difficult to choose the best part. But it’s safe to say that the Gordon Bell Prize is not just a highlight of SC; it’s one of the highest honors in HPC. Every year since 1987, an uber-talented group of finalists has raised the bar on parallel computing by applying HPC to a range of important science, engineering, and large-scale data analytics problems. Winners must demonstrate an outstanding achievement in one of three areas: peak performance, scalability and time-to-solution, or a special achievement. They are also asked to justify their entries with regard to their real-world benefit as well as their contribution to the broader HPC community.
The competition is funded by its namesake Gordon Bell, a pioneer in computer architecture, parallel processing and high performance computing, and this year five teams are contending for the coveted prize. In addition to the first-place $10,000 cash award, one runner-up will be selected for Honorable Mention. The Association for Computing Machinery’s (ACM) awards committee will announce the results at the 26th annual Supercomputing Conference (SC) awards ceremony less than a week away in New Orleans.
As a prelude to this well-attended session, here is an overview of the five accomplished teams, who are doing their part to advance parallel computing through new or specialized architectures, advances in algorithms and applications, and other optimizations that exploit the potential of large-scale systems.
The five papers/teams are:
- “Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers,” an international research project co-led by Michael Bader (Technische Universität München, Germany), Christian Pelties (Ludwig-Maximilians-Universität, Germany) and Alexander Heinecke (Intel, United States).
- “Physics-based urban earthquake simulation enhanced by 10.7 BlnDOF x30 K time-step unstructured FE non-linear seismic wave simulation,” from a Japanese research team, led by University of Tokyo’s Tsuyoshi Ichimura.
- “Real-time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution,” with research led by Dharmendra S. Modha, IBM Fellow and IBM Chief Scientist, Brain-inspired Computing, and additional team members from IBM and Cornell University.
- “Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer,” with lead researcher David E. Shaw, of D.E. Shaw Research, and team.
- “24.77 Pflops on a Gravitational Tree-Code to Simulate the Milky Way Galaxy with 18600 GPUs,” with research led by Simon Portegies Zwart and Jeroen Bédorf of the Netherlands’ Leiden Observatory and a team from SURFsara Amsterdam, the National Astronomical Observatory of Japan, RIKEN AICS, and the University of Tsukuba (Japan).
Each of these teams will be presenting their paper talks next week on Tuesday and Wednesday, in advance of the award announcement on Thursday.
The authors of “Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers” report achieving unprecedented earth model complexity on an Intel Xeon Phi platform (China’s Tianhe-2 supercomputer). They carried out architecture-aware optimizations to the SeisSol code that deliver up to 50 percent of peak performance. While SeisSol delivers near-optimal weak scaling, reaching 8.6 DP-PFLOPS on 8,192 nodes of the Tianhe-2 supercomputer, the team’s performance model projects reaching 18-20 DP-PFLOPS on the full Tianhe-2 machine. They anticipate this having real-world benefits for modern civil engineering.
The next entry is notable for its humanitarian bent. “Physics-based urban earthquake simulation enhanced by 10.7 BlnDOF x30 K time-step unstructured FE non-linear seismic wave simulation” is on track to support earthquake response efforts. Intending to boost the reliability of urban earthquake response analyses, the team developed a hybrid seismic wave amplification simulation code, GAMERA. This unstructured 3-D finite-element-based MPI-OpenMP code was deployed on Japan’s K computer, where it achieved a size-up efficiency of 87.1 percent using the entire machine. They also applied GAMERA to a physics-based urban earthquake response analysis for Tokyo. The team acknowledges this is still a very compute-intensive problem, but they say such analyses can improve the quality of disaster estimations.
For “Real-time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution,” IBM and Cornell University researchers united to develop a parallel, event-driven kernel for neurosynaptic computation, called TrueNorth. The brain-inspired neurosynaptic processor emphasizes efficiency of computation, memory, and communication. Its backers are targeting TrueNorth for a wide range of cognitive applications. They’ve already used a co-designed silicon expression of the kernel to run computer vision applications and complex recurrent neural network simulations.
The large D.E. Shaw Research team behind “Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer” reports that the second-generation Anton 2 improves on its predecessor, Anton 1, in performance, programmability, and capacity. Anton 2 is up to ten times faster than Anton 1 with the same number of nodes, and operates 180 times faster than any general-purpose hardware platform, according to the developers. The focus of the upgrade was enabling fine-grained event-driven operation, said to improve performance by increasing the overlap of computation with communication.
Last, but not least, the final paper, “24.77 Pflops on a Gravitational Tree-Code to Simulate the Milky Way Galaxy with 18600 GPUs,” shows the long-term evolution of the Milky Way Galaxy using 1,000 times more particles. Simulations were performed on two leadership-class machines, the Swiss Piz Daint supercomputer and the US ORNL Titan, using the N-body gravitational tree-code Bonsai. On Piz Daint, Bonsai achieved parallel efficiency above 95 percent in a 51 billion particle simulation, but the highest performance was achieved on Titan’s GPUs with a 242 billion particle Milky Way model. The Titan run, which harnessed 18,600 GPUs, reached a sustained GPU performance of 33.49 petaflops and application performance of 24.77 petaflops.
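To put the Titan figures in perspective, the quoted totals can be broken down per GPU. The short Python sketch below is only back-of-the-envelope arithmetic on the numbers reported above; the function names are illustrative and do not come from the Bonsai code or the paper, and it assumes the load was spread evenly across all 18,600 GPUs.

```python
# Figures quoted in the article for the Bonsai run on Titan.
GPUS = 18_600
SUSTAINED_GPU_PFLOPS = 33.49   # sustained GPU performance
APPLICATION_PFLOPS = 24.77     # application performance
PARTICLES = 242e9              # particles in the Milky Way model


def per_gpu_tflops(total_pflops: float, gpus: int) -> float:
    """Average teraflops per GPU, assuming an even split (an assumption)."""
    return total_pflops * 1000.0 / gpus


def application_fraction(app_pflops: float, gpu_pflops: float) -> float:
    """Share of the sustained GPU rate that shows up as application rate."""
    return app_pflops / gpu_pflops


def particles_per_gpu(particles: float, gpus: int) -> float:
    """Average number of particles handled by each GPU."""
    return particles / gpus


if __name__ == "__main__":
    print(f"application TFLOPS per GPU: {per_gpu_tflops(APPLICATION_PFLOPS, GPUS):.2f}")
    print(f"GPU-only TFLOPS per GPU:    {per_gpu_tflops(SUSTAINED_GPU_PFLOPS, GPUS):.2f}")
    print(f"application / GPU rate:     {application_fraction(APPLICATION_PFLOPS, SUSTAINED_GPU_PFLOPS):.1%}")
    print(f"particles per GPU:          {particles_per_gpu(PARTICLES, GPUS):,.0f}")
```

By this rough reckoning, each GPU delivered about 1.3 teraflops of application performance on roughly 13 million particles, and the application rate is about three quarters of the sustained GPU rate.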
Given the breadth and depth of these projects, it is clear that the next winner of the Gordon Bell Prize will join an elite list of past prize winners. Last year’s award went to the team responsible for “11 PFLOP/s Simulations of Cloud Cavitation Collapse,” by Diego Rossinelli, Babak Hejazialhosseini, Panagiotis Hadjidoukas and Petros Koumoutsakos, all of ETH Zurich; Costas Bekas and Alessandro Curioni of IBM Zurich Research Laboratory; Adam Bertsch and Scott Futral of Lawrence Livermore National Laboratory; and Steffen Schmidt and Nikolaus Adams of Technical University Munich.
In what IBM termed the “largest simulation ever in fluid dynamics,” the high throughput simulations of cloud cavitation collapse on 1.6 million cores of Sequoia reached 55 percent of the machine’s peak performance, corresponding to 11 petaflops. (This later rose to 14.4 petaflops sustained performance.) According to the authors, “the software successfully addresses the challenges that hinder the effective solution of complex flows on contemporary supercomputers, such as limited memory bandwidth, I/O bandwidth and storage capacity.” By boosting the quantitative prediction of cavitation, the breakthrough fluid dynamics simulations can help improve the design of high pressure fuel injectors and propellers and boost the performance of water purification systems and kidney lithotripsy. Cavitation is also emerging as a therapeutic modality for cancer treatment. The paper is published in the Proceedings of SC’13.
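The percent-of-peak figure for last year’s winner can be unpacked the same way. The Python sketch below is illustrative arithmetic on the rounded numbers quoted above, not a calculation taken from the paper: it infers the peak implied by “55 percent, corresponding to 11 petaflops,” then expresses the later 14.4 petaflops result against that implied peak. The helper names are hypothetical.

```python
# Rounded figures quoted in the article for the Sequoia run.
SUSTAINED_PFLOPS = 11.0       # initial result
FRACTION_OF_PEAK = 0.55       # "55 percent of its peak performance"
LATER_PFLOPS = 14.4           # later sustained result
CORES = 1.6e6                 # cores used


def implied_peak_pflops(sustained: float, fraction: float) -> float:
    """Peak implied by a sustained rate and its stated fraction of peak."""
    return sustained / fraction


def fraction_of_peak(sustained: float, peak: float) -> float:
    """Sustained rate as a fraction of peak."""
    return sustained / peak


def gflops_per_core(sustained_pflops: float, cores: float) -> float:
    """Average gigaflops per core (1 petaflops = 1e6 gigaflops)."""
    return sustained_pflops * 1e6 / cores


if __name__ == "__main__":
    peak = implied_peak_pflops(SUSTAINED_PFLOPS, FRACTION_OF_PEAK)
    print(f"implied peak:            {peak:.1f} PFLOPS")
    print(f"later run, share of peak: {fraction_of_peak(LATER_PFLOPS, peak):.0%}")
    print(f"initial run, per core:    {gflops_per_core(SUSTAINED_PFLOPS, CORES):.2f} GFLOPS")
```

On these rounded inputs the implied peak is about 20 petaflops, which would put the later 14.4 petaflops run at roughly 72 percent of peak.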