Dec. 5 — The Blue Waters supercomputer was the driving petaflop power behind “Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided,” awarded best technical paper at SC13 on Nov. 21 in Denver, Co.
The paper, written by Robert Gerstenberger, University of Illinois at Urbana-Champaign, and Torsten Hoefler (formerly of NCSA) and Maciej Besta, both of ETH Zürich, is a product of a long-term effort by the Message Passing Interface (MPI) community to standardize high-performance communication on today’s networks, most of which offer remote direct memory access (RDMA) features. The team demonstrated that the MPI-3 programming interface is both scalable and practical for exploiting those RDMA networks directly and efficiently, to the benefit of the HPC user.
During their nearly full-scale run on Blue Waters, they saw significant performance improvements in both communication and synchronization. For example, a single communication typically takes anywhere from hundreds to thousands of instructions; the team reduced that number to 78. This also lowered communication overhead to 20 nanoseconds, down from the typical 100 to 500 nanoseconds. The successes didn’t end there: on the synchronization side, the group designed protocols that support scaling to millions of cores, giving users access to extremely large systems.
“These results change your way of thinking. It changes your way of programming. Which is very important in the sense that we can now actually distinguish between communication and synchronization; subsequently, the user has those two tools separately at their disposal,” says co-author Torsten Hoefler, assistant professor of Computer Science at ETH Zürich and formerly leader of the modeling and simulation efforts with the Blue Waters project at NCSA.
Armed with this new information, the user can change the critical parts of an application while leaving everything else intact. Because the changes stay within the same programming model, existing code can be ported to the new mode with little effort.
“So that is one rather powerful motive moving all applications forward. There are users moving towards low-level DMAPP implementations on Blue Waters, and those users could just go to a standardized interface and then have the portability benefits,” continues Hoefler. “We are now enabling a portable mode of high-performance scalable programming. That is the big benefit.”
The full paper can be found in the ACM Digital Library at http://dl.acm.org/citation.cfm?doid=2503210.2503286.