Has it really been 25 years since the Message Passing Interface standard was born? It has indeed, and at this year’s EuroMPI meeting in September in Chicago, a “birthday” symposium will be held to celebrate the occasion. Speakers from the remote past of MPI, the middle years, and the current time will touch on the ideas that have given MPI its long life and will highlight the impact the standard has had on multiple aspects of parallel computing, from applications to libraries to its multiple implementations.
The concept of a standard for message passing emerged over time. While assorted systems, both commercial and free, competed for “mind share” and commercial success, a small meeting of researchers took place in 1991 at a conference in Oberlech, Austria. There Jack Dongarra, Rolf Hempel, Tony Hey, and David Walker drafted a white paper outlining a proposal for what a standard might look like, borrowing heavily from Marc Snir’s work at IBM. Jack Dongarra, Professor of Computer Science at the University of Tennessee, recalls, “Each of the existing systems had merit, but none had everything needed to move application development forward. We decided to instigate a community effort to address the problem.” It seems reasonable to affix the label “Birth of MPI” to the resulting workshop entitled “Standards for Message Passing in a Distributed Memory Environment” organized by Jack Dongarra and David Walker with funding from the Ken Kennedy Center for Research in Parallel Computation at Rice University in April 1992. That was the first time a wide variety of interested stakeholders gathered in an open meeting dedicated to the topic of a standard for message passing, forecasting the openness of the process that would follow. The result of that workshop, which featured presentations on multiple vendor-specific and portable systems, was a realization that a great diversity of good ideas existed among then-current message-passing libraries but that the lack of a standard was impeding the progress of parallel computing.
At the Supercomputing ’92 conference in November, a committee was formed to define a message-passing standard. At the time of creation, no one knew what the outcome might look like, but the effort was begun with the following objectives: (1) to define a portable standard for message-passing, which would not be an official, ANSI-like standard but would attract both implementers and users; (2) to operate in a completely open way, allowing anyone to join the discussions, either by attending meetings in person or by monitoring open email discussions; and (3) to be finished in one year.
The MPI effort was a lively one, as a result of the tensions among these three objectives. The committee decided to follow the format used by the High-Performance Fortran Forum, whose procedures had been well received by its community. (It even decided to meet in the same hotel in North Dallas.) An early decision of the MPI Forum was to not adopt any existing system or proposal as a starting but to start from scratch, with the explicit goals of portability, expressiveness, and performance capability. “Ease of use” was not a primary goal; the idea was that libraries, compilers, and other software layers would provide this aspect of parallel programming, and that applications would rely on their implementations over MPI to provide convenience of programming.
More formal meetings began in January 1993 under the name “MPI Forum,” an extension of the SC ’92 committee, and continued until the following February. Over that time, more than 60 people from 40 organizations participated, although attendance at most meetings was about 30. The procedures for submitting proposals and voting were adopted from those of HPF Forum, which had worked well. One reason the MPI standardization effort succeeded was that the MPI Forum itself was so broadly based. At the original (MPI-1) Forum the parallel computer vendors were represented by Convex, Cray, IBM, Intel, Meiko, nCUBE, NEC, and Thinking Machines. Members of the groups associated with portable software libraries were also there: PVM, p4, Zipcode, Chameleon, PARMACS, TCGMSG, and Express were all represented, as well as some application groups. One subgroup committed to providing a test implementation of each iteration of the standard as it evolved from meeting to meeting; this proved valuable in uncovering the implementation consequences of API decisions, as well as ensuring that when the standard definition was completed, a prototype implementation was immediately available. Marc Snir, Professor of Computer Science at the University of Illinois and an original Forum member representing IBM, has said, “The MPI Forum was an outstanding example of many companies, research labs, and individuals working together to achieve a common good.”
The first version of the MPI standard was published in May 1994. It included standard versions of many well-known message-passing operations such as blocking and nonblocking sends and receives, together with collective operations such as broadcast, reduce, and scan. It broke new ground with its concept of communicators (essential for the modularity of MPI-based libraries), datatypes (to deal efficiently with structured and noncontiguous messages), and process topologies (ignored by many in those days but becoming more significant on today’s machines). Its inclusion of both Fortran and C bindings (with identical semantics) signaled its desire to be immediately useful to both libraries and end-user scientific applications.
MPI also took an innovative approach to the problem of tools for debugging and performance analysis. Rather than designing such a tool into the standard specification itself, MPI provided a mechanism, its “profiling interface,” by which anyone could write a library that intercepted a subset of MPI calls in order to count, measure, or display them in some way, before (and after) passing them to the underlying MPI implementation for actual execution. As expected, this has spawned a wide collection of tools that are completely portable, since the profiling interface is part of the standard rather than the tool itself.
During the 1993-1994 meetings of the MPI Forum, several issues were postponed in order to reach early agreement on a core of message-passing functionality, which nonetheless included several innovative concepts, such as communicators, datatypes, and topologies. The Forum reconvened during 1995-1997 to extend MPI to include remote memory operations, parallel I/O, and dynamic process management, along with a number of features designed to increase the convenience and robustness of MPI. This effort resulted in the MPI-2 standard, released in 1997. MPI-2 had three major new feature sets: an extensive interface to efficiently support parallel file I/O to and from MPI programs; support for one-sided (put/get) communication; and dynamic process management, namely, the ability to create additional processes from a running MPI program and the ability for separately started MPI applications to connect to each other and communicate. MPI-2 also introduced other features, such as precisely defined semantics for multithreaded communication that in some way foreshadowed the multiple modes of OpenMP parallelism, bindings for Fortran-90 and C++, and detailed support for mixed language programming (how to send a message from Fortran and have it received in C, for example).
While the MPI-2 standard was finished in 1997, it took a few years for full implementations to appear. In contrast to the MPI-1 effort, there was no hand-in-hand prototype developed for most of the additions of MPI-2, and in retrospect, some of the useful feedback on the standardization process from a co-developed prototype was missing. Nevertheless, over the next decade and a half, MPI filled the needs of most computational science codes that required a high-performance, scalable, portable programming system. The Forum itself disbanded.
The timing of MPI seems to have been about right. Trying to establish such a standard earlier might have failed to benefit from research into multiple approaches. Indeed, some feared that adoption of a standard would shut down research into the message-passing model. In fact, the opposite happened. Having a fairly complete, performance-enabling, portable interface target stimulated a wealth of research into implementation approaches, tool development, and application algorithms. Much of the research appeared in the Proceedings of the Euro-* conferences, underlining the international nature of MPI-based research. These workshops started as PVM (Parallel Virtual Machine) user group meetings, became EuroPVM workshops from 1994 to 1996, EuroPVM/MPI from 2007 to 2009, and EuroMPI from 2010 to 2017. It is telling and amusing that “Euro”MPI 2017 will be held in Chicago this year.
Over the next fifteen years or so, the MPI Forum itself was inactive, the published standard remained unchanged, and MPI was a stable interface for users and implementers alike. Vendors used the open-source prototype implementations (MPICH, and later OpenMPI), layered to allow optimizations at multiple levels, to evolve their proprietary implementations over time in order to gradually take advantage of their own evolving specialized hardware.
This was no mean feat. As Bill Gropp, Acting Director and Chief scientist at the National Center for Supercomputing Applications, says, “One of the hardest things about an MPI implementation is keeping the implementation focused on the future. This requires finding a balance between making engineering decisions based on today’s hardware and designing and implementing for likely directions in the future.” Many message-passing applications, written in customized ways to deal with the portability problem, switched to making direct MPI calls, improving efficiency and maintainability. And library development was unleashed, fulfilling one of MPI’s original goals. Barry Smith, Senior Computer Scientist at Argonne National Laboratory and primary developer of the PETSc library, explains MPI’s contribution to library development as follows: “MPI changed everything, by providing an extensive API for message passing and collectives that allowed portable distributed memory scientific libraries to no longer need to be programmed to the lowest common denominator of message passing systems. Equally important, MPI eliminated the problem of ‘tag collision’ where each library might utilize the same tags for messages, resulting in messages sent from one library being (improperly) received and processed by a different library or the application code. The MPI communicator concept made distributed parallel scientific libraries practical in two ways, it eliminated the tag collision problem and (by the use of subcommunicators) allowed applications to simply utilize scientific libraries to perform needed computations on subsets of processes, for example with ‘divide and conquer’ algorithms.”
For more than a decade after the Forum disbanded in 1997, the MPI specification remained stable, providing a period during which MPI could “sink in” while implementations steadily improved, parallel libraries flourished, and applications, now portable, took advantage of multiple new tera- and petascale machines, challenging those implementations and libraries to become ever more scalable. However, HPC moves fast, and after a dozen years multiple trends had gradually increased community pressure to restart the MPI process, whose inclusiveness and openness had served the community so well in the past.
For one thing, the scale of massively parallel systems had reached more than a million cores. Single-core processors had disappeared, nodes had become symmetric multiprocessors, and defining how a distributed-memory model like MPI’s would interact with threads (specifically, the emerging OpenMP standard) and shared memory became more critical. Remote memory access (put/get) support in networks became mainstream, raising the applicability of efficient remote memory access (RMA) as a programming model. Although MPI-2’s RMA was used by some applications, it had failed to live up to expectations and needed an overhaul. C and Fortran had both evolved, requiring updates to the MPI interfaces. Nonblocking collective operations had been proposed, and some experience with them obtained. At the time of MPI-2, nonblocking collectives had been considered but deliberately left out of the standard because of the expectation that they could be implemented on top of MPI by issuing blocking operations in separate threads. However, threads turned out to be more difficult to use efficiently, and support for threads was uneven. The increase in scale had brought fault tolerance issues to the fore. And finally, a list of (mostly) minor errata had accumulated.
In response to all this, the MPI Forum reconstituted itself in 2008, at first tidying up MPI-2 and eventually releasing the initial version of MPI-3 in September 2012. Major new features of MPI-3 include the nonblocking collective operations, together with “neighborhood” collectives, useful for stencil computations and relying on the topology functions from MPI-1. (The concept of a nonblocking barrier was considered a joke during the MPI-1 meetings; now MPI has one!) There is an improved one-sided communication interface as well as a tools interface that goes beyond MPI-1’s profiling interface to dynamically access the behavior of an MPI implementation. The Fortran bindings have been updated to take advantage of the Fortran 2008 standard, which was a major step forward in making Fortran work well with libraries in a parallel environment. C bindings were modernized to catch more errors at compile time. Other new features improved interactions with threads and shared memory.
Some topics that the MPI-3 Forum grappled with have not (yet) become part of MPI, such as fault tolerance and more complex support for multithreaded programming, because the Forum decided that current proposals were not quite ready for standardization. The Forum continues to work on these and other issues. Martin Schulz, Computer Scientist at Lawrence Livermore National Laboratory and current chairperson of the MPI-3 Forum, says, “As MPI has established itself as the dominant standard in HPC, it has been exciting and rewarding to see that the members of the MPI forum have not been resting on their laurels. Instead, the Forum continues to drive innovation balanced with the pragmatism necessary for a standards document as we race towards exascale as well as to embrace new commercial application fields and their different requirements.”
Many of the participants in this decades-long effort will speak at the “25 Years of MPI” symposium during the EuroMPI Workshop to be held at Argonne National Laboratory near Chicago on September 25-27, 2017.
About the Authors
Ewing “Rusty” Lusk is Argonne Distinguished Fellow Emeritus at Argonne National Laboratory.
Prof. Jesper Larsson Träff is on the Faculty of Informatics at the Vienna University of Technology.