The TOP500 has ranked systems in a consistent fashion for two decades, giving the high-performance community a way to compare systems and to establish targets for vendors to deliver increased capabilities to the most challenging applications.
Over the past 20 years, the TOP500 has proven to be a useful and popular benchmark. To a degree, it is a corner point in performance focused on dense linear algebra (compute-intensive floating point), which is highly correlated to many applications in computational science and engineering.
In recent years, new data-intensive problems have come to light that stress the memory subsystem for irregular accesses to data. Complementary benchmarks are emerging, such as the Graph 500, which evaluates a machine’s suitability for running data-intensive analytics applications, and the Green500, which provides a ranking of the most energy-efficient supercomputers in the world.
With the upcoming release of the most current rankings, the TOP500 is usually a hot topic of discussion this time of year. I recently caught up with Professor Hans Meuer, considered by many to be the driving force behind the project, to learn more about his thoughts on the TOP500: its past, present, and future.
Tom Tabor: Hans, how timely is this topic at ISC’12 this year?
Hans Meuer: The 39th list will be published on Monday, June 18, during the opening session. That leaves just one more list to compile this year – the November list, which will be released at SC12 – to complete 20 years since the founding of the TOP500. As we count down to the 20th anniversary celebration, Erich (Strohmaier), Jack (Dongarra), Horst (Simon), and I will be guests on HPCwire’s Soundbite live from Hamburg, and our ISC Think Tank Series topic will be “The TOP500 – Twenty Years Later.” At SC’12, we’ll also host a TOP500 history booth to demonstrate the 20 years of development of the project. So this will be a very exciting year for us.
Tabor: Take us back to the beginning. How did you, Jack, Horst and Erich meet and did you meet with the intent of starting a ranking?
Meuer: We have all known each other for a long time. Erich joined my staff at Mannheim University in 1990, and has thus been involved in the TOP500 project from the very beginning. I invited Jack to talk at the second Mannheim Supercomputer Seminar in 1987, and Horst has attended our HPC conferences regularly since 1990. Ironically, we didn’t hold any special meeting when we launched the project in the spring of 1993. Currently, we meet each year at ISC, and in the U.S., at the SC Conference, to discuss the project.
Tabor: How did the idea for the TOP500 germinate?
Meuer: The Mannheim Supercomputer Statistics merely contained the names of the manufacturers and thus became superfluous right at the beginning of the 90s. New statistics were essential, ones that reflected the diversification of supercomputers, the enormous performance difference between low-end and high-end models, the increasing availability of massively parallel processing (MPP) systems, and the strong increase in computing power of the high-end models of workstation suppliers (SMP).
To provide for this new statistical foundation, in 1993, Erich and I began to assemble and maintain a list of the 500 most powerful computer systems. We also decided right at the beginning to use the best LINPACK performance, Rmax, to rank the systems in our list. The first list was compiled in June of that year. Since then, with the help of HPC sites and manufacturers, it has been compiled twice a year.
Erich and I are the TOP500 founding authors, Jack is the father of LINPACK and came aboard in 1993, and Horst embarked on the journey in 2000.
Tabor: Whose idea was it to call it the TOP500?
Meuer: It was my idea and the underlying reasons are two-fold. The first is that when we completed the Mannheim Supercomputing Statistics project, we were left with 530 systems and I considered it logical to begin where we had stopped. The other reason is sentimental. The Forbes 500 list, which points to the world’s richest and most successful people and corporations, has always fascinated me. So, here we are… focusing on the world’s 500 most powerful systems!
Tabor: Did you ever envision the list becoming so mainstream?
Tabor: What was your first instance of the notoriety of the list?
Meuer: Sometime late in the 90s, during one of the sessions at the SC conference, a speaker referred to “the list” in his presentation as a matter of course and not the TOP500 list.
Tabor: On the first list, who was number one and what was the system’s peak performance?
Meuer: This was the Thinking Machines CM-5/1024 at the Los Alamos National Lab, with a best LINPACK performance of 59.7 gigaflops and a peak performance of 131 gigaflops. By the way, the TOP500 app, which is available for free download at the Apple Store, contains information on all the past lists.
Tabor: What do you believe are the most important aspects of the TOP500 that have led it to be a widely referenced benchmark?
Meuer: We have been criticized for choosing LINPACK from the very beginning, but now in the 20th year, I believe that it was this particular choice that has contributed to the success of TOP500. Back then and also now, there simply isn’t an appropriate alternative to LINPACK. Any other benchmark would appear similarly specific, but would not be so readily available for all systems in question. One of LINPACK’s advantages is its scalability, in the sense that it has allowed us for the past 19 years to benchmark systems that cover a performance range of more than 11 orders of magnitude. Another significant advantage is that we can foster competition between manufacturers, countries and sites.
The TOP500 list’s success lies in the compilation and analysis of data over time. We have been able to correctly identify and track nearly all HPC developments over 19 years, covering manufacturers and users of HPC systems, architectures, interconnects, processors, operating systems and more. Above all else, the TOP500’s strength is that it has proved to be an exceptionally reliable tool for forecasting developments in performance.
Tabor: If there were no precedent to follow, would you propose ranking supercomputers on the basis of LINPACK measurements today?
Meuer: Yes, because LINPACK will remain a useful, valid and substantive benchmark in the years to come. And there is currently no alternative to replace it.
Tabor: What do you like and dislike with the LINPACK benchmark?
Meuer: The pros of LINPACK as a yardstick of performance are as follows: it gives one figure of merit, it is simple to define and rank, it allows the problem size to change with the machine and over time, and it allows for competition. The cons are that it emphasizes only “peak” CPU speed and number of CPUs. It does not stress local bandwidth, the memory system or the network, and no single figure of merit can reflect the overall performance of an HPC system. To solely rely on LINPACK today and in the years to come is definitely not enough. Additionally, we need other benchmarks to keep track of new HPC systems.
Tabor: Can you please discuss in a bit more detail the current alternative benchmarks?
Meuer: For the purpose of discussion, let’s focus on three alternative benchmarks.
The HPC Challenge Benchmark (HPC CB) from Jack Dongarra basically consists of seven different benchmarks, each stressing a different part of a system. Of course, High Performance LINPACK (HPL) is represented and stands for the CPU. Ultimately, however, we don’t have a single figure of merit, but seven numbers represented in a much more complex way by the so-called Kiviat Graphs.
For some people, this is too complex to understand, especially for journalists reporting on new systems entering the HPC arena. For system specialists, the results can be well interpreted and for that reason the HPC CB has reached a certain standard for selecting an HPC system for an institution.
The Green500 List, overseen by Wu-chun Feng and Kirk W. Cameron of Virginia Tech, is another complementary approach to ranking supercomputers. The inaugural Green500 list was announced at SC08 as a complement to the TOP500, to provide a ranking of the most energy-efficient supercomputers in the world, so that supercomputers can now be compared by performance-per-watt. At SC11, the latest Green500 list was published with 500 entries. The number one system in the TOP500, Fujitsu’s K computer, reached a remarkable position of number 32 on the green list, although it has the largest power consumption, with more than 12.5 MW, as observed in the TOP500 list.
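As a rough illustration of the performance-per-watt idea, the sketch below ranks a few systems first by LINPACK speed and then by flops per watt. The system names and figures are invented for the example and are not taken from any TOP500 or Green500 list; the function name `mflops_per_watt` is likewise an assumption.

```python
def mflops_per_watt(rmax_gflops: float, power_kw: float) -> float:
    """Energy efficiency in megaflops per watt.

    rmax_gflops: sustained LINPACK performance in gigaflops.
    power_kw:    power draw in kilowatts.
    """
    if power_kw <= 0:
        raise ValueError("power must be positive")
    # 1 gigaflops = 1000 megaflops and 1 kW = 1000 W, so the factors cancel.
    return rmax_gflops / power_kw


# Hypothetical systems: (name, Rmax in gigaflops, power in kilowatts).
systems = [
    ("big-and-hungry", 8_000_000.0, 10_000.0),
    ("small-and-lean", 100_000.0, 50.0),
    ("mid-range", 1_000_000.0, 2_000.0),
]

# Ranking by raw speed versus ranking by energy efficiency.
by_speed = sorted(systems, key=lambda s: s[1], reverse=True)
by_efficiency = sorted(
    systems, key=lambda s: mflops_per_watt(s[1], s[2]), reverse=True
)
```

With these made-up numbers the fastest system is not the most efficient one, which is the kind of difference a performance-per-watt ranking is meant to expose.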
The Graph 500, led by Richard C. Murphy from Sandia National Laboratories, is a highly important project that addresses the dominating data-intensive supercomputer applications. As current benchmarks don’t provide useful information on the suitability of supercomputing systems for data-intensive applications, a new set of benchmarks is needed to guide the design of hardware/software systems intended to support such “big data” applications. While the TOP500 addresses number crunching, the Graph 500 addresses data crunching applications. Graph algorithms are a core part of many analytics workloads. Backed by a steering committee of 50 international experts, Graph 500 will establish a set of large-scale benchmarks for these applications.
The Graph 500 project includes three major application kernels: concurrent search, optimization (single source shortest path), and edge-oriented (maximal independent set). It addresses five graph-related business areas: cyber security, medical informatics, data enrichment, social networks, and symbolic networks. The Graph 500 was announced at ISC’10, and the first list appeared at SC’10 (9 systems ranked). Further results have been published at ISC’11 (29 systems) and SC’11 (49 systems), with the next list slated for release at ISC’12.
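To make the search kernel a little more concrete, here is a minimal single-threaded breadth-first search over a small undirected edge list. It is a toy sketch, not the Graph 500 reference code: the graph, the function name `bfs_parents`, and the choice of a parent array as output are assumptions made for illustration.

```python
from collections import deque


def bfs_parents(num_vertices, edges, root):
    """Breadth-first search; returns parent[v] for each vertex (-1 if unreached)."""
    # Build adjacency lists for an undirected graph.
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    parent = [-1] * num_vertices
    parent[root] = root  # the root is its own parent
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Each lookup lands wherever the neighbor id points, so memory
            # accesses follow the graph structure rather than a regular stride.
            if parent[v] == -1:
                parent[v] = u
                queue.append(v)
    return parent


# Hypothetical graph: one 4-vertex component plus a separate edge (4, 5).
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5)]
parents = bfs_parents(6, edges, root=0)
```

Almost no arithmetic happens here; the work is dominated by following edges to scattered memory locations, which is why this kind of kernel stresses a machine differently than dense linear algebra does.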
Tabor: Hans, in your opinion, how much of the reason we use the TOP500 is due to the legacy and how much is because it provides good guidance on how fast a computer really is?
Meuer: I have to admit that the TOP500, with LINPACK, is not the best tool for ranking supercomputers, but it’s the only one available. The TOP500, with LINPACK, doesn’t tell you how fast a computer is on useful applications. The TOP500 ranks computers only by their ability to solve a set of linear equations, Ax=b, using a dense random matrix A and nothing else. The misinterpretation of the TOP500 results has surely led to a negative attitude towards LINPACK. Politicians, for example, consider a system’s TOP500 rank as a general rank that is valid for all applications, which of course is not true.
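For readers who want to see what "solving Ax=b with a dense random matrix" amounts to, the following is a small pure-Python sketch under stated assumptions. It is not the HPL code used for TOP500 submissions: it uses plain Gaussian elimination with partial pivoting on one core, the function names `solve_dense` and `run_benchmark` are hypothetical, and the flop count of 2/3·n³ + 2·n² is the conventional operation count for this kind of solve rather than a count of what the interpreter actually executes.

```python
import random
import time


def solve_dense(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.

    Works on a copy, so the caller's a and b are left unchanged.
    """
    n = len(a)
    # Augmented matrix [A | b].
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        # Partial pivoting: bring the largest |entry| of column k to row k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        # Eliminate column k from all rows below the pivot.
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            if f != 0.0:
                for j in range(k, n + 1):
                    m[i][j] -= f * m[k][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def run_benchmark(n, seed=0):
    """Time one dense solve; return (flop rate, max residual |Ax - b|)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(n)]

    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = max(time.perf_counter() - start, 1e-12)  # avoid dividing by zero

    # Conventional operation count for a dense solve of order n (assumption).
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return flops / elapsed, residual


rate, residual = run_benchmark(100)
print(f"{rate / 1e6:.2f} Mflop/s, max residual {residual:.2e}")
```

The single rate this prints is analogous in spirit to the one figure of merit the list reports, and it says nothing about how the same machine would handle other workloads.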
Tabor: Do you think the TOP500 should consider replacing its ranking of systems by flops with flops-per-joule?
Tabor: What are your thoughts about expanding the TOP500 to include the price paid for the supercomputer so that one can easily see the price-performance trends?
Meuer: That is a good question. We had thought about this right at the very beginning, but decided not to include any prices. What is the price of a supercomputer? Is it the list price? Is it the negotiated price? That’s a highly vague area, and we were afraid to waste our time with a fly-by-night approach.
Tabor: Do you envision the TOP500 also ranking the performance of cloud computers?
Meuer: We haven’t thought about this yet. When we gain a deeper understanding of cloud computers, we might consider this.
Tabor: Is there any intention to compile all the lists in a book?
Meuer: Yes, we’ve been discussing this since the 15th year of TOP500. We are all more or less very busy, but now that you have reminded me, I’ll start pushing for a discussion in conjunction with our 20th anniversary.
Tabor: Finally Hans, do you believe the TOP500 will still provide a useful measure for ranking systems another 20 years from now?
Meuer: Yes, but I can’t tell you what yardstick we’ll be using 20 years from now.
Tabor: Hans, thank you for taking the time to share this important bit of HPC history with us.
Meuer: With great pleasure, Tom… see you in Hamburg.
About the Author
Tom Tabor is CEO and Founder of Tabor Communications, Inc. (TCI), a leading international media, advertising, and communications organization. An industry pioneer, Tom has over 30 years of experience in business-to-business publishing, with the last 24+ years focused primarily on high performance and data-intensive computing technologies.