by Martin Meuer, Prometeus GmbH
Jack Dongarra, director of the Center for Information Technology Research (CITR) and of the Innovative Computing Laboratory (ICL) at the University of Tennessee, has been a legend in the supercomputing field for twenty years: LINPACK, MPI, and the TOP500 are all closely associated with his name.
Jack Dongarra will give the keynote presentation on Friday, June 21, 2002 at the 17th International Supercomputer Conference in Heidelberg, Germany. The title of his talk is: “High Performance Computing, Computational Grid, and Numerical Libraries”.
This interview for HPCwire was conducted by Martin Meuer, Prometeus GmbH, Germany.
HPCwire: Jack, you are among the most renowned benchmark and HPC experts in the world. How long have you been personally involved in benchmarking and in the field of HPC?
DONGARRA: The original LINPACK Benchmark is, in some sense, an accident. So I guess that makes me an “Accidental Benchmarker”. It was originally designed to assist users of the LINPACK numerical software package by providing information on execution times required to solve a system of linear equations. The first “LINPACK Benchmark” report appeared as an appendix in the LINPACK Users’ Guide in 1979. The appendix of the Users’ Guide collected performance data for one commonly used path in the LINPACK software package. Results were provided for a matrix problem of size 100, on a collection of widely used computers (23 computers in all). This was done so users could estimate the time required to solve their matrix problem by extrapolation.
Over the years additional performance data was added, more as a hobby than anything else, and today the collection includes around 1500 different computer systems. In addition to the number of computers increasing, the scope of the benchmark has also expanded. The benchmark report describes the performance for solving a general dense matrix problem Ax=b at three levels of problem size and optimization opportunity: 100 by 100 problem (inner loop optimization), 1000 by 1000 problem (three loop optimization – the whole program), and a scalable parallel problem.
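The inner-loop measurement Dongarra describes can be illustrated with a short sketch. This is not the official benchmark code (which is Fortran); it simply times a dense solve, as with the original 100-by-100 case, and converts the time to a rate using the conventional LINPACK operation count:

```python
import time
import numpy as np

def linpack_rate(n, trials=5):
    """Time the solution of a dense n-by-n system Ax = b and report the
    rate in MFLOP/s, charging the nominal 2n^3/3 + 2n^2 operations."""
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal(n)
    best = float("inf")
    for _ in range(trials):
        t0 = time.perf_counter()
        np.linalg.solve(a, b)  # LU factorization with partial pivoting
        best = min(best, time.perf_counter() - t0)
    flops = 2 * n**3 / 3 + 2 * n**2
    return flops / best / 1e6  # MFLOP/s

print(f"100x100 solve: {linpack_rate(100):.0f} MFLOP/s")
```

The absolute number depends entirely on the machine; the point is the methodology — fixed problem, best of several timings, a standard operation count.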
HPCwire: The Linpack benchmark has been used up until now to determine the TOP500-lists, of which you are a co-publisher. What are the strengths and weaknesses of Linpack for the evaluation of supercomputers?
DONGARRA: In order to fully exploit the increasing computational power of highly parallel computers, the application software must be scalable, that is, able to take advantage of larger machine configurations to solve larger problems with the same efficiency. The LINPACK benchmark addressed scalability with the introduction of a new category in 1991. This new category is referred to as the Highly-Parallel LINPACK (HPL) NxN benchmark. It requires the solution of a system of linear equations by some method. The problem size is allowed to vary, and the best floating-point execution rate should be reported. In computing the execution rate, the number of operations is taken to be 2n³/3 + 2n², independent of the actual method used. If Gaussian elimination is chosen, partial pivoting must be used. The accuracy of the solution is measured and reported. The following quantities from the benchmark are reported in the TOP500 list:
- Rmax is the performance in GF/s for the largest problem run on a computer,
- Nmax is the size of the largest problem run on a computer,
- N1/2 is the problem size at which half the Rmax execution rate is achieved,
- Rpeak is the theoretical peak performance in GF/s for the computer.
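The fixed operation count is what makes Rmax comparable across systems: whatever algorithm is actually used, the rate is computed as if 2n³/3 + 2n² operations were performed. A minimal sketch (the run below uses made-up numbers, purely for illustration):

```python
def hpl_rate_gflops(n, seconds):
    """HPL execution rate in GF/s: the operation count is fixed at
    2n^3/3 + 2n^2 regardless of the algorithm actually used."""
    ops = 2 * n**3 / 3 + 2 * n**2
    return ops / seconds / 1e9

# Hypothetical run: a system of 50,000 unknowns solved in 100 seconds.
rmax = hpl_rate_gflops(50_000, 100.0)
print(f"Rmax = {rmax:.1f} GF/s")  # ≈ 833.4 GF/s
```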
As such this benchmark reports the performance of this one application with its floating point operations and message passing. So there is one application and one number that represents the performance for the computer system. The weakness is that this is just one application and one number.
HPCwire: How would you rate the “IDC balanced rating HPC benchmark”, which considers other parameters such as memory bandwidth/latency and the scalability of the system? Is it a more realistic approach than Linpack, and therefore capable of replacing Linpack for the evaluation of supercomputers?
DONGARRA: The IDC benchmark looks, on the surface, to be a good starting point for a follow-on benchmark. Unfortunately, because of the way the rating procedure is implemented, there are problems; others have gone into detail on them.
When asked “How informative is the IDC Balanced Rating HPC Benchmark?” in a recent article in Primeur Weekly of February 13, 2002, Aad van der Steen replied: “It is not. It looks like all the time and effort invested in the HPC Forum has yielded a sub-standard product that should be radically improved or withdrawn.” At present it only adds to the confusion that already exists on the HPC benchmark scene.
HPCwire: Are the TOP500 authors searching for an alternative to Linpack as a yardstick for such evaluations? Or will we have to remain satisfied with Linpack for the PF/s systems expected within the next 10 years?
DONGARRA: Yes, the organizers of the TOP500 are actively looking to expand the scope of the benchmark reporting. It is important to include more performance characteristics and signatures for a given system. There are a number of alternatives to look at in expanding the effort, such as the STREAMS benchmark, the EuroBen-DM benchmark, the NAS Parallel Benchmarks, and PARKBENCH. Each of these says more about the scalability of systems and provides more insight than the IDC benchmark will.
HPCwire: In the TOP500 list, which will be published in June during ISC2002 in Heidelberg, the Japanese Earth Simulator (ES) will replace the American ASCI White as the new #1, with a best Linpack performance of 35.61 TF/s. How large was the dense system of linear equations that ES solved for the new Linpack world record, and how long did it take ES to solve it?
DONGARRA: The system of linear equations was of size 1,041,216; (8.7 TB of memory). This is the largest dense system of linear equations I have seen solved on a computer. The benchmark took 5.8 hours to run. The results of the computation were checked and were accurate to floating point arithmetic specifications. The algorithm used to solve the system was a standard LU decomposition with partial pivoting and the software environment, for the most part, was FORTRAN using MPI with special coding for the computational kernels.
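Those figures are internally consistent, as a quick back-of-the-envelope check confirms (using the conventional 2n³/3 + 2n² operation count and 8-byte reals):

```python
n = 1_041_216                       # order of the dense system

mem_tb = 8 * n * n / 1e12           # 8-byte (64-bit) matrix entries
ops = 2 * n**3 / 3 + 2 * n**2       # nominal LINPACK operation count
hours = ops / (35.61e12 * 3600)     # at the reported 35.61 TF/s

print(f"matrix storage: {mem_tb:.2f} TB")    # ≈ 8.67 TB
print(f"run time:       {hours:.1f} hours")  # ≈ 5.9 hours
```

Both derived values match the memory footprint and run time quoted above to within rounding.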
HPCwire: In the New York Times you referred to ES as “Computenik”, alluding to the Russian Sputnik of 1957, which certainly spurred and accelerated the US space program. Was the rapid completion of ES really such a great surprise in the USA? As far as I know, the Japanese ES project, including its time schedule, was known worldwide from the beginning. Or was the scheduled completion of ES, with an efficiency of 87%, considered too optimistic?
DONGARRA: I think the time schedule was known, but the fact that it was completed on time and is up and running is impressive. The 87% efficiency achieved on the benchmark is also very impressive. Its benchmark performance is about five times that of the ASCI White computer, the largest difference we have seen in the past 10 years.
Here are some additional impressive statistics of the Earth Simulator:
- approximately a fourth of the combined performance of all the TOP500 computers on the November 2001 list,
- greater than the performance of all the DOE computers together,
- greater than the sum of the top 20 computers in the US.
HPCwire: Do you think that ES will accelerate the ASCI-Program?
DONGARRA: People in the US are taking notice of the Japanese accomplishments in the Earth Simulator Computer and it is the hope of many of us that the US government will continue to invest in the future of high performance computing.
HPCwire: What does the ES mean for “vector parallel multiprocessing architectures”; will they gain terrain? Will not only NEC, but also Cray Inc. with the SV2, profit from ES?
DONGARRA: I’m not sure about the Cray SV2, but the NEC SX architecture is very impressive as a highly parallel system, as configured in the Earth Simulator. I would guess there are a number of customers around the world who are very interested in acquiring such a system, composed of the SX nodes and the Earth Simulator’s switch fabric.
HPCwire: Grid computing is currently of major interest. At ISC2002 in Heidelberg, for example, a full-day tutorial and an entire conference session will be dedicated to the topic, and you will deliver the Friday keynote address on “High Performance Computing, Computational Grids, and Numerical Libraries”.
What are your current projects concerning grid computing?
DONGARRA: At the University of Tennessee we are working on three Grid related projects, NetSolve, GrADS, and Harness.
Since 1995 we have been working on an approach to grid computing called NetSolve. NetSolve provides easy access to computational resources that are distributed with respect to both geography and ownership. Using NetSolve, a user can access both hardware and software computational resources distributed across a network.
In addition, we are working with a group of colleagues on an NSF-funded program execution framework being developed by the Grid Application Development Software (GrADS) Project. The goal of this framework is to provide good resource allocation for Grid applications and to support adaptive reallocation if performance degrades because of changes in the availability of Grid resources.
HARNESS (Heterogeneous Adaptable Reconfigurable Networked SystemS) is an experimental metacomputing framework built around the services of a highly customizable and reconfigurable distributed virtual machine (DVM). A DVM is a tightly coupled computation and resource grid that provides a flexible environment to manage and coordinate parallel application execution. This is a collaboration of researchers at the University of Tennessee, Emory University, and Oak Ridge National Laboratory and funded by the Department of Energy.
HPCwire: How would you predict grid computing to affect the landscape of HPC in the future?
DONGARRA: The Grid will make it possible to implement dramatically new classes of applications. These applications, ranging from new systems for scientific inquiry, through computing support for crisis management, to support for personal lifestyle management, are characterized by three dominant themes: computing resources are no longer localized, but distributed, and hence heterogeneous and dynamic; computation is increasingly sophisticated and multidisciplinary; and computation is integrated into our daily lives, and hence subject to stricter time constraints than at present.
A good resource on activity in the area is “The Grid: Blueprint for a New Computing Infrastructure”, edited by Globus originators Ian Foster and Carl Kesselman (Foster and Kesselman, 1998).
HPCwire: There are many non-commercial projects in the area of grid computing, for example SETI@home. When can we expect the first commercial applications?
DONGARRA: There are examples today. Companies like United Devices, Entropia, Avaki, and Parabon Computation are trying to establish viable commercial operations using the concepts and ideas that came from the SETI@home experience.
Web site: http://www.supercomp.de