May 26, 2006

Sending out an SOS: HPCC Rescue Coming

By Christopher Lazou


“I don't know where we are going, but we'll get there quicker if we get started.” — David Bernholdt, ORNL (SOS9, March 2005).

SOS is the recognized international distress call for help, and it was quite apt for “capability” computing in the 1990s, especially in the U.S. Come the new century, and thanks to some help from new R&D funds for high productivity systems, IBM, Sun Microsystems Inc. and Cray Inc. are working hard to offer a rescue pathway.

The SOS Forum series was founded in 1997 on the initiative of people interested in High Performance Cluster Computing (HPCC) at Sandia National Laboratories and Oak Ridge National Laboratory, as well as EPFL in Switzerland. (EPFL is the Swiss cradle for the successful design and implementation of Beowulf systems.) SOS stands for “Sandia, Oak Ridge, Switzerland.” In 1997, the major centers were starting to explore the capacity of communication systems for building their own HPC clusters. (Note: At this time, Quadrics and Myrinet did not have commercial products.)

The SOS Forums take place annually in the spring and are open to anyone interested in discussing new and visionary ideas on HPCC, but the number of participants is deliberately kept low (not more than 50). The ninth SOS workshop took place in Davos, Switzerland, last March. For further details, visit the SOS website.

The thrust of the SOS Forum is to foster multi-laboratory, multi-national collaboration to explore the use of new parallel supercomputer architectures, such as clusters with commodity-based components, heterogeneous and web supercomputing, etc.; it is not focused on any particular system.

The theme of the ninth SOS Forum was Science and Supercomputers. The received wisdom is that “Today science is enabled by supercomputing, but tomorrow science breakthroughs will be driven by supercomputers.” The workshop explored what is needed to prepare for an age when manipulating huge data sets and simulating complex physical phenomena are routine tools for predicting and explaining new scientific phenomena.

The questions addressed at SOS9 were:

  • What are the computational characteristics needed to facilitate this transition?
  • How can the existing and emerging supercomputer architectures be directed to help science?
  • Is there a need for new facility models that cater to large science or is the traditional supercomputer center with thousands of users sufficient for the future?
  • What software and programming models are being explored to make it easier for scientists to utilize the full potential of supercomputers?

The SOS9 Forum was a tour de force of personalities from the U.S. and Europe discussing world-class activities at their sites and furnishing some insights on how future HPC products can effectively serve their scientific communities and the needs of science at the national level. These sites have heterogeneous environments and use systems from several major vendors.

Sites such as CSCS in Switzerland, the HPCx facility in the UK and ORNL in Oak Ridge are on a development path that will define capability scientific computing for at least the next decade. The trend is toward partnerships between centers and computer vendors, as well as collaborations with centers of excellence across national boundaries. A good example is the partnership between Sandia and Cray to develop the $90 million Red Storm system. Bill Camp's motto is “Use high-volume commodity components almost everywhere, but when necessary for scalability, performance and reliability use custom development.”

The engineering task was to deliver 40 Tflops peak performance using 10,000 AMD Opteron chips and a specially designed high bandwidth, low latency interconnect. Red Storm is already a great success; it has since been turned into a Cray product and is marketed as the Cray XT3.
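The headline figure follows from simple arithmetic. Assuming 2 GHz Opterons retiring two floating-point operations per cycle (my assumption for illustration; the article states neither clock rate nor flops per cycle), a quick sanity check:

```python
# Back-of-envelope peak-performance check for a Red Storm-class machine.
# Assumed, not from the article: 2.0 GHz Opterons, 2 flops per cycle.
chips = 10_000
clock_hz = 2.0e9
flops_per_cycle = 2

peak_flops = chips * clock_hz * flops_per_cycle
print(f"Peak: {peak_flops / 1e12:.0f} Tflops")  # prints "Peak: 40 Tflops"
```

Under those assumptions the 10,000 chips land exactly on the quoted 40 Tflops peak.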

According to Bill Camp, “Red Storm is achieving its promise of being a highly-balanced and scalable HPC platform with a favorable cost of ownership. It is setting new high water marks in running key national security and science applications at Sandia and elsewhere.”

In March, CSCS, the national Swiss leadership computer center, bought a large Cray XT3 system, as the first phase of its procurement cycle. CSCS has laid plans to team with leading U.S.-based supercomputing sites, the Pittsburgh Supercomputing Center, Oak Ridge National Laboratory and Sandia National Laboratories, to fine-tune the software environment and make the Cray XT3 technology mature for a broad spectrum of scientific production work.

According to Dr. Marie-Christine Sawley, CSCS CEO, “The Cray XT3 was bought as a highly scalable capability system for very demanding, high-end computational scientific and engineering research applications. The system is designed to support a broad range of applications and positions CSCS as a leadership-class computing resource supplier for the research community of Switzerland. It also positions it to attract highly visible, value-added international collaborations.”

Sawley explained in her SOS presentation that CSCS's systems prior to the Cray XT3 include an IBM SP4 system, over 60 percent of which is used for chemistry codes, and an NEC SX-5 high memory bandwidth vector system, 44 percent of which is used for meteorology/climate applications. CSCS still offers services on both the SX-5 and the SP4 systems; the XT3 represents an extension of its computing capacities toward true MPP. The phase-two procurement this autumn is looking at providing suitably upgraded computing resources for the SX-5 user community, which requires high memory bandwidth capability computing.

CSCS is working to establish collaborations, including a visitor program, with centers that have Cray XT3 systems, covering application porting, system tuning and tools. CSCS is offering applications in chemistry, molecular dynamics, environment, materials science and physics, drawn from its core competences and customer portfolio. Tools, such as performance monitoring, debuggers and visualization, are also part of the focus of interest.

The keynote by Dr. Thomas Zacharia, Associate Lab Director for Computing and Computational Sciences at ORNL, titled “A new way to do science: Leadership Class Computing at ORNL facilities,” typifies what these centers are likely to develop into. ORNL was awarded funding by the DoE to address the opportunities and challenges of Leadership computing. This involves in part developing and evaluating emerging — but unproven — experimental computer systems. Their brief is to focus on Grand Challenge scientific applications and on computing infrastructure driven by applications. The goal of Leadership systems is to deliver computational capability at least 100 times greater than what is currently available. It is acknowledged by funding bodies that Leadership systems are expensive, typically costing about $100 million a year.

It is now recognized that a focused effort is critical to harness the experimental potential of computing and translate it into breakthroughs in science. The infrastructure needed consists of capability platforms with ultra-scale hardware, software and libraries to exploit them efficiently, teams of hardware and software engineers and, most importantly, funding for seamless access by research teams of scientists investigating Grand Challenge problems.

With DoE funding, ORNL recently set up the National Leadership Computing Facility. In the computing platform area, NLCF is concentrating on developing and proving several Cray purpose-built architectures, optimized for specific classes of applications.

NLCF has recently installed a 1,024-processor Cray X1E with an aggregate peak performance of 18.5 Tflops — the largest Cray vector system in the world. The Cray X1E has a proven vector architecture for high performance and reliability, very powerful processors and a very fast interconnection subsystem. It is scalable, has globally addressable memory with high bandwidth and offers capability computing for key applications. This system has been allocated to five high-priority Office of Science applications as follows:

  • 3D studies of stationary accretion shock instabilities in core collapse supernovae (415,000 processor hours).
  • Turbulent premix combustion in thin reaction zones (360,000 processor hours).
  • Full configuration interaction benchmarks for open shell systems (220,000 processor hours).
  • Computational design of the low-loss accelerating cavity for the ILC (200,000 processor hours).
  • Advanced simulations of plasma micro-turbulence (50,000 processor hours).

Another platform just installed is a 5,212-processor AMD Opteron-based Cray XT3 system with an aggregate peak performance of 25.1 Tflops. It has an extremely low latency, high bandwidth interconnect, efficient scalar processors and a balanced interconnect between processors, providing capability computing. Although the Cray XT3 is new, its architecture is proven, as it is based on ASCI Red. It uses the Linux operating system on service processors and a specially adapted microkernel on compute processors for optimal performance. According to Zacharia, benchmarks show this system is No. 1 in the world on four of the HPC Challenge tests and No. 3 in the world on the fifth.

To give a feel for the power of this system: in August 2005, just weeks after the delivery of the final cabinets of the Cray XT3, researchers at the National Center for Computational Sciences ran the largest ever simulation of plasma behavior in a Tokamak, the core of the multinational fusion reactor ITER.

The code used for ITER, AORSA, solves Maxwell's equations — describing the behavior of electric and magnetic fields and their interaction with matter — for hot plasma in Tokamak geometry (i.e., the velocity distribution function for ions heated by radio frequency waves in Tokamak plasma). The largest run, by ORNL researcher Fred Jaeger, utilized 3,072 processors — roughly 60 percent of the entire Cray XT3. The Cray XT3 run improved total wall time by more than a factor of three over its IBM P3 system.
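For reference, the field equations AORSA solves are the macroscopic Maxwell system, given here in standard textbook form (the plasma physics enters through the charge density \(\rho\) and the current density \(\mathbf{J}\)):

```latex
\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}, \qquad
\nabla \cdot \mathbf{B} = 0, \qquad
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad
\nabla \times \mathbf{B} = \mu_0 \mathbf{J}
  + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}
```

The difficulty is not the equations themselves but the self-consistency: the hot plasma's response determines \(\mathbf{J}\), which in turn shapes the fields, and resolving this coupling over a full Tokamak geometry is what drives the processor counts quoted above.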

The importance of this improved performance cannot be overstated. For decades, researchers have sought to reproduce the power of the sun, which is generated by fusion of small atoms under extremely high temperatures — millions of degrees Celsius. The U.S., Europe and other nations have joined forces to develop the multi-billion dollar International Thermonuclear Experimental Reactor. ITER's donut-shaped reactor uses magnetic fields to contain a rolling maelstrom of plasma, or gaseous particles, which comprise the “fuel” for the fusion reaction.

Cost-effective and efficient development and operation of ITER depend on the ability to understand and control the behavior of this plasma: its physics and optimal conditions that foster fusion. Harnessing fusion for future “clean” energy will have worldwide environmental ramifications.

NLCF expects to deploy a 100 Tflops Cray XT3 in 2006, followed by a 250 Tflops Cray “Rainier” system in 2007 or 2008. Rainier is a unified product incorporating vector, scalar and potentially reconfigurable and multi-threaded processors in a tightly connected system. This heterogeneous architecture offers a single-system solution for diverse application workloads.

The NLCF is built as a world-class facility. It consists of a 40,000-square-foot computer room and an 8 MW power supply. It contains additional classrooms and training areas for users, a high-ceiling area for visualization (cave, power-wall, Access Grid etc.) and separate laboratory areas for computer science and network research.

Using high bandwidth connectivity via major science networks, NSF TeraGrid, Ultranet and “Futurenet,” NLCF aims to integrate core capabilities and deliver computing for “frontiers” science. The program includes joint work with computer vendors to develop and evaluate next-generation computer architectures (e.g., Cray systems and IBM Blue Gene/L), create math and computer science methods to enable use of the resources (e.g., SciDAC, ISICs), nurture scientific application partnerships and fund modelling and simulation expertise. The ultimate goal is to transform scientific discovery in biology, climate, fusion and materials, and to serve industry and other governmental agencies, through advanced computing.

Instruments for international collaboration are also important for shortening time to solution and enhancing the potential for scientific breakthroughs. The ORNL program for Leadership computing includes collaborations with other large-scale computing centres, e.g. Sandia, PSC and CSCS.

As Zacharia said: “ORNL has a long standing partnership with Sandia and CSCS on many fronts; collaborations in applications areas, collaborations in enabling technologies, sharing of best practices in managing and operating our respective centers and, of course, our historical partnership in the SOS series of Forums.”

The NLCF is primed for active dialogue with academia, industry, laboratories and other HPC centers. The joint institute for computational sciences is to be a state-of-the-art distance-learning facility. It aims to provide incubator suites, joint facility offices, conference facilities and strong student and post-doctoral programs. It supports educational outreach through research alliances in math and science programs and industrial outreach through a computational center for industrial innovation. It also supports international collaborations in computational sciences by hosting guest scientists and visiting scholars.

Another speaker, Dr. Paul Durham of CCLRC Daresbury Laboratory, described capability computing on HPCx, an IBM Power4-based system used as a national resource for UK research. After giving many examples of scientific results, he described the project to move user consortia onto capability computing — defined as needing more than 1,000 processors — as follows: “Research done on HPCx is driven by specific scientific goals, set out in the peer-reviewed grant applications. Some users are obtaining excellent results running on 128 to 256 processors. There may be no scientific case for moving these into the capability regime. The intention for the HPCx facility was that resources should only be granted to consortia with true capability requirements.”

Durham concluded by asking a series of questions. The computational research community has identified many fascinating and important Petascale problems, but has it achieved enough capability usage at Terascale? What are the best capability metrics? Do they have to be hardware based? Can capability science be defined? How many projects can be sustained before the capability mission gets diluted? Are there enough “capability” users with Petascale ambitions coming through? Are they in new fields, or the usual suspects? Can we expect new fields for “capability” computing to arise spontaneously, or should we lead them to it?

Michele Parrinello, a professor of computational science at ETH Zurich, gave a keynote presentation titled “The challenges of scientific computing.” He described many interesting scientific results in chemistry and molecular dynamics, and asked the rhetorical question: Why do simulations? His reply: to interpret experimental results, to replace costly or impossible experiments, to gain insights and possibly to predict new properties (e.g., virtual microscopy).

Another question was whether one can use molecular dynamics to explore long time-scale phenomena. The answer: not at present. Direct simulation allows only very short runs of ~10 ps for ab initio MD and ~10 ns for classical MD. Many relevant phenomena need longer time scales: chemical reactions, diffusion, nucleation, phase transitions, protein folding and so on.
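The scale of that gap is easy to quantify. Assuming a typical MD integration timestep on the order of 1 femtosecond (my assumption; actual timesteps vary by method and system), the step counts work out as follows:

```python
# Rough step counts for the molecular dynamics time scales mentioned above.
# Assumed timestep: 1 fs = 1e-15 s (a common choice for classical MD).
timestep_s = 1e-15

cases = [
    ("ab initio MD run (~10 ps)", 10e-12),
    ("classical MD run (~10 ns)", 10e-9),
    ("protein folding (~1 ms)",   1e-3),
]
for label, duration_s in cases:
    steps = duration_s / timestep_s
    print(f"{label}: about {steps:.0e} integration steps")
```

Closing the gap from ~10^7 steps (a 10 ns classical run) to the ~10^12 steps a millisecond event would need is why "more hardware" alone is not the answer, and why methods for accelerating rare events matter.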

Another presentation, from Professor Andreas Adelmann of the Swiss Paul Scherrer Institut, described research and “HPC demands in computational accelerator physics.” He briefly presented particle accelerators and how they are modelled, with working examples, and elaborated on next-generation particle accelerators — the high-energy LHC, the high-intensity Spallation Neutron Source and the high-brilliance light source — and their modelling needs. This was illustrated by several examples, such as Particle-In-Cell simulations using a low-dimension Vlasov solver for relativistic electrodynamics, including collisions and so on.

His conclusion was that the HPC hardware needed consists of a large number of tightly coupled CPUs with access to low latency, high bandwidth memory, especially for the large 3D n-body problem (in space and time) and for the fine-grid 4(6)D Vlasov solver. Fast I/O is essential, as post-processing is a parallel data mining activity. The software requirements are for efficient numerical implementations of FFT, multigrid and adaptive mesh refinement, load balancing, and fault-tolerant systems and algorithms. The Paul Scherrer Institut, and in particular the particle accelerator project headed by Dr. Adelmann, was pivotal for the participation of PSI in the Horizon project, culminating in the recent purchase of the Cray XT3 system by CSCS.
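To give a flavor of the Particle-In-Cell method behind such simulations, here is a deliberately minimal 1D electrostatic sketch. It is illustrative only: production accelerator codes are 3D (or 6D phase space), relativistic and massively parallel, and every parameter below is invented for the demo.

```python
import random

# Minimal 1D electrostatic particle-in-cell sketch (two counter-streaming
# beams on a periodic grid). All parameters are invented for illustration.
NG, NP, L, DT, STEPS = 64, 2000, 1.0, 0.01, 50
DX = L / NG
random.seed(0)
xs = [random.uniform(0.0, L) for _ in range(NP)]       # particle positions
vs = [1.0 if i % 2 else -1.0 for i in range(NP)]       # two opposing beams

for _ in range(STEPS):
    # 1. Deposit charge on the grid (nearest-grid-point weighting).
    rho = [0.0] * NG
    for x in xs:
        rho[int(x / DX) % NG] += 1.0 / DX
    mean_rho = sum(rho) / NG
    rho = [r - mean_rho for r in rho]                  # neutralizing background
    # 2. Field solve: integrate Gauss's law dE/dx = rho along the grid.
    E, acc = [0.0] * NG, 0.0
    for i in range(NG):
        acc += rho[i] * DX
        E[i] = acc
    mean_E = sum(E) / NG
    E = [e - mean_E for e in E]                        # remove DC field (periodic box)
    # 3. Push particles using the field in their cell.
    for i in range(NP):
        vs[i] += DT * E[int(xs[i] / DX) % NG]
        xs[i] = (xs[i] + DT * vs[i]) % L

print("mean kinetic energy:", 0.5 * sum(v * v for v in vs) / NP)
```

Even this toy captures the structure Adelmann's requirements address: the deposit and push steps are embarrassingly parallel over particles, while the field solve is a global operation, which is exactly why production codes lean on fast parallel FFT and multigrid solvers and on careful load balancing.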

In a panel titled “How can we as a community try to get a richer and more uniform programming environment across the variety of high-end platforms?” the participants were Thomas Sterling (CACR, Caltech), David Bernholdt (ORNL), Pierre Kuonen (EIF) and Rolf Riesen (Sandia). Bernholdt discussed a uniform environment for high user productivity and the rapid creation of correct and efficient application programs.

He explained the different requirements for applications and algorithms, namely high-level specification and low-level control. There is a trade-off in delivering generality, abstraction and scalability. There are also proposals to develop “polyglot” programming, as described in a talk by Gary Kumfert (LLNL) at a workshop on high productivity languages and programming models (May 2004).

The requirements for these endeavors to succeed: “Legacy codes must be supported; traditional and new programming languages, and traditional and new programming models, must be able to interoperate. Some language and model constructs are incommensurate, but for most, some useful specification for interoperability can be established.” It was suggested that BABEL should be adopted as the language interoperability vehicle for HPC, as it provides a unified approach in which all languages are considered as peers. It can act as the bridge for C, C++, Java, F77, F90, F2003, Python, etc. It is essential that language interoperability is built into standards; for example, F2003 provides interoperability with C. When designing and implementing new languages, it is advisable to assume they will be used in a mixed environment.
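BABEL itself generates glue code from interface descriptions, but the flavor of the cross-language calls it mediates can be seen with Python's standard ctypes module calling directly into the C math library. This is plain C interoperability, not BABEL; it simply illustrates what "languages as peers" buys you:

```python
import ctypes
import ctypes.util

# A Python program calling a compiled C function -- a toy stand-in for the
# kind of language bridging BABEL automates (this is not BABEL itself).
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double sqrt(double).
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print("C sqrt(2.0) from Python:", libm.sqrt(2.0))
```

Hand-writing such declarations for every function in a large F90 or C++ library is exactly the drudgery a tool like BABEL, or a standard such as F2003's C interoperability, is meant to eliminate.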

Interoperability of programming models presently needs a lot of work, both in developing abstract specifications and in overcoming the practical obstacles to implementing them.

According to Bernholdt, productivity on diverse architectures is achievable using abstraction, vertical integration across the software stack and helpful hardware. Interoperability is also achievable in programming languages by using BABEL and standards, but this is much harder for programming models. A uniform programming environment is undesirable, as users need choices, not uniformity.

Computing has experienced exponential growth over the last 30 years, and this is expected to continue. Yet the HPC user community, long promised Terascale computing by forecasters carried away with new technology, is, as Durham pointed out, only just conquering Terascale problems, so scaling up to Petascale is an enormous task. Now that the industry is building heterogeneous computers, attempting to match hardware to application needs (e.g., the Cascade approach described in an article by the High-End Crusader, HPCwire, 8-12-05), the problems of Petascale computing look more tractable. Only time will tell whether the user community will be able to utilize these systems by 2010.

One of the greatest challenges to achieving the 2010 target is delivering infrastructure for sustained performance. Technical challenges include chip densities and heat dissipation, power consumption and footprint at the component level, as well as the memory wall (bandwidth, latency and connectivity) in harnessing tens of thousands of CPUs for large-scale simulations. The Leadership computing facilities being set up at ORNL, PSC, CSCS, in the UK and elsewhere are extending the frontiers of large-scale scientific computing.

Copyright: Christopher Lazou, HiPerCom Consultants, Ltd., UK. August 2005
