…and then there are the environmental consequences of our fossil fuel-based economy. Just about every scientist outside the White House (Bush administration) believes climate change is real, is serious, and is accelerated by the continued release of carbon dioxide. If the prospect of melting ice caps, rising sea levels, changing weather patterns, more frequent hurricanes, more violent tornadoes, endless dust storms, decaying forests, dying coral reefs, and increases in respiratory illness and insect-borne diseases – if all that does not constitute a serious threat, I don’t know what does….
— Barack Obama, The Audacity of Hope (Three Rivers Press, 2006), p. 168.
Climate change is, of course, global and no respecter of national boundaries. The heatwave in Spain and the extreme flooding in the UK in 2008 are but two examples of extreme weather in Europe.
For HPC vendors the earth sciences segment provides a great business opportunity across the globe. Both improved predictability of severe weather events and climate change assessments for policymakers are high on national governments’ agendas. As the saying goes: “Every cloud has a silver lining.”
I caught up with Per Nyberg, director of marketing and business development for earth sciences at Cray Inc., following the 11th International Specialist Meeting on the Next Generation Models on Climate Change and Sustainability for Advanced High-Performance Computing Facilities held at Oak Ridge National Lab (ORNL).
ORNL is no stranger to being in the vanguard of HPC facilities. In the last few years ORNL has implemented one segment of the DARPA-funded high-productivity petaflops initiative. Its choice was a system from the Cray XT product line, and its latest upgrade, to the Cray XT5 system named “Jaguar,” increased the system’s computing power to a peak of 1.64 petaflops, making Jaguar the world’s first petaflops system dedicated to open research.
Indeed, an ORNL research team recorded an unprecedented 1.35 petaflops sustained performance when running a superconductivity application used in nanotechnology and materials science research. The team’s simulation ran on over 150,000 of Jaguar’s 180,000-plus processing cores. The latest simulations on Jaguar were the first in which the team had enough computing power to move beyond ideal, perfectly-ordered materials to the imperfect materials that typify the real world.
The petaflops barrier was broken on a second application, with 1.05 petaflops of sustained performance. The new performance levels for this application, a first principles material science computer model used to perform studies involving the interactions between a large number of atoms, are expected to support advancements in magnetic storage.
One swallow may not signify summer, but two swallows hint at things to come.
Christopher Lazou: Per, it’s good that you can spare some time to talk to me. Let’s briefly explore some of the work done at Cray’s current customer base, discuss what products Cray has to offer earth sciences and in the process try to gain insight into your views concerning the benefits of using Cray systems in this important field. Tell me about Cray’s history in this area and where this community fits into Cray’s future.
Per Nyberg: Cray has a rich history of long and successful relationships with the weather, climate and oceanographic communities, and is committed to providing the best possible solutions to this marketplace. It is a key business area for Cray and the demands of complex, computationally-intense earth system models factor heavily into our ongoing research and development programmes.
The needs of scientists studying the earth’s system are consistently cited in both defining and justifying the need for sustained petaflops computing. A significant percentage of Cray’s revenue is invested in research and development, and the needs of this community play a central role in defining our future products and technologies.
Lazou: Who are Cray’s significant customers in the earth sciences?
Nyberg: This is a key application area that spans nearly the entire spectrum of our customer base, ranging from those whose core business is earth system modelling to multi-disciplinary HPC centres.
In recent years Cray has sold large systems to national meteorological and hydrological services in countries such as Switzerland, Spain, India and South Korea. Our two most recent installations are Cray XT5 systems at the Danish Meteorological Institute (DMI) and the U.S. Naval Oceanographic Office.
As an example, MeteoSwiss uses a Cray XT4 located at the Swiss Centre for Scientific Computing (CSCS) for their operational requirements. With this capability, MeteoSwiss has been able to implement one of the highest resolution regional models in Europe, a key requirement for accurate forecasts in their challenging mountainous terrain.
Nearly every large scientific HPC site in government and academia uses some of their computing resources for earth system modelling. Examples of these include the HECToR system at Edinburgh Parallel Computing Centre (EPCC) in the UK, the Bergen Centre for Computational Science (BCCS) in Norway, the Centre for Scientific Computing (CSC) in Finland, the University of Tennessee / National Science Foundation, the National Energy Research Scientific Computing Center (NERSC) and Oak Ridge National Laboratory (ORNL).
The recent 11th International Specialist Meeting on the Next Generation Models on Climate Change and Sustainability for Advanced High-Performance Computing Facilities was very aptly held at ORNL. The Cray petaflops system at ORNL, “Jaguar,” is the only open science petaflops system in the world and the first such system available to the climate community. This is obviously a milestone in high performance computing in general, but also specifically for the climate community, which has been reiterating the importance of such systems for many years.
Lazou: What examples can you cite of ground-breaking science that is being done at these centres?
Nyberg: Two examples that come to mind are the Climate Science Computational End Station usage of the Cray XT systems at ORNL and NERSC, and the U.S. National Oceanic and Atmospheric Administration (NOAA) Hazardous Weather Testbed Spring 2008 experiment conducted on the Cray XT3 at the Pittsburgh Supercomputing Center (PSC).
In preparation for the fifth IPCC assessment, the U.S. Department of Energy, the National Science Foundation, the National Aeronautics and Space Administration and university researchers have partnered in a Climate Science Computational End Station Development and Grand Challenge Team. The aim is to achieve unprecedented simulations and coordinated development of the next-generation climate model. With millions of hours of access to the Cray systems at ORNL and the 380+ teraflops Cray XT4 at NERSC, IPCC researchers will be able to apply greater computational resources to climate problems than ever before. This is a ground-breaking capability.
This past spring the University of Oklahoma’s Center for Analysis and Prediction of Storms (CAPS) used the Cray XT3 system at PSC to incorporate real-time radar data into its high-resolution thunderstorm forecasting model for the first time. Observational data from more than 120 weather radars enabled the most realistic storm predictions to date. This was part of the annual NOAA Hazardous Weather Testbed Spring Experiment and was a key step towards predicting storms more accurately and with improved lead time.
We are now seeing real applications scale across tens of thousands of cores at our largest customers, enabling simulations not previously possible. One example from ORNL is a 5 km semi-hemispheric run of the WRF (Weather Research and Forecasting) model on 150,000 cores, sustaining over 50 teraflops. This level of scaling and sustained performance has never before been seen on such an application.
Lazou: You mentioned a recent Cray XT5 installation at DMI. Can you tell me a little about the system that has been installed?
Nyberg: We are very excited about the installation at DMI. In fact, the DMI “system” is composed of two identical Cray XT5s. Like many numerical weather prediction centres today, DMI chose a design with two identical systems: one for operations and one in a dual research and failover role. The two systems are integrated as clients of a single shared Lustre global file-system, which provides maximum flexibility and resiliency while maintaining the highest levels of performance.
Of course it is the performance of the HIRLAM weather model on the XT5 systems that is crucial, but the reality today is that system performance is just one dimension of the buying decision. The overall environment needs to meet the centre’s objectives for on-time delivery of meteorological products and cost-of-ownership criteria including electrical consumption, system utilization and management.
Lazou: Beyond the role of HPC supplier, how is Cray involved in this community?
Nyberg: Cray’s driving mission is to help customers solve their most challenging computational science problems. Cray has been, and continues to be, engaged in a number of activities that support advanced science by achieving greater performance with earth system models. These efforts range from working directly with application developers to involvement in community initiatives and fostering greater use of HPC in academia, often through our Centers of Excellence such as those at HECToR and ORNL. An example of a more extensive partnership is the Earth System Research Centre (ESRC), jointly established by the Korea Meteorological Administration (KMA) and Cray to advance the science of earth-system modelling over the East-Asia Pacific region. The third round of ESRC-sponsored projects was recently announced.
Lazou: You mentioned the Cray “petascale” system installed at ORNL. The requirement for petaflops computing has been a stated objective by the climate community for some time, and there continues to be efforts worldwide to secure access to this capability. From a general perspective, can you comment on the challenges involved in petaflops computing?
Nyberg: Securing the highest possible performance capabilities has always been a key requirement in advancing the state of climate science. This was a clear message at the World Modelling Summit for Climate Prediction earlier this summer and in the recommendations of U.S. weather and climate leaders in August to greatly increase computing power available to the weather and climate community. It is also important to note that computing power was just one of the areas addressed by these recommendations.
From a computing perspective, the successful realization of sustained petaflops performance will depend on the convergence of multiple disciplines and stakeholders. Let’s be realistic about the scale and complexity of these systems. There has always been a tendency to over-simplify around a single metric, peak flops being the most obvious. The reality, however, is that sustaining a petaflops will require every aspect of the system to be petascale: application software, system software, system I/O, external peripherals, scheduling, RAS, management and so on are all on the critical path. Even when a system is used for a throughput-oriented workload, such as a many-member ensemble, the resulting I/O and scheduling challenges remain petascale.
Lazou: With global warming upon us, and energy security high on every national government’s agenda, how energy efficient are Cray systems?
Nyberg: The issues of energy costs and efficient power usage are foremost in the minds of HPC centres worldwide. Modern HPC demands call for computer systems to do more computing in less space, and Moore’s law has kept processing power apace with this demand. The consequence, however, is that individual compute racks now draw 40 kW or more, and cooling must adapt rapidly at both the facility and rack level.
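To put that 40 kW figure in context, a quick back-of-envelope calculation (my own illustration, not from the interview) shows how much airflow conventional air cooling would need to carry that heat away, using standard properties of air and an assumed 15 K inlet-to-outlet temperature rise:

```python
# Back-of-envelope: airflow needed to air-cool a 40 kW rack.
# Assumed values (not from the article): air density 1.2 kg/m^3,
# specific heat 1005 J/(kg*K), allowed air temperature rise of 15 K.
RHO_AIR = 1.2      # kg/m^3
CP_AIR = 1005.0    # J/(kg*K)

def airflow_for_heat(power_w, delta_t_k):
    """Volumetric airflow (m^3/s) needed to carry power_w at a rise of delta_t_k."""
    mass_flow = power_w / (CP_AIR * delta_t_k)   # kg/s, from Q = m_dot * cp * dT
    return mass_flow / RHO_AIR                   # m^3/s

m3_per_s = airflow_for_heat(40_000, 15)
cfm = m3_per_s * 2118.88                         # convert to cubic feet per minute
print(f"{m3_per_s:.2f} m^3/s (~{cfm:.0f} CFM)")
```

At roughly 2.2 m³/s (about 4,700 CFM) per rack, it is easy to see why dense installations turn to liquid and phase-change cooling rather than trying to move ever more air through the machine room.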
Cray has been a leader in power and cooling technologies, including liquid cooling, since the Cray-1 in 1976. With the drive towards ever increasing system sizes, we are concerned with addressing all the requirements that will ultimately define their success and usability.
We recently announced a novel, non-invasive approach to heat removal that brings the refrigeration to the cabinet, transferring heat with a patented “flooded coil” cycle. This technology, termed ECOphlex (Phase-change Liquid Exchange), is designed to be “room air neutral,” meaning that the temperature of the air entering the system is roughly the same as the temperature of the air exiting the system. In a recent test at a government site, ECOphlex technology removed 100 percent of the heat.
ECOphlex uses efficient air flow to remove heat from the base components, and a phase-change refrigerant system to remove heat from the air prior to leaving the cabinet. The technology’s phase-change coil is more than 10 times as efficient at removing heat from the compute cabinets as a water coil of similar size. There is also the flexibility to use chilled or un-chilled water at various temperatures. This promotes energy savings by enabling greater system density, reducing the need for expensive air cooling and air conditioners, and limiting the need for chilled water.
Lazou: I think we explored a fair number of issues. Thank you, Per, for your time and frank answers. I am sure our readers will find your views very interesting.
—–
Note: For those interested in the development of meteorology in the last fifty years as seen through the eyes of a tireless worker from NCAR, and laced with a human touch, I recommend the excellent autobiographical book, “Odyssey in climate modelling, global warming and advising five presidents,” by Dr Warren M. Washington, edited by his wife Mary and published by Lulu (http://www.lulu.com).
ISC’09, to be held in Hamburg, Germany, June 23-26, 2009, is organizing a special in-depth session on earth sciences on Tuesday, June 23, featuring four hours of detailed presentations and discussions. In addition, Hans Meuer is planning a great party in Hamburg with all the HPC vendors and practitioners in attendance, so try to get there; do not miss out.
Copyright (c) Christopher Lazou. April 2009. Brands and names are the property of their respective owners.