Europe’s Fastest Supercomputer to Get Pascal GPU Upgrade

By Tiffany Trader and John Russell

April 6, 2016

Already Europe’s fastest supercomputer at 7.8 petaflops, Piz Daint (a hybrid CPU/GPU Cray XC30) at the Swiss National Supercomputing Centre (CSCS) will double its performance with a massive upgrade that involves switching to NVIDIA’s newest Pascal GPU architecture and merging with Piz Dora (Cray XC40), a smaller CPU-based machine. The announcement was made at GTC16 yesterday. Last November Piz Daint placed seventh on the TOP500 list.

Plans call for 5,200 NVIDIA K20X GPUs to be replaced by 4,500 Pascal GPUs, although the specific Pascal model hasn’t been decided. The Intel processors will also be upgraded from the Sandy Bridge to the Haswell architecture. When completed, the new combined system, all on a single fabric, will keep the Piz Daint name and provide users with two types of compute nodes: hybrid CPU-GPU nodes and CPU-only nodes. Although slightly reduced in physical size, Piz Daint will be more powerful and flexible, allowing simulations or data analyses to be scaled to a few nodes or to thousands of nodes.

“We are taking advantage of NVIDIA GPUs to significantly accelerate simulations in such diverse areas as cosmology, materials science, seismology and climatology,” said Thomas Schulthess, professor of computational physics at ETH Zurich and director of CSCS. “Tesla accelerators represent a leap forward in computing, allowing our researchers to solve larger, more complex problems that are currently out of reach in a host of fields.”

Pascal GPUs feature a number of breakthrough technologies, including second-generation High Bandwidth Memory (HBM2) that delivers three times higher bandwidth than the previous generation architecture, and 16nm FinFET technology for unprecedented energy efficiency.

Piz Daint will also incorporate Cray’s DataWarp technology. DataWarp’s so-called Burst Buffer mode quadruples the effective bandwidth for long-term storage; in other words, data is input to and output from storage far more quickly. It paves the way for analyzing millions of small, unstructured files. Consequently, Piz Daint will be able to transfer initial results to a specialized area of the supercomputer for analysis while calculations are still under way.

The upgraded machine will help CSCS carry out its mission of tackling grand challenge science as well as critical applied research. Piz Daint will be used to analyze data from the Large Hadron Collider at CERN, to accelerate research on the Human Brain Project’s High Performance Analytics and Computing Platform, and to continue its work in meteorology and climatology among other domain areas, including deep learning — which was of course a highlight of the NVIDIA event.

“Today a lot of the machine learning work [at ETH Zurich] is happening on workstations and I think the researchers are only now starting to realize that they can actually do this at much bigger scale on our supercomputers,” said Schulthess.

Schulthess outlined what he considers the three most important advantages of upgrading to the Pascal architecture and combining the two systems:

  1. Memory Bandwidth. He expects a substantial memory performance increase. “Exactly how big a boost, we will have to find out — probably NVIDIA doesn’t even know yet, but we do expect a big boost on the memory bandwidth. That’s really important because many applications on the GPU are memory bandwidth bound.”
  2. Pascal-Haswell Duo. “The combination of Pascal and Haswell versus K20x and Sandy Bridge is important [now] that we have PCIe Gen3. Imagine you have a job distributed over the GPU memory, a weather code or a climate code, [for example] over the GPU memory of many nodes. Now there is no bottleneck. The GPUs talk to each other with a similar bandwidth. Before the piece between the CPU and the GPU was slow and now the bottleneck is gone.”
  3. Overall Performance. “Pascal is higher performance. I expect that this combination of much better memory bandwidth and faster performance will increase the throughput of the system. And we will open the system to new applications with all these new cool developments that we have today, all these libraries that are coming out of the deep neural network side. Pascal will enable a lot of this.”
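
To make the "memory bandwidth bound" point concrete, the widely used roofline model caps attainable performance at the smaller of peak compute and memory bandwidth multiplied by arithmetic intensity. The sketch below is illustrative only: the function names and every number in it are hypothetical, not K20X, Pascal, or Piz Daint specifications.

```python
def attainable_gflops(peak_gflops: float, mem_bw_gbs: float, flops_per_byte: float) -> float:
    """Roofline estimate: capped by peak compute or by memory traffic, whichever is lower."""
    return min(peak_gflops, mem_bw_gbs * flops_per_byte)


def is_memory_bound(peak_gflops: float, mem_bw_gbs: float, flops_per_byte: float) -> bool:
    """True when the memory-bandwidth ceiling sits below the compute ceiling."""
    return mem_bw_gbs * flops_per_byte < peak_gflops


# Hypothetical kernel doing 0.5 floating-point operations per byte moved.
# Peak compute is held fixed; only the (made-up) memory bandwidth changes.
old = attainable_gflops(peak_gflops=1000.0, mem_bw_gbs=200.0, flops_per_byte=0.5)
new = attainable_gflops(peak_gflops=1000.0, mem_bw_gbs=600.0, flops_per_byte=0.5)
speedup = new / old
print(f"old={old} GFLOP/s, new={new} GFLOP/s, speedup={speedup}x")
```

Under these assumed values the kernel stays memory bound in both cases, so its estimated speedup tracks the bandwidth ratio rather than any change in peak FLOPS.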

All netted out, Schulthess is confident Piz Daint will double performance for both compute and memory bound applications. “We’re not talking about FLOPS; we’re talking about application performance,” he said.

Not surprisingly, CSCS will again run the LINPACK benchmark on Piz Daint, according to Schulthess, in part for the high profile all supercomputer centers desire but equally because, “LINPACK is very, very good at finding out if there are any hardware problems. It was good last time and I’m sure it will be good for that this time.”

It’s not yet clear how energy efficient the new system will be, but Schulthess thinks it won’t be worse and may be better.

“This whole FLOPS per watt and FLOPS per second is a very narrow view of looking at the performance of a system. You have to look at time-to-solution of applications and you have to look at energy-to-solution of applications. In a sense what you want – and I’ve written this in a number of papers already – is for the time-to-solution to be good enough,” he said.

A good example, he noted, is weather forecasting, where forecasts need to be completed as quickly as practical to be most useful. “At some point when the time-to-solution is good enough, then you want to really minimize energy to solution (not FLOPS-per-watt),” he said.
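
The selection rule Schulthess describes can be sketched in a few lines: among the runs whose time-to-solution is good enough, pick the one with the lowest energy-to-solution. This is a minimal illustration, assuming energy is simply time multiplied by average power; the `Run` class, the `pick_run` helper, and all figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Run:
    name: str
    time_s: float       # time-to-solution in seconds
    avg_power_w: float  # average power draw in watts

    @property
    def energy_j(self) -> float:
        """Energy-to-solution in joules (time multiplied by average power)."""
        return self.time_s * self.avg_power_w


def pick_run(runs: Iterable[Run], deadline_s: float) -> Optional[Run]:
    """Among runs that meet the deadline, return the one using the least energy."""
    good_enough = [r for r in runs if r.time_s <= deadline_s]
    if not good_enough:
        return None
    return min(good_enough, key=lambda r: r.energy_j)


# Made-up configurations of the same forecast job.
runs = [
    Run("fast", time_s=600.0, avg_power_w=2000.0),
    Run("balanced", time_s=900.0, avg_power_w=1000.0),
    Run("slow", time_s=2400.0, avg_power_w=300.0),
]
best = pick_run(runs, deadline_s=1200.0)
print(best.name if best else "no run meets the deadline")
```

With a 1,200-second deadline, the "slow" run uses the least energy overall but finishes too late, so the rule settles on "balanced" rather than on the fastest configuration.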

CSCS is exploring the use of Intel’s forthcoming Xeon Phi, but isn’t ready to comment as the work with Intel is ongoing. Software development is another major investment area, said Schulthess, “much more important than the hardware. We will actually double up in the future with our investments.” Predictably, CSCS is “looking at everything, also ARM – but that is a whole separate conversation.” Indeed.

Notably, the merging of Piz Dora into Piz Daint opens up tremendous flexibility and is in keeping with the growing trend to create unified platforms able to handle big data analytics as well as traditional modeling and simulation.

For example, one can pre-process data and then scale the simulation up while the data is always on the same system.

“If we need GPU-acceleration for simulations but the CPUs for pre-processing, we move the data from the pre-processing side to the GPU-accelerated side. So you move data between partitions, but you’re doing this per node, at 10 gigabytes-per-second, which is much higher than I/O bandwidth if you go through the disks. We’ll have very high performance for the whole workflow and make things more convenient for the scientists,” said Schulthess.
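
As a back-of-the-envelope sketch of the arithmetic behind this, the snippet below estimates how long a partition-to-partition move takes at the quoted 10 gigabytes per second per node. It assumes the data is split evenly and that bandwidth adds up linearly across nodes; both are simplifying assumptions, and the function name and dataset size are hypothetical.

```python
def transfer_seconds(data_gb: float, nodes: int, per_node_gbs: float = 10.0) -> float:
    """Estimated time to move data_gb gigabytes when each node moves its share in parallel.

    Assumes an even split and linear aggregate scaling; real fabrics may fall short.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return data_gb / (nodes * per_node_gbs)


# Hypothetical 1,000 GB pre-processed dataset.
one_node = transfer_seconds(1000.0, nodes=1)
hundred_nodes = transfer_seconds(1000.0, nodes=100)
print(f"1 node: {one_node} s, 100 nodes: {hundred_nodes} s")
```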

What’s more, the incorporation of big data analytics tools and practices can help science adopt new approaches. “It’s one thing to bring the data analytics on the systems, but to me there is another very important benefit to the HPC community. The data analytics community is used to a different type of software environment — they like to use Python and SPARK, and in real-time not batches. If we’re able to get supercomputers to run Python and even SPARK, we make them much more usable also to the traditional scientific computing community.”

He cited CSCS work on climate and meteorology as an example: “There’s no reason you wouldn’t want climate scientists to write their models in Python rather than Fortran in the future. Their productivity could go up [significantly] on model development. On an old-style supercomputer, you don’t want to talk about those things. But thanks to the whole data science pressure, we’re creating a software environment that’s much more usable for computational scientists. To me, that’s almost as interesting as the deep learning stuff – enhancing productivity of scientists.”

Turning to the rise of container technology in high-end HPC, perhaps best illustrated by the Docker-Shifter effort at NERSC, Schulthess said CSCS was working with NVIDIA to expose the GPUs in Docker.

Schulthess predicts the revamped Piz Daint will be up and fully running in a year or so: “Our requirements are very high and we are not going to cut corners, but once that is done, moving applications from today’s Piz Daint to the new system, they will just fly. I don’t expect any issues there.” A key reason is that Pascal GPUs are backward compatible. In the words of NVIDIA, “It’s all CUDA; you can use the same application you had five years ago and it just scales up.”
