NCSA Signs Up Cray for Blue Waters Redo

By Michael Feldman

November 14, 2011

The National Center for Supercomputing Applications (NCSA) has awarded Cray a $188 million contract to complete the NSF-funded Blue Waters supercomputer project at the University of Illinois. An 11.5 petaflops Cray XE6/XK6 hybrid system outfitted with AMD CPUs and NVIDIA GPUs will be deployed next year and become the center’s petascale resource for open science and engineering. The much-anticipated deal was announced on Monday, just as the Supercomputing Conference (SC11) in Seattle got underway.

This is NCSA’s second shot at Blue Waters. In 2007, IBM was selected to build the petascale machine as part of the NSF Track 1 leadership system program. That system, a Power7-based supercomputer, based on IBM’s DARPA-funded PERCS architecture, was to deliver 10 peak petaflops and one petaflop of sustained performance. The IBM machine was on track to be deployed in 2012.

Some IBM cabinets had already been delivered in 2011, when in August, the company abruptly terminated the contract after determining the effort required to complete the work would ultimately be unprofitable to the company. Following IBM’s embarrassing withdrawal, NCSA and NSF re-solicited the work, hoping to put the project back on schedule with a new vendor.

According to NCSA Director Thom Dunning, the solicitation attracted Cray, along with three other bidders, whom he declined to name. Dunning told HPCwire that Cray’s approach lined up very well with the Blue Waters mission. “As we have always done, we didn’t pick a system that was just focused on peak performance, but a system that focused on sustained performance and had the memory and disk performance that is really needed by the science and engineering community,” said Dunning.

Cray CEO Peter Ungaro reiterated that point, noting that the Blue Waters project is a great fit for his company’s vision of adaptive and heterogeneous computing at scale, and with a customer that is focused on sustained performance rather than raw flops or Linpack benchmarks. “NCSA and NSF could have made a lot of tradeoffs to build a much bigger machine from a peak flops standpoint and to get a better ranking on the TOP500,” said Ungaro.

GPU acceleration was added at the behest of the researchers who are gearing up to use the Blue Waters system. In fact, according to Dunning, two-thirds of the researchers who are in line to run their applications on the machine are now asking for these accelerators, which influenced NCSA’s choice to go with Cray’s XE6/XK6 hybrid supercomputer. Over the past five years, some of these researchers have ported portions of their science codes to take advantage of GPGPUs. “That was the one major change that occurred between 2006 and 2011,” said Dunning.

That said, the supercomputer will mostly rely on CPUs. Cray estimates the system will have more than 235 CPU-only XE6 cabinets and over 30 CPU-GPU XK6 cabinets. In both cases, the CPU will be AMD’s “Interlagos” Opteron 6200 processor, which was officially launched on Monday. Specifically, the machine will be outfitted with the 16-core 2.3 GHz Opteron 6276.

In aggregate, more than 49,000 of these CPUs will be used in the machine, representing about two-thirds of the total flops. The remaining third will be supplied by more than 3,000 “Kepler” GPUs, NVIDIA’s next-generation graphics processor that is expected to go into production in 2012. Dunning said the CPUs alone will be enough to sustain one petaflop of performance on science applications capable of scaling to that level. If those codes can employ GPUs effectively, an additional performance boost will be possible.

The supercomputer will be impressive in nearly every other dimension as well. The machine will have more than 1.5 PB of total memory, an aggregate I/O bandwidth of over 1 TB/s, and an enormous interconnect network, with about 4,500 km of wires. Cray will also be supplying more than 25 petabytes of external storage integrated with the Lustre file system. “It’s going to be the biggest supercomputer we’ve ever built,” said Ungaro.

The application work for Blue Waters will span the breadth of big science applications, in particular, those in molecular science, climate/weather forecasting, earth science, life sciences, and astrophysics. As Ungaro implied, NCSA and the NSF could have built a much larger machine from a pure flops perspective if they had maximized the GPU components, but instead felt that the CPU-heavy mix matches the current state of these science codes much more closely at this point.

NCSA and Cray are planning to stick to the same deployment schedule as was being pursued with the IBM PERCS machine, with the final system up and running by next fall. The general plan is to deploy the CPU nodes first, with the GPU components installed during the last stage of the build-out.

Cray expects to book most of the $188 million contract money in 2012, but the funding includes five years of Blue Waters services and support. This time around there is no termination clause in the contract, which from Ungaro’s point of view is not an issue. Delivering supercomputing to scientists is essentially Cray’s whole business, so there was really no consideration that they wouldn’t follow through. “It was the easiest part of the negotiation,” laughed Ungaro.

In the next couple of weeks, a small Interlagos-based test system will be installed, followed in early 2012 by a much larger machine. This will allow the researchers to work on optimizing and scaling their applications, at least with the CPU components. By the middle of the summer, they will have the full system deployed, with the exception of the “Kepler” GPUs, which are expected to arrive in early fall. If all goes as planned, the entire system should be up and running by this time next year.

With the project in limbo since August, Dunning is eager to move forward, adding that he and his team are delighted to work with Cray. “We’ve been waiting for four years to put hardware on the floor that the science and engineering teams could use, and it’s finally happening,” he said.
