GPUs Will Morph ORNL’s Jaguar Into 20-Petaflop Titan

By Michael Feldman

October 11, 2011

Jaguar’s days as a CPU-only supercomputer are numbered. Over the next year, the 2.3-petaflop machine at the Department of Energy’s Oak Ridge National Lab (ORNL) will be upgraded by Cray with the new NVIDIA “Kepler” GPUs, producing a system with about 10 times Jaguar’s peak performance. The transformed supercomputer will be renamed Titan and should deliver in the neighborhood of 20 peak petaflops sometime in late 2012.

Jaguar, which has already been upgraded numerous times since it was first deployed in 2009, currently sits at number three on the TOP500 list with a Linpack reading of 1.76 petaflops. Titan will certainly keep the machine in the top 5, even as machines with tens of petaflops start making their way into the big labs over the next couple of years.

Titan will also represent the US entry in the echelons of top-tier GPU-accelerated supercomputing. As it stands today, three of the top five systems are GPU accelerated: Tianhe-1A and Nebulae in China, and TSUBAME 2.0 in Japan. The current top GPU machine in the US is Edge, a 240-teraflop Appro cluster at Lawrence Livermore National Laboratory. Even Russia, Germany and Italy have larger systems.

According to Steve Scott, the newly minted chief technology officer for NVIDIA’s Tesla Business Unit, the fact that ORNL is making such a significant commitment to GPU computing is a big endorsement for the architecture. It’s no secret that HPC is now constrained by energy use. Moore’s Law has managed to shrink the transistor geometries, but the power wall has become the defining limitation for performance increases. “It’s all about power efficiency,” Scott told HPCwire, “which is why we think the GPU story is so compelling.”

While GPUs are not truly general-purpose processors, their ability to perform data-parallel computation in a much more energy-efficient manner than CPUs has vaulted them to prominence in the HPC realm. “It’s hard to overstate the importance of the sea change that has happened in high performance computing,” notes Scott. “This wonderful ride we’ve been on for the past 30 years — every time we halve the size of transistor, the voltage drops, power stays the same, and performance improves exponentially — has been fantastic, but it’s done.”

Although the US, in general, has been a bit late in embracing GPU technology for HPC, the Titan supercomputer has been on the drawing board at Oak Ridge for at least a couple of years. But the technology necessary to implement that machine is just now catching up with those requirements.

Beginning this fall, most of Jaguar’s 18,688 current XT5 nodes will be retrofitted with Cray’s new XK6 blades, which the company unveiled in May. The immediate result is that the current dual-socket 6-core AMD Opteron nodes will be swapped out for nodes with a single 16-core “Interlagos” CPU, and the interconnect will be upgraded from SeaStar 2 to Gemini. Each XK6 blade encompasses four compute nodes, with an Opteron on each one, and the ability to connect each of those CPUs to a Tesla GPU on a PCIe daughter card.

Initially, 960 of those XK6 nodes will be outfitted with the Fermi-class Tesla M2090 GPUs, with the other 17 thousand or so remaining as CPU-only blades for the time being. This first phase of Titan is expected to be completed before the end of the year. Then in the second half of 2012, all 18,688 nodes, including the original Fermi-equipped blades, will be populated with NVIDIA’s next-generation Kepler Teslas.

NVIDIA has not provided detailed specs on the Kepler GPUs, but according to Scott their performance per watt will be more than double that of the Fermi parts, while fitting into the same power envelope. Given that the current Fermi Tesla cards (GPUs plus memory) deliver 665 gigaflops, the new Kepler GPU should yield at least 1330 gigaflops.

For the time being, Oak Ridge is promising only 10 to 20 petaflops for the final system, although the peak performance could go considerably higher. According to Buddy Bland, project director at ORNL’s Leadership Computing Facility, they currently don’t have the money in hand to upgrade all 18K nodes. The actual scope of the Titan build-out will “depend on the budget available.”

Theoretically though, if all existing nodes are populated with the new Kepler parts, the system should deliver at least 24.8 petaflops of GPU power. An equal number of Interlagos CPUs should contribute more than two additional petaflops on top of that. By the time all the dust has settled, Titan could be within spitting distance of 30 petaflops. 
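The back-of-the-envelope arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration using the figures quoted in this article: the per-GPU Kepler number is the "at least double Fermi" floor rather than a published spec, the CPU contribution is taken as the two-petaflop lower bound, and the variable and function names are made up for the example.

```python
# Figures quoted in the article; names are illustrative only.
NODES = 18_688                       # nodes, one GPU and one CPU each
FERMI_GFLOPS = 665.0                 # current Fermi Tesla card
KEPLER_GFLOPS = 2 * FERMI_GFLOPS     # assumed floor: "at least" double Fermi
CPU_FLOOR_PF = 2.0                   # "more than two additional petaflops"

def peak_petaflops(nodes: int, gflops_per_device: float) -> float:
    """Aggregate peak in petaflops (1 petaflop = 1e6 gigaflops)."""
    return nodes * gflops_per_device / 1e6

gpu_pf = peak_petaflops(NODES, KEPLER_GFLOPS)
total_floor_pf = gpu_pf + CPU_FLOOR_PF

# Per-CPU peak implied by a two-petaflop CPU contribution.
implied_cpu_gflops = CPU_FLOOR_PF * 1e6 / NODES

print(f"GPU peak:        {gpu_pf:.2f} PF")          # about 24.86 PF
print(f"GPU + CPU floor: {total_floor_pf:.2f} PF")  # about 26.86 PF
print(f"Implied per-CPU: {implied_cpu_gflops:.0f} GF")
```

Both inputs are lower bounds, so the result is a floor. Anything beyond roughly 27 petaflops depends on Kepler and Interlagos specs that were not public at the time of writing.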

The amount of power the new system will draw is also unknown, but it will certainly have a better performance per watt ratio than Jaguar, which sucks up nearly 7 MW for its 2.33 peak petaflops. By contrast, Japan’s Fermi-accelerated TSUBAME system uses just 1.4 MW for its 2.29 petaflops. Since ORNL’s new machine will use the more efficient Kepler GPUs, its efficiency should be significantly better. “We view Titan as the leading indicator of where people are going as they look to solve the energy challenges for the next five to ten years,” says Scott.
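For a rough sense of the gap, the peak-flops-per-watt comparison can be worked out from the numbers above. The sketch below treats Jaguar's "nearly 7 MW" as exactly 7 MW, so its result is an approximation, and the function name is invented for the example.

```python
def gflops_per_watt(peak_petaflops: float, power_megawatts: float) -> float:
    """Peak gigaflops per watt from petaflops and megawatts."""
    gigaflops = peak_petaflops * 1e6   # 1 PF = 1e6 GF
    watts = power_megawatts * 1e6      # 1 MW = 1e6 W
    return gigaflops / watts

jaguar = gflops_per_watt(2.33, 7.0)    # assumes 7 MW for "nearly 7 MW"
tsubame = gflops_per_watt(2.29, 1.4)

print(f"Jaguar:  {jaguar:.2f} GF/W")   # about 0.33
print(f"TSUBAME: {tsubame:.2f} GF/W")  # about 1.64
print(f"Ratio:   {tsubame / jaguar:.1f}x")
```

On peak numbers alone, the Fermi-accelerated system comes out nearly five times more efficient than the CPU-only Jaguar. Sustained application performance per watt would be a different, and lower, figure for both machines.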

How all those peak flops turn into actual application performance remains to be seen. Extracting high levels of sustained computation from these multi-petaflop machines is notoriously difficult, with only a handful of codes able to attain more than a petaflop of performance. Adding GPUs to the mix has made that harder, at least in the short term.

In this regard, Oak Ridge, with one of the premier computational labs on the planet, has a good chance of pushing the envelope. Using smaller GPU clusters, computational scientists at ORNL and elsewhere have been busy porting six flagship science codes to CUDA, including Wang-Landau/LSMS for materials science; S3D for engine combustion; PFLOTRAN for underground CO2 sequestration and contaminant containment; Denovo, a radiation transport code for nuclear engineering; CAM-SE for climate change modeling; and LAMMPS, a molecular dynamics simulation code. Scott says ORNL, Cray and NVIDIA have been working together to adapt these science codes for heterogeneous computing so that they are ready to go when Titan boots up.

This first phase of Titan is expected to generate more than $60 million in revenue for Cray, which could end up in the company’s hands before the end of the year. Over the lifetime of the contract, Cray is looking to collect more than $97 million, although if upgrade options are exercised, that number could go considerably higher.
