Run-Up to Petaflops

By Thomas Sterling and Chirag Dekate

June 18, 2008

There is no other way to characterize this time in high performance computing: 2008 will be remembered as “the year” — the year that one petaflops was achieved in Linpack performance. It is a milestone that has been anticipated for almost a decade and a half, and one that was accomplished through the synthesis of two big trends that have emerged as the driving forces for HPC in the last few years — multicore and heterogeneous computing.

But there is much more to the events, technical advances, and new initiatives in HPC internationally throughout the last year than simply a single number, no matter how dramatic the milestone. The theme for this year, “Run-Up to Petaflops,” has involved a series of interrelated advances in technology, component architecture, and planning for large scale systems that has inaugurated the Petaflops Era. Some of these contributing events are considered briefly here.

This year has marked the next stage in the transition to “multicore, the new Moore’s Law” which was last year’s theme. Four-core sockets are replacing dual-core as we enter the second generation of the multicore technology base. AMD’s Barcelona quad-core chips are now available with new systems being configured to support them and some early generation systems being upgraded to exploit them for a mid-life kicker. The Intel Clovertown chip, also a quad-core Xeon processor, is now being incorporated as well. From IBM, the new Power6 architecture on 65 nanometer technology is designed to be configured with up to 16 cores and is establishing new industry clock rates from 3.5 GHz to 4.7 GHz.

The move to 45 nanometer technology has been a hallmark of 2008 with major vendor offerings being announced and prepared for delivery for the second half of this year. Intel’s new fabrication line in Chandler, Ariz., will provide high-volume manufacturing of 45 nanometer components. Intel introduced its hafnium-based high-k metal gate silicon technology for unprecedentedly low leakage current. The Dunnington Intel processor will be produced by this process, and will be available in the second half of this year with six cores per socket. AMD’s 45 nanometer fab in Dresden, Germany, which uses full-field EUV lithography, will produce the quad-core “Shanghai” by the second half of 2008. This is to be followed by the six-core Istanbul processor in 2009. IBM is projected to release the Power7 processor, which has been developed in part with DARPA HPCS funding, in 2010.

Heterogeneous computing in its various forms has captured the imagination of the supercomputing community with the excitement of outstanding raw performance, tempered only by a realistic concern about programming methodologies. ClearSpeed has introduced its second generation SIMD attached array processor, significantly improving its interconnect bandwidth and optimizing the average power dissipation. The ClearSpeed accelerators are an important component in the Japanese TSUBAME 100 teraflops system. NVIDIA is moving toward a GPU in every PC with its GeForce series delivering 10x or better speed-ups on some application kernels. IBM has introduced an important upgrade to the original Cell architecture used in the Sony PlayStation 3 game console. The new PowerXCell 8i processor chip combines both heterogeneity and multicore to provide a tour de force in processor technology. But most important to the supercomputing community and market is its upgraded SPE core, which includes full 64-bit floating point arithmetic units at 12.8 gigaflops peak performance. That works out to roughly 100 gigaflops (8 × 12.8 = 102.4) across the eight SPE cores, which are integrated with a separate PowerPC core for general services.
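The per-chip figure for the PowerXCell 8i can be checked with a few lines of arithmetic. The sketch below uses only the two numbers quoted above (12.8 gigaflops per SPE, eight SPEs per chip); the constant and function names are illustrative, and the PowerPC core is left out of the total because the article gives no figure for it.

```python
# Peak 64-bit floating point rate of the SPE array on one PowerXCell 8i chip.
# Both constants come from the article; the names are illustrative only.
SPE_PEAK_GFLOPS = 12.8  # peak gigaflops per SPE core
SPE_COUNT = 8           # SPE cores per chip


def chip_peak_gflops(per_core=SPE_PEAK_GFLOPS, cores=SPE_COUNT):
    """Return the aggregate peak of the SPE cores, in gigaflops."""
    return per_core * cores


if __name__ == "__main__":
    print(f"SPE array peak: {chip_peak_gflops():.1f} gigaflops")
```

The result is slightly above the rounded 100 gigaflops usually quoted for the chip.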

Over the last year, the international community has established a multi-initiative, world-wide set of programs to harness the power of these technologies to deliver petaflops capability into the hands of real-world users in science, technology, commerce, and defense applications. During that time, the fastest general-purpose machine, Blue Gene/L at LLNL, was upgraded by IBM to exceed half a petaflops peak performance, delivering 478 teraflops of sustained Linpack performance. The fastest machine in Europe is the next generation of this family of systems, Blue Gene/P at the Jülich Research Centre in Germany. Called “JUGENE,” this system of almost a quarter of a petaflops peak capability has delivered 167 teraflops sustained with 32 terabytes of main memory. This new Blue Gene generation system incorporates the new 850 MHz quad-core PowerPC 450.

The trend of upgrading existing systems has proved to be an important path to extending the useful lifetime of major systems, providing superior capability at a fraction of the cost to end users and agencies. The 124 teraflops Red Storm system at Sandia National Laboratories, which was the prototype for the major line of Cray XT systems, is scheduled to be augmented to a peak capability of between 250 and 284 teraflops, using quad-core AMD Opterons. And the Earth Simulator, one of the most important systems on the TOP500 list, is to be upgraded by NEC to a full capability of 131 teraflops by early next year.

In Japan, the new Keisoku program will be managed by Riken and will involve the collaboration of Hitachi, NEC, and Fujitsu. The goal is to build a 10 petaflops machine to be deployed in Kobe in 2012.

The U.S. National Science Foundation has selected IBM to provide its leadership-class “Blue Waters” system to be deployed at UIUC in 2011. That system is to be based on technology developed under the IBM PERCS project, which is sponsored by the DARPA HPCS Program. NSF will also install a second mid-range HPC system in Tennessee based on advanced Cray architecture.

In 2007, India deployed its first top 10 system, named “Eka,” at the Computational Research Laboratories, Tata Sons. That machine uses the HP Blade Cluster Platform 3000 BL460c and delivers a peak performance of 170 teraflops. China continues its steady advance in the HPC arena with the installation of a series of significant terascale systems, including a 38 teraflops Intel Woodcrest-based IBM BladeCenter. Equally interesting is the development of their Loongson-2E CPU chip on 90 nanometer process technology.

But the big news — well timed for ISC — is Roadrunner, the fastest machine in the world and the first system to achieve one petaflops Linpack performance. Roadrunner, which will be deployed at Los Alamos National Laboratory, was developed under DOE contract by IBM and marks the first major system to rely principally on a heterogeneous architecture to achieve its performance. Based on the IBM PowerXCell 8i described above, and the AMD Opteron, this breakthrough machine delivers 1.3 petaflops peak performance.

Even as the achievement of a petaflops is being heralded as the entry into a new era of high performance computing, the challenges of exascale computing are being explored by the community. As reported last year, both DOE and DARPA undertook to study the application, technology, system requirements, and implications of sustained exaflops computer implementation and operation. The studies demonstrated the importance of such capability to many applications critical to science, technology, and society. But these early investigations also exposed the daunting technological challenges confronting any such endeavor.

While numbers can vary significantly depending on underlying assumptions, representative estimates from a number of sources suggest power consumption in the range of 120 megawatts (+/- 50 percent), concurrency at the multi-billion-way level of parallelism, number of cores between 100 million and 500 million, and system-wide latencies in the tens of thousands of cycles.
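To put the power estimate in perspective, it can be converted into an energy budget per floating point operation. The sketch below is a back-of-the-envelope calculation, not part of the cited studies: it assumes the machine sustains exactly one exaflops (10^18 operations per second) and applies the 120 megawatt, +/- 50 percent range quoted above. The function names are illustrative.

```python
# Energy per operation implied by the exascale power estimates in the text.
# Assumption (not from the studies): a sustained rate of exactly 1 exaflops.
EXAFLOPS = 1e18           # floating point operations per second
NOMINAL_POWER_MW = 120.0  # representative estimate quoted in the article
TOLERANCE = 0.5           # the +/- 50 percent band quoted in the article


def power_range_mw(nominal=NOMINAL_POWER_MW, tol=TOLERANCE):
    """Return the (low, high) power estimates in megawatts."""
    return nominal * (1 - tol), nominal * (1 + tol)


def picojoules_per_flop(power_mw, flops_per_s=EXAFLOPS):
    """Convert a power draw in megawatts to picojoules per operation."""
    watts = power_mw * 1e6
    return watts / flops_per_s * 1e12


if __name__ == "__main__":
    low, high = power_range_mw()
    for mw in (low, NOMINAL_POWER_MW, high):
        print(f"{mw:6.0f} MW -> {picojoules_per_flop(mw):6.0f} pJ per flop")
```

Under these assumptions the whole system, including memory, interconnect, and cooling, would have to operate within a budget on the order of a hundred picojoules per operation.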

The expected dates for such systems are as aggressive as the middle of next decade. Extrapolation of the TOP500 list suggests a deployment at the end of next decade. With concerted effort, an ambitious but not unrealistic deployment could occur in 2018. But this will require real research investment programs to be initiated within the next year and a half. It is hard to believe, but it may be possible that the authors will be writing an HPCwire article a decade from now about the year that was the “Run-Up to Exaflops.”
