NCAR Prepares for Derecho, Its Third-Generation Weather and Climate Supercomputer

By Oliver Peckham

September 29, 2021

Derechos, the namesake of the new supercomputer coming to the National Center for Atmospheric Research (NCAR), are fast-moving, widespread bands of thunderstorms. Fittingly, NCAR itself is moving quickly and ambitiously with the new system – its third major installation since 2012. Irfan Elahi, director of NCAR’s high-performance computing division, recently spoke to HPCwire about the development and timeline for Derecho, as well as plans further into the future.

The nuts and bolts

First, the specs: Derecho, built by HPE, will be water-cooled and predominantly powered by third-generation AMD Epyc Milan CPUs and Nvidia’s 40GB A100 GPUs, with 2,488 CPU-only dual-socket nodes (256GB of memory per node) and 82 single-socket heterogeneous nodes (four A100s and 512GB of memory per node). In total, the system is equipped with 692TB of memory, 328 A100 GPUs and 5,058 Milan CPUs, all connected by HPE Slingshot v11 networking.
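For readers who want to check the totals, the short Python sketch below tallies them from the per-node figures quoted above. The function and constant names are illustrative only. One assumption is worth flagging: the article does not say how the 692TB memory figure was computed, but adding the A100s' 40GB apiece to the host memory, with 1TB taken as 1,000GB, reproduces it.

```python
# Per-node figures as quoted in the article.
CPU_NODES = 2488           # CPU-only nodes, two Milan CPUs and 256GB each
GPU_NODES = 82             # heterogeneous nodes, one Milan CPU, four A100s, 512GB each
CPU_NODE_MEM_GB = 256
GPU_NODE_MEM_GB = 512
GPUS_PER_GPU_NODE = 4
A100_MEM_GB = 40


def derecho_totals():
    """Tally CPUs, GPUs and memory from the per-node figures."""
    cpus = CPU_NODES * 2 + GPU_NODES * 1
    gpus = GPU_NODES * GPUS_PER_GPU_NODE
    host_mem_gb = CPU_NODES * CPU_NODE_MEM_GB + GPU_NODES * GPU_NODE_MEM_GB
    gpu_mem_gb = gpus * A100_MEM_GB
    # Assumption: the headline memory figure counts host plus GPU memory,
    # with 1TB = 1,000GB. The article does not state this explicitly.
    total_mem_tb = (host_mem_gb + gpu_mem_gb) / 1000
    return {
        "cpus": cpus,
        "gpus": gpus,
        "host_mem_gb": host_mem_gb,
        "gpu_mem_gb": gpu_mem_gb,
        "total_mem_tb": total_mem_tb,
    }


if __name__ == "__main__":
    for name, value in derecho_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the CPU and GPU counts match the article's 5,058 and 328, and the combined memory comes to roughly 692TB.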

This combined hardware will deliver 19.87 peak petaflops – more than triple the performance of Derecho’s predecessor, Cheyenne (5.34 peak petaflops). Cheyenne, installed in 2016, was itself preceded by the 1.26-peak-petaflops Yellowstone system in 2012.
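The "more than triple" claim can be checked with a single division. The sketch below uses only the two peak figures quoted above; the helper name `peak_ratio` is hypothetical, and peak petaflops say nothing about sustained application performance.

```python
# Peak performance figures as quoted in the article, in petaflops.
DERECHO_PEAK_PF = 19.87
CHEYENNE_PEAK_PF = 5.34


def peak_ratio(new_pf, old_pf):
    """Return how many times larger new_pf is than old_pf."""
    if old_pf <= 0:
        raise ValueError("old_pf must be positive")
    return new_pf / old_pf


if __name__ == "__main__":
    ratio = peak_ratio(DERECHO_PEAK_PF, CHEYENNE_PEAK_PF)
    print(f"Derecho vs. Cheyenne, peak: {ratio:.2f}x")
```

On the quoted numbers, Derecho's peak works out to about 3.7 times Cheyenne's.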

The Cheyenne system. Image courtesy of NCAR.

Derecho’s firepower will be deployed in service of all things atmospheric and many things environmental, with Elahi noting applications ranging from severe weather (thunderstorms, tornadoes and hurricanes) and climate change to water availability, wildfires, renewable energy, subsurface flow of oil and gas, solar storms and more. “Mainly the supercomputer will enable research that will lead to more detailed and useful prediction capabilities which will have significant societal benefit,” he said, “especially by getting more resilient to climate change.”

A gathering storm

HPE’s win was announced in January of this year, but plans for Derecho had been in the works for years prior. “We kicked off this project in … late summer 2018, and we started by doing a workload analysis study,” Elahi said. “We also then created a panel … and I think that it had 43 different members – diverse both in terms of gender [and] ethnicity but also in subject matter expertise, because we wanted to … look at this subject matter expertise across the earth systems sciences, and we worked with the Science Requirement Advisory Panel to get their requirements[.]”

Through this process, NCAR developed a suite of benchmarks for measuring a new system, which was then referred to as NWSC-3 after the NCAR-Wyoming Supercomputer Center (NWSC) where NCAR’s supercomputers have been housed. With the benchmarks in hand, NCAR put out an RFI for the system, working with “four or five” potential vendors, including through a workshop that brought together researchers and vendors to strategize for system-building. After the RFP was issued and the dust settled, Elahi said that NCAR selected for the best value – not the lowest cost – and landed on HPE.

Updating the forecast

Derecho is still, however, a little ways off. First, a test system – Gust – will launch around February of 2022. Then, Elahi said, “Derecho itself is going to be delivered sometime mid-March – the first quarter of next year.” Once it’s installed and tested – a six- to seven-week process, he said – NCAR will open the system to its inaugural external users. These users will come from the Accelerated Science Discovery (ASD) program, which is soliciting proposals from researchers whose projects involve “actionable science” of relevance to NCAR’s core objectives. These ASD users, Elahi explained, would help beta test the system for a couple of months over the summer before its wider launch and help NCAR to push itself into the next generation of supercomputing. “The whole idea about ASD is these new, upcoming applications,” he said.

“After ASD, we will open it to all of the user community,” Elahi continued. However, Cheyenne – which Elahi noted has been a remarkably reliable system, with just one power outage over the last few years – will continue running until “sometime in late December of 2022.” “In order to help our users transition and migrate to the new environment, we want to provide them an overlap of six months,” Elahi said.

Derecho will be housed in the NWSC datacenter, which Elahi said was LEED Gold certified. “The most important thing, I think, is application energy efficiency,” he said, speaking to the sustainability of the new system. “Because … growth in peak or sustained computing capability is just one thing – but power efficiency is another.” So, he said, Derecho will deliver around three to three and a half times more flops per watt than Cheyenne.

Looking further into the future, Elahi noted that computer processing was stagnating, pointing instead to technologies like accelerators, GPUs, FPGAs and AI as the sources of greater computing power and efficiency. And, he said, NCAR would be looking to push the efficiency angle even harder for its fourth system. “One of the things we want to do for our next system is also to look at carbon footprint and sustainability as a specification,” he said.
