CEED Annual Meeting: Exascale Challenges Met with Innovative Advanced Math Solutions

September 26, 2023

Sept. 26, 2023 — Since kicking off in 2016, the Department of Energy’s (DOE) Exascale Computing Project (ECP) has prepared high performance computing (HPC) software and scientific applications for exascale-class supercomputers, which are powerful enough to process a quintillion (10^18) calculations per second. During this time, Oak Ridge National Lab deployed the nation’s first exascale system, known as Frontier, soon to be joined by Argonne National Lab’s Aurora and Livermore’s El Capitan.

Crucial to the ECP’s success are its co-design centers, which tackle exascale-related challenges common across the DOE complex and its collaborators. Optimizing a scientific application’s underlying mathematical solutions leads to better performance, so the Center for Efficient Exascale Discretizations (CEED, pronounced “seed”) focuses on efficient numerical methods and discretization strategies for the exascale era. Read a CEED retrospective in the article “ECP co-design center wraps up seven years of collaboration.”

Led by LLNL computational mathematician Tzanio Kolev, CEED recently held its seventh and final annual meeting—an event nicknamed CEED7AM. Hosted at LLNL’s University of California Livermore Collaboration Center and including online attendees, the meeting featured breakout discussions, more than two dozen speakers, and an evening of bocce ball. LLNL administrators Linda Becker, Kathy Hernandez Jimenez, Jessica Rasmussen, and Haley Shuey helped organize the event.

Common Threads

Interest in the center’s progress and results goes beyond the ECP. Among the participants in the August 1–3 meeting were researchers from 7 national labs and 24 universities, plus several commercial companies and a few student interns. Kolev stated, “Our users benefit from the experience and wisdom of multiple teams with different history and different perspectives.”

Code portability and GPU performance were common threads through many presentations. CEED team members and collaborators alike described their evaluation of and improvements to performance on multiple types of computing systems including Frontier, El Capitan’s early access systems, and the Perlmutter supercomputer at the National Energy Research Scientific Computing Center. Speakers also provided a glimpse into applications that rely on CEED software, such as additive manufacturing, radiation hydrodynamics, and fusion reactors.

Exascale Optimization

Tim Warburton, director of Virginia Tech’s Parallel Numerical Algorithms research group and the John K. Costain Faculty Science Chair, described CEED’s libParanumal project, which provides experimental finite element solvers for heterogeneous computing architectures. libParanumal features a variety of finely tuned linear solvers—an efficiency optimization Warburton compared to building a fast car. “It’s not enough to tune just the engine. You also need good brakes [error estimates], weight reduction [mixed precision], and improved roads [vendor co-design],” he said.

Researchers use libParanumal’s benchmarking capabilities to “do the detective work” on HPC systems, Warburton explained. He summarized CEED’s software development alongside the HPC industry’s hardware advances, noting, “GPUs have become more powerful during the center’s lifetime, and we add benchmarking tests as GPUs find new ways to surprise us. We have to look at the impact of our software design choices every time a new hardware component or architecture change is introduced.”

Indeed, CEED projects like libParanumal have been tested on several different architectures including early access systems for Frontier. Emphasizing the purpose of co-design, Kolev added, “Being part of the ECP has given us credibility with vendors who will listen to what we have to say.”

CEED’s libParanumal finite element software is built with “good brakes”—algorithms that terminate when stopping conditions are met, such as in this convergence of error estimates for an isotropic mesh problem. See the preprint Stopping Criteria for the Conjugate Gradient Algorithm in High-Order Finite Element Methods.
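
To make the “good brakes” idea concrete, here is a minimal sketch of a conjugate gradient solver that halts as soon as a stopping condition is met. It is an illustration only: it uses a generic relative-residual test rather than the error-estimate criteria studied in the preprint, and it is not libParanumal code. The names `conjugate_gradient`, `laplacian_1d`, `rel_tol`, and `max_iter`, and the small 1D test problem, are assumptions introduced for this example.

```python
import math


def conjugate_gradient(apply_A, b, rel_tol=1e-8, max_iter=1000):
    """Solve A x = b for a symmetric positive definite operator A.

    Stops when ||r|| <= rel_tol * ||b|| (a generic relative-residual
    test chosen for illustration) or when max_iter is reached.
    Returns (x, iterations_used, converged_flag).
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rr = sum(v * v for v in r)
    b_norm = math.sqrt(rr)
    if b_norm == 0.0:
        return x, 0, True            # zero right-hand side: x = 0 is exact
    for k in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = sum(v * v for v in r)
        # The "brakes": stop once the residual is small relative to b.
        if math.sqrt(rr_new) <= rel_tol * b_norm:
            return x, k, True
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter, False


def laplacian_1d(v):
    """Apply the tridiagonal matrix tridiag(-1, 2, -1) to the vector v."""
    n = len(v)
    return [
        2.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


n = 50
b = [1.0] * n
x, iterations, converged = conjugate_gradient(laplacian_1d, b)

# Recompute the true residual to check the answer independently.
residual = [bi - ai for bi, ai in zip(b, laplacian_1d(x))]
residual_norm = math.sqrt(sum(v * v for v in residual))
print(converged, iterations, residual_norm)
```

A residual test like this one is cheap, but it measures only how well the linear system is satisfied. Criteria based on error estimates, the subject of the preprint, aim to stop the iteration relative to the accuracy of the discretization itself.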

Multiscale Efficiency

Numerical methods play a vital role in climate and weather research at the U.S. Naval Postgraduate School (NPS), where students and faculty in the Computational Mathematics Laboratory are exploring ways to refine the resolution of atmospheric circulation models. Frank Giraldo, NPS distinguished professor and Applied Mathematics Chair, presented his team’s work on hurricane simulations.

“Tropical cyclone rapid intensification, where wind velocities increase quickly in a short period, is a difficult and important problem,” Giraldo explained. “We’re trying to understand hurricanes better and track them in a full global simulation.” For instance, hurricanes can be hundreds of kilometers in size; to properly capture the phenomena, large-eddy simulations should resolve at about 100 meters. The NPS-developed Nonhydrostatic Unified Model of the Atmosphere (NUMA) and its lightweight version (xNUMA) solve coarse- and fine-scale problems with a multiscale modeling framework, dynamic adaptive mesh refinement, and time integration strategies.
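
A quick back-of-the-envelope calculation shows why resolving a storm at about 100 meters is demanding. The sketch below counts horizontal grid points for a square domain. The 500 km domain width is an assumed value standing in for “hundreds of kilometers,” and `horizontal_points` is a hypothetical helper, not part of NUMA.

```python
def horizontal_points(domain_km, spacing_m):
    """Return (points per side, total points) for a square horizontal grid."""
    per_side = int(round(domain_km * 1000 / spacing_m))
    return per_side, per_side * per_side


# Assumed 500 km x 500 km domain at the roughly 100 m spacing cited above.
fine_side, fine_total = horizontal_points(domain_km=500, spacing_m=100)

# The same domain at an assumed 1 km spacing, for comparison.
coarse_side, coarse_total = horizontal_points(domain_km=500, spacing_m=1000)

print(fine_side, fine_total)      # 5000 25000000
print(coarse_side, coarse_total)  # 500 250000
print(fine_total // coarse_total) # 100
```

Under these assumptions, a tenfold refinement in horizontal spacing multiplies the number of horizontal points by 100, before vertical levels and smaller time steps are taken into account.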

However, multiscale modeling requires many fine-scale simulations for each coarse-scale element. Giraldo pointed out, “These simulations are too computationally expensive to run on a regular basis, but you need the ability to do multiple simulations in order to advance science.” To reduce computational costs, his team is investigating solutions that leverage larger time-steps, reduced order models, or machine learning techniques.

Regardless of the approach, Giraldo emphasized that GPUs are necessary for processing complex 3D atmospheric simulations—particularly to achieve the National Weather Service’s mandate of generating a 24-hour forecast in just 8 wall-clock minutes. In past work, collaborating with Warburton and leveraging CEED’s OCCA portability library, Giraldo’s team successfully ran the NUMA code on Oak Ridge’s GPU-based Titan supercomputer.
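
The forecast target mentioned above can be restated as a ratio of simulated time to wall-clock time. The short sketch below does only that arithmetic; the variable names are chosen for illustration.

```python
forecast_hours = 24        # simulated forecast length
wallclock_minutes = 8      # allowed wall-clock time

# Simulated minutes advanced per wall-clock minute.
speedup_over_real_time = forecast_hours * 60 / wallclock_minutes

# Wall-clock seconds available per simulated hour.
seconds_per_simulated_hour = wallclock_minutes * 60 / forecast_hours

print(speedup_over_real_time)      # 180.0
print(seconds_per_simulated_hour)  # 20.0
```

In other words, the model must run 180 times faster than real time, with about 20 seconds of wall-clock time available for each simulated hour.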

Naval fleets depend on accurate oceanographic and meteorological predictions, and NUMA is the basis for the U.S. Navy’s NEPTUNE global atmospheric model, which will become operational in 2024. “The Navy has developed their own weather models for as long as numerical weather prediction has existed,” Giraldo noted.

Figure 3: Compared to the commonly used Weather Research and Forecasting (WRF) model (a,b), NUMA (c,d) resolves finer horizontal wind speed features at the same spatiotemporal points in a hurricane simulation. See the paper The Effects of Numerical Dissipation on Hurricane Rapid Intensification with Observational Heating.

‘A Banner Year’

The CEED team hails from Livermore and Argonne national labs and five universities: Rensselaer Polytechnic Institute, University of Colorado at Boulder, University of Illinois Urbana-Champaign, University of Tennessee, and Virginia Tech. “The meeting’s large turnout, including many collaborators and users beyond our team, indicates that the scientific community is enthusiastic about what we’ve done and continue to do,” noted Kolev. “I enjoyed all the talks and learned something from all of them. There were some very interesting ideas that I want to follow up on.”

Warburton says he has been continually impressed by CEED’s progress over the years. He reflected, “Despite being located across the country, the leadership team has managed the project superlatively, and as everyone gathered to showcase their achievements, it was clear this was a banner year for the collaboration.” First-time attendee Giraldo added, “The quality of the presentations, as well as the discussions during and after, were all at the highest level.”


Source: Holly Auten, LLNL
