Materials Science Simulation Achieves Extreme Performance at NERSC

September 8, 2022

Sept. 8, 2022 — Using the new Perlmutter system at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (Berkeley Lab), a team of researchers led by Paderborn University scientists Thomas D. Kühne and Christian Plessl used a new mixed-precision method to conduct the first electronic structure simulation to execute more than a quintillion (10^18) operations per second — one exaop. The team’s mixed-precision method is well-suited to running on Perlmutter’s thousands of GPU processors.

Simulation graphic of the COVID-19 spike protein simulated in aqueous solution, with the hydrogen and oxygen atoms removed. Credit: NERSC.

Of the quintillion-operations milestone, Plessl said: “The dimension of this number becomes clearer when you consider that the universe is about 10^18 seconds old. That means that if a human had performed a calculation every second since the time of the Big Bang, this calculation does the same work in a single second.”
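The arithmetic behind the analogy is easy to check, using the standard estimate of roughly 13.8 billion years for the age of the universe:

```python
# Back-of-the-envelope check: the age of the universe in seconds
# is of the order 10^18, so 10^18 ops/s matches one human
# calculation per second since the Big Bang.
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # ~3.16e7 s
age_universe_s = 13.8e9 * SECONDS_PER_YEAR # ~4.4e17 s, order 10^18
print(f"age of universe: {age_universe_s:.1e} seconds")
```
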

Scientific simulations typically use 64-bit arithmetic to achieve the high-precision results needed to represent physical systems and processes. With their new method, the Paderborn team showed that some real-world problems of interest can use lower-precision arithmetic for certain operations, taking full advantage of the tensor cores on Perlmutter’s NVIDIA A100 GPU accelerators.

The calculation used 4,400 GPUs on Perlmutter to perform a simulation of the SARS-CoV-2 spike protein. Kühne and Plessl used the submatrix method they introduced in 2020 for the approximate calculations. In this method, complex chemical calculations are broken down into independent pieces performed on small dense matrices. Because many nodes work on small problems simultaneously (what computing scientists call parallelism), the method is efficient and scales readily to problems of different sizes.
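The details of the submatrix method are in Kühne and Plessl’s 2020 paper; the following is only an illustrative sketch of the column-by-column idea, assuming a simple thresholding rule for selecting each dense submatrix. For a block-diagonal matrix the reconstruction is exact, which makes a convenient sanity check:

```python
import numpy as np
from scipy.linalg import expm

def submatrix_apply(A, f, tol=1e-12):
    """Approximate a matrix function f(A) column by column.

    For each column j, the rows with non-negligible entries select
    a small dense submatrix; f is evaluated on that dense block and
    the relevant column is scattered back. Every column is
    independent, which is what makes the method so parallel.
    """
    n = A.shape[0]
    F = np.zeros_like(A)
    for j in range(n):
        S = np.flatnonzero(np.abs(A[:, j]) > tol)
        if j not in S:                      # keep the diagonal index
            S = np.sort(np.append(S, j))
        sub = A[np.ix_(S, S)]               # small dense submatrix
        col = f(sub)[:, list(S).index(j)]   # column of f on the block
        F[S, j] = col
    return F

# Block-diagonal test case: here the method is exact, because each
# column's submatrix coincides with its own block.
A = np.zeros((3, 3))
A[:2, :2] = [[2.0, 1.0], [1.0, 2.0]]
A[2, 2] = 3.0

approx = submatrix_apply(A, expm)
exact = expm(A)
print(np.allclose(approx, exact))  # True for this block-diagonal A
```

For realistic sparse matrices the result is an approximation, but each dense block is exactly the kind of workload a GPU’s tensor cores handle well — which is the point Plessl makes below.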

“What’s neat about it is that it’s a method that’s inherently extremely parallel, so it’s extremely scalable,” said Plessl. “And that’s the reason we’re able to target the largest supercomputers in the world using this method. The other benefit of the method is that it’s very suitable for GPUs because it kind of converts a problem that is a sparse-matrix problem that is hard to solve on a CPU to a very parallel implementation where you can work on much smaller dense matrices. From a computer science perspective, I think it’s quite exciting.”

“However, people in the high-performance community have been a little bit critical about approximate approaches like our submatrix method,” said Kühne of the speed of their calculation. “It appeared nearly too good to be true, that is to say, we reached a very high degree of efficiency, allowing us to conduct complex atomistic simulations that were so far considered to be not feasible. Yet, having access to Perlmutter gave us the opportunity to demonstrate that it really works in a real application, and we can really exploit all the positive aspects of the technique as advertised, and it actually works.”

Kühne and Plessl approached NERSC after the June 2021 Top500 performance ranking of supercomputers ranked Perlmutter as number five in the world. There, they worked with Application Performance Specialist Paul Lin, who helped set them up for success by orienting them to the system and helping to ensure that their code would run smoothly on Perlmutter.

One major challenge, Lin said, was running complex code on a system as new as Perlmutter was at the time.

“On a brand-new system, it’s both challenging but also especially exciting to see science teams achieve groundbreaking scientific discoveries,” said Lin. “These types of simulations also help the center tune the system during deployment.”

Kühne and Plessl ran their calculations using CP2K, an open-source molecular dynamics code used by many NERSC users and others in the field. When they’re finished, they plan to write up and release their process for using the code at NERSC so that other users can learn from their experience. After that, they’ll keep working on the code itself.

“We’re just in the process of defining road maps for the further development of the CP2K simulation code,” said Plessl. “We’re getting more and more invested in developing the code, and making it more GPU-capable, and also more scalable and for more use cases — so NERSC users will profit from this work as well.”

As for the record, it’s an exciting development and a glimpse of what Perlmutter will be able to do for all kinds of science research going forward.

“We knew the system was capable of one exaop at this precision level, but it was exciting to see a real science application do it, particularly one that’s a traditional science application,” said NERSC Application Performance Group Lead Jack Deslippe, who also helped oversee the project. “We have a lot of applications now that are doing machine learning and deep learning, and they are the ones that tend to have up to this point been able to use the special hardware that gets you to this level. But to see a traditional materials-science modeling and simulation application achieve this performance was really exciting.”

This story contains information originally published in a Paderborn University news release.

About NERSC and Berkeley Lab

The National Energy Research Scientific Computing Center (NERSC) is a U.S. Department of Energy Office of Science User Facility that serves as the primary high-performance computing center for scientific research sponsored by the Office of Science. Located at Lawrence Berkeley National Laboratory, the NERSC Center serves more than 7,000 scientists at national laboratories and universities researching a wide range of problems in combustion, climate modeling, fusion energy, materials science, physics, chemistry, computational biology, and other disciplines. Berkeley Lab is a DOE national laboratory located in Berkeley, California. It conducts unclassified scientific research and is managed by the University of California for the U.S. Department of Energy.


Source: Elizabeth Ball, NERSC
