TACC’s Texascale Days: Empowering Researchers to Tackle Grand Challenges with Frontera

April 3, 2024

April 3, 2024 — One of the biggest thrills for scientists who code is scaling up their simulations to push the limit of the most powerful supercomputers. Texascale Days at the Texas Advanced Computing Center (TACC) gives scientists that rare opportunity.

The quarterly event awards a handful of research groups full use of the National Science Foundation-funded Frontera supercomputer, the fastest supercomputer at any U.S. university and the leading capability system in the national cyberinfrastructure intended for large applications that require thousands of compute nodes.

Texascale Days on TACC’s Frontera supercomputer gives scientists full access to the most powerful supercomputer at any U.S. university. Shown is a volume rendered view of a thin equatorial slice of the model 25 Mʘ star. Credit: Paul Woodward, University of Minnesota.

“Texascale Days gives the researcher an opportunity to run code on problems at a scale that is not available during regular production on any NSF system,” said John Cazes, director of High Performance Computing at TACC.

Normally, at any given time, dozens of scientists share Frontera and run scientific computational jobs that need less than a quarter of its 8,300 Intel Cascade Lake Xeon nodes, supplemented by 90 graphics processing unit (GPU) nodes of NVIDIA Quadro RTX 5000. Allocations are requested through the Frontera user portal and the National Artificial Intelligence Research Resource.

“Texascale Days is different in that the simulations have demonstrated through smaller jobs that they can scale up to at least half the nodes on Frontera. It takes quite a bit of expertise and work to optimize the researcher’s code to hit that scale,” Cazes added.
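As a rough illustration of the scale thresholds described above, the sketch below works out what "a quarter" and "half" of Frontera's 8,300 CPU nodes mean in node counts, and shows one common way to judge from smaller runs whether a code scales (strong-scaling parallel efficiency). The function names and the timing values are hypothetical and are not drawn from any Texascale Days run or from TACC's selection process.

```python
# CPU node count quoted in the article for Frontera.
FRONTERA_CPU_NODES = 8_300


def node_fraction(nodes: int, total: int = FRONTERA_CPU_NODES) -> float:
    """Fraction of the machine occupied by a job that uses `nodes` nodes."""
    return nodes / total


def strong_scaling_efficiency(base_nodes: int, base_seconds: float,
                              nodes: int, seconds: float) -> float:
    """Parallel efficiency relative to a smaller baseline run of the same problem.

    With ideal scaling, run time falls in proportion to the added nodes,
    so efficiency = (base_seconds * base_nodes) / (seconds * nodes).
    """
    return (base_seconds * base_nodes) / (seconds * nodes)


# Thresholds mentioned in the text.
quarter_of_machine = FRONTERA_CPU_NODES // 4  # typical production jobs stay below this
half_of_machine = FRONTERA_CPU_NODES // 2     # scale that Texascale Days codes target

# Hypothetical timings (NOT from the article), for illustration only.
efficiency = strong_scaling_efficiency(
    base_nodes=512, base_seconds=1000.0,
    nodes=4096, seconds=150.0,
)

if __name__ == "__main__":
    print(f"Quarter of Frontera: {quarter_of_machine} nodes")
    print(f"Half of Frontera:    {half_of_machine} nodes")
    print(f"Efficiency going from 512 to 4,096 nodes: {efficiency:.2f}")
```

An efficiency close to 1.0 means the code uses the added nodes well. What counts as acceptable is a judgment call for each project and is not specified in the article.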

Following are highlights of production and benchmarking runs from the latest Texascale Days in February 2024.

Wish Upon a Star

The magnitude of the horizontal velocity component in stellar convection simulations is shown volume rendered in a thin slice through the center of a 25 Mʘ star at intervals of 8.202 days, beginning at 8.202 days. The internal gravity waves excited by the core convection zone grow and work their way outward in time to influence the wave oscillations in the entire stably stratified envelope between the two convection zones. Credit: Paul Woodward, University of Minnesota.

The team of astronomer Paul Woodward at the University of Minnesota, in collaboration with Falk Herwig’s team at the University of Victoria, has been studying convection and its effects on the deep interiors of massive stars for several years. The gravity wave oscillations that can be seen using instruments like the Kepler space telescope and the Transiting Exoplanet Survey Satellite can provide a unique window into the interior structure of massive stars.

“These internal gravity waves (IGWs) can provide a connection between simulations and observations,” Woodward said.

Stellar hydrodynamic simulations have demonstrated that IGWs are excited by convection in the stellar core, and Woodward’s team has shown that features in the spectrum of excited waves and their stochastic time dependence bear a resemblance to the low-frequency excess that is observed.

However, open questions about the origins of IGWs prevent the scientific community from fully exploiting asteroseismic observations of massive stars. To resolve these questions, the teams need simulations that reveal how the low-frequency waves are excited by core convection in the inner regions of the stable layer.

“The fine simulation grids and the high computational performance made possible with our PPMstar code running on Frontera enables us to resolve both the core convection, the near-surface convection, and the proper excitation and damping of IGWs in the stably stratified envelope between these convection zones,” Woodward said. “Our team exploited the most recent Texascale Days opportunity to perform some first experiments at scale to maximum of 3,510 nodes, in which we include nearly the entire star in our computational domain.”

“These are our first simulations at scale of full star models,” he added. “We learn from these numerical experiments how much gravity wave signals can tell us about the structure of a massive star’s deep interior.”

Cosmic History

The ASTRID cosmological simulation models large volumes of the cosmos spanning hundreds of millions of light years yet can zoom in to very high resolution. Credit: ASTRID team.

ASTRID, one of the largest-ever cosmological simulations, was developed on Frontera, and it too had its day during Texascale Days. It maxed out Frontera at 8,192 nodes during the peak of the simulation. The goal is to study galaxy formation, supermassive black hole coalescence, and re-ionization over cosmic history.

“The Texascale Days run was very successful,” said Nianyi Chen of Carnegie Mellon University, “and utilized an optimized version of our cosmological hydrodynamics code MP-Gadget. We evolved the ASTRID simulation by about 100 million years while efficiently processing galaxy and black hole catalogs on the fly.” The science team includes Tiziana Di Matteo (CMU); Simeon Bird (UC Riverside); Yueying Ni (Harvard); and graduate students Yihao Zhou (CMU) and Yanhui Yang (UC Riverside).

“We finished massively parallelized I/O for a total of a few hundred terabytes of data during the 24-hour run. The adaptation of our code to the Frontera cores produced a speed-up of about 10 percent on our problem,” she added.

“The Texascale Day resources are crucial for this part of the ASTRID production run: our simulation is at the peak of cosmic star formation, and we need a larger memory to accommodate the information from the newly formed stars and galaxies. It provides a precious opportunity to test the scalability and reliability of our simulation code in a massively parallel context, allowing us to make further improvements to our simulation code for robust performance on large machines like Frontera and continue to push the simulation to the present day universe,” Chen said.

That’s A Moiré

The average computed density of the electron comprising one of the strongly bound excitonic states in a 55-atom silver nanoparticle (overlaid). Credit: The Jornada Group.

When two layers of atomically thin materials overlap, they can produce a moiré pattern that creates intriguing electronic phenomena such as superconductivity and ferromagnetism. What’s more, bouncing light off overlapping sheets of exotic materials can produce excitons, which are quasiparticles being studied for applications in new optical sensors and communication technology such as optical fibers and lasers.

“Using TACC’s Frontera supercomputer, we performed first-principles density functional theory calculations of the electronic ground state energies and wave functions for a plasmonic nanoparticle of experimentally relevant size,” said Felipe Jornada, an assistant professor in the Department of Materials Science and Engineering at Stanford University and a principal investigator at the SLAC National Accelerator Laboratory.

The Jornada Group needed over 4,000 nodes of Frontera to capture the atomistic details in these nanoparticles and the complex way that their electrons interact with light, using computationally demanding quantum-mechanical theories.

“This is the first calculation of its kind that addresses the intricate nature of the atomic structure of such nanoparticles, and the resultant correlations left behind in the electronic system after the photoexcitation,” Jornada said.

Plasmonic nanoparticles can be used to drive chemical reactions such as the production of ammonia fertilizer and hydrogen fuel, as well as plastic decomposition, using light instead of costly high temperature and pressure conditions created by burning fossil fuels.

“We think this is an exciting time where our theories, codes, and computational resources finally let us make practical predictions for new, light-driven chemical reactions,” added Ph.D. student Akash Ramdas in the Jornada Group.

Going Nuclear

Density and shape of the 24Mg ground state from first-principles nuclear theory calculations. The apparent deformation of the 24Mg nucleus is visible in the simulation obtained during the February 2024 Texascale Days on Frontera. Credit: Kristina Launey, LSU.

The isotope magnesium-24 (24Mg) is a heavy hitter in the universe. It’s one of the 10 most common elements in our galaxy and is vital in the synthesis of nuclei that form stars. During the Texascale Days event, a team led by Kristina Launey at Louisiana State University and Grigor Sargsyan at Michigan State University performed several large-scale simulations for the atomic nucleus of 24Mg across the entire 8,000+ nodes of Frontera.

Launey’s team uses a many-body method based on first-principles approaches, which takes into account the underlying interactions of protons and neutrons. Descriptions of alpha-conjugate nuclei (nuclei made up of multiples of alpha particles, i.e., two protons and two neutrons, such as 24Mg) are challenging to derive from first-principles approaches.

“Texascale Days allowed us to utilize the full power of one of the largest supercomputers in the world to expand the first-principle simulations to heavier and more challenging nuclei,” Launey said.

“Almost all chemical elements on Earth have been created in the stars thanks to complex chains of nuclear processes,” said Grigor Sargsyan, Michigan State University, who is a co-PI on the Frontera allocation and a member of Launey’s team that does these first-principle calculations.

“To understand how these chains proceed, a reliable description of nuclear properties is needed. Thanks to the modern-day supercomputers and the advances in nuclear modeling, we are greatly expanding our knowledge of nuclear properties and complement the measurements at the state-of-the-art nuclear physics laboratories,” Sargsyan said.

A New Vista

The large-scale experience gained from Texascale Days on Frontera applies to new systems on the horizon for TACC, such as Vista, slated for production in summer 2024 with an artificial intelligence focus.

“Texascale Days have been a great success for TACC in helping stress-test our flagship system, and for researchers in optimizing their codes to run at scales of the largest supercomputers in the world,” Cazes said. “We look forward to more years of Texascale Days on Frontera and on new, exciting systems to come.”


Source: Jorge Salazar, TACC
