Eyes on the Prize: TACC’s Frontera Quickly Ramps up Science Agenda

By John Russell

September 9, 2019

Announced a year ago and officially launched a week ago, the Texas Advanced Computing Center’s Frontera – now the fastest academic supercomputer (~25 petaflops, Linpack) – is already establishing an aggressive science agenda as early-science teams port their codes to the system. TACC announced a few details of early science projects for Frontera spanning medicine, cosmology, energy research, quantum chemistry, and turbulence simulations.

Given the cost of these machines ($60 million for Frontera), there has been a stronger mandate in recent years for leadership systems such as Frontera to dig into meaningful projects fast.

Frontera is certainly an impressive machine: It is powered by Intel’s highest-bin Cascade Lake-SP processors (the Xeon Platinum 8280s), with HDR-100 InfiniBand links to each node and full 200 Gbps HDR links between leaf and core switches. DataDirect Networks contributed the main storage system (50+ PB of disk, 3 PB of flash, and 1.5 TB/sec of I/O capability). (See the HPCwire article, Fastest Academic Supercomputer Enters Full Production at TACC, Just in Time for Hurricane Season.)

Here are brief descriptions of six of the early science programs planned for Frontera, followed below by a bit more on each project taken from articles on the TACC website and lightly edited (links to TACC articles about each project are also below):

  • Cancer Research. George Biros, professor at The University of Texas at Austin, with joint appointments at the Oden Institute for Computational Engineering and Sciences and the Walker Department of Mechanical Engineering, is leading an effort to apply massive, high-speed computers, machine learning, and biophysical models of cells to the problem of diagnosing and treating gliomas.
  • Solar Energy. Ganesh Balasubramanian, an assistant professor of Mechanical Engineering and Mechanics at Lehigh University, is studying the dynamics of organic photovoltaic materials. He is working to develop efficient ways to create next generation flexible solar photovoltaics that can exceed the energy-producing potential of today’s devices.
  • Quantum Chemistry. Olexandr Isayev, an assistant professor of chemistry at the University of North Carolina at Chapel Hill, is focused on solving chemical problems with machine learning, molecular modeling, and quantum mechanics. “For the past five years, I’ve looked at how machine learning can help us solve otherwise unsolvable challenges in chemistry,” Isayev said.
  • Cosmology. Manuela Campanelli, professor of Astrophysics at the Rochester Institute of Technology and director of the Center for Computational Relativity and Gravitation, explores the cataclysmic collision of neutron stars that produced gravitational waves detected in 2017 by the Laser Interferometer Gravitational-Wave Observatory (LIGO); the Europe-based Virgo detector; and some 70 ground- and space-based observatories.
  • Virus Infection. Peter Kasson, an associate professor of Molecular Physiology and Biomedical Engineering at the University of Virginia, studies the mechanisms of viral infection. “We have to combine experiments with computer models where we build a model of the virus, one atom at a time, and then simulate the mechanics of how the atoms interact,” said Kasson.
  • Turbulence Processes. Diego Donzis, an associate professor in the Department of Aerospace Engineering at Texas A&M University, will use Frontera “to run some of the simulations that will allow us to answer some long-standing and new questions we have about the process of mixing in compressible flows.”

FIGHTING CANCER WITH DETAILED MODELS

Biros is working to build bio-physical models of brain tumor development that include more factors than ever before, and to train automated medical image processing systems to detect the extent of cancers beyond the main tumor growth, which must be removed during surgery to prevent the cancer from returning.

[Image caption] Results for a real tumor from the BraTS’18 TCIA dataset: the tumor core (enhancing and necrotic tumor cells) is shown as a gray wireframe, together with the reconstructed initial condition (magenta volume) and parts of the patient’s brain geometry.

“We know that as tumors grow, they interact mechanically with the surrounding healthy brain tissue. One hypothesis is that quantifying this interaction may give clues on specific mutations that drive the cancer. Another hypothesis is that if we can figure out where exactly the tumor started this will also give us information on specific mutations,” said Biros.

Biros and his team are trying to train more complex models than have ever been created, containing parameters that capture how new blood vessels form, and how diverse types of cells within a tumor interact. Doing so means incorporating data from many patients.

“We can easily come up with models that have hundreds of parameters. But with these models, even to test out basic hypotheses, we need to conduct simulations on a big machine,” Biros said. “The algorithm and application development and training need a big resource capable of a quick turnaround. Without Frontera, and the support we have received from the TACC staff, it would be impossible.”

“You need state-of-the-art resources to do science,” he explained. “With Frontera, everything is integrated in the system — GPUs, CPUs, visualization, analysis, common file systems. That’s exceptional, especially at this scale.” Link to TACC article on this project.

SEARCHING FOR IMPROVED PHOTOVOLTAICS

Balasubramanian was among the early users of Frontera. Actively collaborating with experimentalists, he is working to develop efficient ways to create next-generation flexible solar photovoltaics that can exceed the energy-producing potential of today’s devices.

“Our work involves simulation of solvent evaporation processes found in a typical spin coating experiment,” he said. “In order to compare results from atomistic simulations with images produced during experiments, large-scale computations are required.”

His typical simulations contain over one hundred million superatoms (clusters of atoms that exhibit some of the properties of elemental atoms) and replicate the physical movements and interactions among them. Alongside these large simulations, Balasubramanian performs computations to optimize design variables in order to improve specific properties.
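
The article gives no code, but the underlying technique (advancing superatom positions under pairwise forces, timestep after timestep) is a standard molecular dynamics loop. The toy Python sketch below uses a generic Lennard-Jones potential in reduced units on a few dozen particles; it illustrates only the structure of such a loop and is not Balasubramanian’s force field or production code.

```python
import numpy as np

# Toy coarse-grained MD sketch (illustrative only): each "superatom" is a
# point particle interacting through a generic Lennard-Jones potential in
# reduced units. This shows the shape of the time-stepping loop, not the
# organic-photovoltaic force field or the scale of the actual simulations.

side, box, dt, steps = 4, 10.0, 0.005, 1000
n = side ** 3                                    # 64 particles on a cubic lattice
pos = np.array([[x, y, z] for x in range(side)
                          for y in range(side)
                          for z in range(side)], dtype=float) * (box / side)
rng = np.random.default_rng(0)
vel = rng.normal(0.0, 0.5, size=(n, 3))
mass = 1.0

def forces(pos):
    """Pairwise Lennard-Jones forces with periodic (minimum-image) boundaries."""
    f = np.zeros_like(pos)
    for i in range(n):
        d = pos[i] - pos                         # vectors from every particle to i
        d -= box * np.round(d / box)             # minimum-image convention
        r2 = np.sum(d * d, axis=1)
        r2[i] = np.inf                           # exclude self-interaction
        inv6 = 1.0 / r2 ** 3
        coeff = 24.0 * (2.0 * inv6 ** 2 - inv6) / r2
        f[i] = np.sum(coeff[:, None] * d, axis=0)
    return f

f = forces(pos)
for step in range(steps):                        # velocity-Verlet integration
    vel += 0.5 * dt * f / mass
    pos = (pos + dt * vel) % box
    f = forces(pos)
    vel += 0.5 * dt * f / mass
print("final kinetic energy:", 0.5 * mass * np.sum(vel ** 2))
```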

“With some of our initial simulations on Frontera, we have been able to improve by a factor of four to five in terms of computing speed,” he said. Whereas a simulation of 100,000 atoms and a few million timesteps would run at roughly 100 timesteps per second on a typical supercomputer, on Frontera Balasubramanian has achieved speeds of approximately 500 timesteps per second.

“Understanding the morphology of these large-scale simulations would help us correlate the structure, properties, and performance of organic photovoltaics,” he said. Link to TACC article on the project.

LEVERAGING ML FOR QUANTUM MOLECULAR MODELING

“For the past five years, I’ve looked at how machine learning can help us solve otherwise unsolvable challenges in chemistry,” Isayev said. He noted that to truly determine how a molecule will respond to cells in real world conditions — treating diseases but also potentially causing side-effects — often requires an understanding of the quantum mechanical behavior of many interacting atoms.

Students and postdocs in Isayev’s lab trained a neural network that can accurately describe the potential energy of molecules based on their three-dimensional structure. In a recent paper published in Nature Communications, his team and that of Adrian Roitberg at the University of Florida showed that by combining several tricks from machine learning, a system can learn coupled cluster theory — a “gold standard” quantum mechanical method used for describing many-body systems — and transfer this knowledge to a neural network.

“We’re using machine learning to accelerate quantum mechanics,” Isayev explained. “We train a neural network to approximate the solution of the Schrödinger equation, in our case solving density functional theory (DFT) equations for organic molecules first.”

The approach Isayev used is called transfer learning. It combines a large number of less-intensive DFT calculations that provide a rough approximation of the system’s behavior with a smaller set of coupled cluster calculations that refine the details of the model.
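
For readers unfamiliar with transfer learning, the sketch below shows the two-stage idea on synthetic data: pretrain an energy model on many cheap lower-fidelity labels (standing in for DFT), then fine-tune only part of the network on a small set of higher-accuracy labels (standing in for coupled cluster). The network, data, and training settings are hypothetical placeholders, not the actual models or datasets used by the Isayev and Roitberg teams.

```python
import torch
import torch.nn as nn

# Illustrative transfer-learning sketch on synthetic data: pretrain an energy
# model on many cheap "DFT-level" labels, then fine-tune only the final layer
# on a small set of expensive "coupled-cluster-level" labels. Hypothetical
# stand-in for the strategy described above, not the published architecture.

torch.manual_seed(0)
n_features = 32                                  # stand-in molecular descriptor size

def toy_labels(x, noise):
    """Fake 'quantum' energies: same underlying function, different fidelity."""
    return x.sin().sum(dim=1, keepdim=True) + noise * torch.randn(x.shape[0], 1)

x_dft = torch.randn(5000, n_features)            # plentiful, cheaper labels
y_dft = toy_labels(x_dft, noise=0.10)
x_cc = torch.randn(200, n_features)              # scarce, higher-accuracy labels
y_cc = toy_labels(x_cc, noise=0.01)

model = nn.Sequential(
    nn.Linear(n_features, 64), nn.GELU(),
    nn.Linear(64, 64), nn.GELU(),
    nn.Linear(64, 1),
)
loss_fn = nn.MSELoss()

def fit(xs, ys, params, epochs, lr):
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(xs), ys)
        loss.backward()
        opt.step()
    return loss.item()

# Stage 1: pretrain the whole network on the large DFT-level set.
print("DFT pretrain loss:", fit(x_dft, y_dft, model.parameters(), 300, 1e-3))

# Stage 2: freeze the feature layers, refine only the head on CC-level data.
for p in model[:-1].parameters():
    p.requires_grad = False
print("CC fine-tune loss:", fit(x_cc, y_cc, model[-1].parameters(), 300, 1e-3))
```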

“Instead of using 100 million CPU hours, you only use one percent of that amount and rely on cheaper methods,” Isayev explained. “We were able to achieve a nine order-of-magnitude speed up for certain applications using neural networks. Once the neural network is trained, you can run pretty accurate calculations, essentially on your laptop in a fraction of a second.” Link to TACC article on this project.

UNDERSTANDING CATACLYSMIC EVENTS IN THE UNIVERSE

“My research uses supercomputers to simulate very compact objects in the universe, such as black holes and neutron stars,” Campanelli explained. “These objects emit extremely powerful bursts of gravitational radiation, and in the case of neutron stars, they also emit very powerful bursts of electromagnetic signals. I work to simulate these events on supercomputers to predict what kind of signals they produce, and then pass these simulation results to our colleagues in astronomy so they know what they are looking for.”

Frontera has been allowing Campanelli to explore the cataclysmic collision of neutron stars that produced gravitational waves detected in 2017 by the Laser Interferometer Gravitational-Wave Observatory (LIGO); the Europe-based Virgo detector; and some 70 ground- and space-based observatories.

“We’re doing the most accurate and longest simulation ever of this collision to answer some of the key questions about what LIGO observed and what type of electromagnetic signals were emitted during this process,” she said.

In addition to exploring the specific neutron star collision, the project advances computational methods for understanding the dynamics of ejection, accretion, winds, and jets in neutron star mergers, work that is supported by a $1.5 million grant from NASA.

“These mergers expose the extremes of gravitational, electromagnetic and particle physics,” said Campanelli. “They are some of the greatest opportunities for multi-messenger science and the combined study of bursts of light spanning across the electromagnetic spectrum and powerful gravitational wave emissions.”  Link to TACC article on this project.

UNRAVELLING VIRUS INFECTION MECHANISMS

“We work to understand viral infections such as influenza and Zika,” Kasson said. “What we do guides the development of new antiviral therapies, and also helps us assess how well vaccines work and how well people’s immunity can prevent new viral threats from causing widespread disease in the United States.”

Kasson and his team observe viruses experimentally by tagging them with fluorescent proteins and using microscopy to understand how they affect cells. However, the experiments provide them with a very limited level of detail. Kasson relies on computer modeling in conjunction with experiments.

Their research uses experimental data to refine their simulations, and it has the potential both to serve as a test case for, and to help develop, large-scale adaptive ensemble methods — programs that run many simulations, examine the results, and decide what to run next, so that the process of choosing which simulations to perform is automated along with the simulations themselves.
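
An adaptive ensemble workflow of the kind described above can be summarized in a few lines: launch a batch of simulations, score the results, and let the scores choose the starting points for the next batch. The Python sketch below uses hypothetical stand-in functions (run_simulation, select_next_starts), not Kasson’s actual simulation or analysis code.

```python
import random

# Sketch of an adaptive ensemble loop: run a batch of simulations, analyze
# the results, and let the analysis pick the next round of starting points.
# All functions are illustrative stand-ins, not a real MD workflow.

def run_simulation(start_state, seed):
    """Stand-in for one MD run; returns an end state and a score
    (e.g. progress along a hypothetical fusion reaction coordinate)."""
    random.seed(seed)
    end_state = start_state + random.uniform(-0.05, 0.15)
    return end_state, end_state

def select_next_starts(results, n_next):
    """Adaptive step: seed the next round from the most promising half."""
    ranked = sorted(results, key=lambda r: r[1], reverse=True)
    top = [state for state, _ in ranked[: max(1, len(ranked) // 2)]]
    return [top[i % len(top)] for i in range(n_next)]

starts = [0.0] * 8                               # initial ensemble of 8 runs
for generation in range(5):
    results = [run_simulation(s, seed=generation * 100 + i)
               for i, s in enumerate(starts)]
    starts = select_next_starts(results, n_next=8)
    best = max(score for _, score in results)
    print(f"generation {generation}: best progress = {best:.3f}")
```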

Kasson leads one of the 34 research groups selected to participate in the Frontera early user period. “The initial experience has been extremely smooth. We’ve been able to get some exciting preliminary results that we’re very eager to run further,” Kasson said. “In the time we’ve been using Frontera, our simulations are proceeding two or three times faster than on the prior supercomputers we’ve had access to.” Link to TACC article on the project.

TURNING FRONTERA INTO A TURBULENCE BUSTER

Turbulence is so complicated that scientists today try to simplify it as much as possible while still retaining its basic physics. One common simplification is to assume that the turbulent flow is incompressible, that is, of constant density. This works as a good approximation for low-speed flows, but it falls apart for high-speed turbulent flows, which matter for a wide range of applications and phenomena such as the mixing of fuel in the combustion engines of cars, planes, and rockets.

Donzis, an early user of the Frontera system, is no stranger to NSF supercomputers. He developed his group’s code, called Compressible Direct Numerical Simulations (cDNS), on several different systems – among them Stampede1, Stampede2, and now Frontera – and has successfully scaled cDNS up to a million cores on the Department of Energy supercomputers Titan and Mira.

“On Frontera, we would like to run some of the simulations that will allow us to answer some long-standing and new questions we have about the process of mixing in compressible flows. Only recently, with computers reaching very high levels of parallelism, can we tackle problems in compressible turbulence at conditions that are relevant to applications,” Donzis said.

More computing power translates to added detail in computer models, which can solve more equations that capture the interactions between turbulence and temperature, pressure, and density — features not accounted for in incompressible flows.
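
As a rough illustration of the extra coupling involved (these are standard fluid-dynamics relations, not equations specific to the cDNS code): incompressible flow assumes constant density, so the continuity equation collapses to a divergence-free condition on the velocity, while compressible DNS must evolve density and momentum together, along with an energy equation that ties in temperature and pressure.

```latex
% Incompressible assumption: constant density, divergence-free velocity
\nabla \cdot \mathbf{u} = 0
% Compressible flow: density and momentum evolve and couple to pressure;
% an energy equation (omitted here) brings in temperature as well
\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) = 0, \qquad
\frac{\partial (\rho \mathbf{u})}{\partial t}
  + \nabla \cdot (\rho \mathbf{u} \otimes \mathbf{u})
  = -\nabla p + \nabla \cdot \boldsymbol{\tau}
```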

“Frontera will be well-suited for us to run these simulations,” Donzis explained. “Mainly it’s the size of Frontera, which will make some of these unprecedented simulations possible. Also, something attractive to us is that it’s based on well-known architectures and well-known components. We can predict, we hope more or less accurately, how the code will behave, even at very large scales on Frontera. We believe that a full-scale, full-machine run on Frontera will be very efficient.” Link to TACC article on the project.
