Inside the Gordon Bell Prize Finalist Projects

By Oliver Peckham

September 7, 2022

The ACM Gordon Bell Prize, which comes with a $10,000 award courtesy of HPC luminary Gordon Bell, is widely considered the highest prize in high-performance computing. Each year, six finalists are selected who represent the pinnacle of outstanding research achievements in HPC. Last month, listings on the SC22 schedule revealed those finalists. Over the last few weeks, HPCwire got in touch with members of the six finalist teams to learn more about their projects.

Last year, for the first time, the Gordon Bell Prize nominees included two projects powered by exascale computing — specifically, China’s “new Sunway supercomputer,” also known as OceanLight. These research papers, at the time, constituted the most substantively “official” reveal of the system (which remains unranked). One of those OceanLight-powered papers — a challenge to Google’s quantum supremacy claim — won that year’s Gordon Bell Prize.

In 2022, OceanLight has exascale-caliber competition: not one but two of the other five finalist projects used the new American exascale supercomputer, Frontier, which launched earlier this year at Oak Ridge National Lab (ORNL). And, beyond OceanLight and Frontier, previous Top500-toppers Fugaku (RIKEN) and Summit (ORNL) both return to the list under multiple finalist teams, along with Perlmutter (at NERSC, the National Energy Research Scientific Computing Center) and Shaheen-2 (at KAUST, the King Abdullah University of Science and Technology).

And now: the finalist projects.

Using OceanLight to simulate millions of atoms

This year sees OceanLight return to the stage as the sole supercomputer behind a paper titled 2.5 Million-Atom Ab Initio Electronic-Structure Simulation of Complex Metallic Heterostructures with DGDFT — a project involving simulations of millions of atoms that made use of tens of millions of cores on OceanLight.


Abstract: Over the past three decades, ab initio electronic structure calculations of large, complex and metallic systems are limited to tens of thousands of atoms in both numerical accuracy and computational efficiency on leadership supercomputers. We present a massively parallel discontinuous Galerkin density functional theory (DGDFT) implementation, which adopts adaptive local basis functions to discretize the Kohn-Sham equation, resulting in a block-sparse Hamiltonian matrix. A highly efficient pole expansion and selected inversion (PEXSI) sparse direct solver is implemented in DGDFT to achieve O(N^1.5) scaling for quasi two-dimensional systems. DGDFT allows us to compute the electronic structures of complex metallic heterostructures with 2.5 million atoms (17.2 million electrons) using 35.9 million cores on the new Sunway supercomputer. In particular, the peak performance of PEXSI can achieve 64 PFLOPS (∼5 percent of theoretical peak), which is unprecedented for sparse direct solvers. This accomplishment paves the way for quantum mechanical simulations into mesoscopic scale for designing next-generation energy materials and electronic devices.
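
To put the O(N^1.5) figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a hypothetical baseline of 25,000 atoms (standing in for the abstract's "tens of thousands"), assumes that a conventional dense diagonalization-based solver grows as N^3, and ignores constant factors entirely. None of these numbers come from the paper beyond the 2.5-million-atom target and the 1.5 exponent.

```python
def cost_growth(n_small, n_large, exponent):
    """Factor by which cost grows when N rises from n_small to n_large,
    for a method whose cost scales as N**exponent (constants ignored)."""
    return (n_large / n_small) ** exponent


baseline_atoms = 25_000      # hypothetical stand-in for "tens of thousands of atoms"
target_atoms = 2_500_000     # system size reported in the abstract

# Assumed cubic scaling for a conventional dense eigensolver.
cubic_growth = cost_growth(baseline_atoms, target_atoms, 3.0)

# The O(N^1.5) scaling quoted for PEXSI on quasi two-dimensional systems.
pexsi_growth = cost_growth(baseline_atoms, target_atoms, 1.5)

print(f"100x more atoms, N^3 scaling:   ~{cubic_growth:,.0f}x the work")
print(f"100x more atoms, N^1.5 scaling: ~{pexsi_growth:,.0f}x the work")
```

Under these assumptions, a hundredfold increase in atom count costs roughly a million times more work with cubic scaling but only about a thousand times more with N^1.5 scaling, which is the gap that makes million-atom runs plausible at all.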

Per the SC22 schedule, this team includes researchers from the Chinese Academy of Sciences, Peking University, the Pilot National Laboratory for Marine Science and Technology, the National Research Center of Parallel Computer Engineering and Technology, the Qilu University of Technology and the University of Science and Technology of China.


“Our team is highly excited [to be] nominated for the Gordon Bell Prize finalists as we started preparation for this work since last year,” said Qingcai Jiang, a researcher at the University of Science and Technology of China (USTC), in an email to HPCwire. “Our work for the first time achieves plane-wave precision electronic structure calculation for large-scale complex metallic heterostructures containing 2.5 million atoms (17.2 million electrons), and our optimization techniques make our work able to achieve peak performance of 64 PFLOPS (∼5 percent of theoretical peak), which is unprecedented for sparse direct solvers.”

Frontier powers biomedical literature analytics

The first of the two projects powered by Frontier, titled ExaFlops Biomedical Knowledge Graph Analytics, also made use of ORNL’s previous chart-topper, Summit, and focuses on large-scale mining of biomedical research literature.

The Frontier supercomputer, one of six represented in the Gordon Bell Prize finalists.

Abstract: We are motivated by newly proposed methods for mining large-scale corpora of scholarly publications (e.g., full biomedical literature), which consists of tens of millions of papers spanning decades of research. In this setting, analysts seek to discover relationships among concepts. They construct graph representations from annotated text databases and then formulate the relationship-mining problem as an all-pairs shortest paths (APSP) and validate connective paths against curated biomedical knowledge graphs (e.g., SPOKE). In this context, we present COAST (Exascale Communication-Optimized All-Pairs Shortest Path) and demonstrate 1.004 EF/s on 9,200 Frontier nodes (73,600 GCDs). We develop hyperbolic performance models (HYPERMOD), which guide optimizations and parametric tuning. The proposed COAST algorithm achieved the memory constant parallel efficiency of 99 percent in the single-precision tropical semiring. Looking forward, COAST will enable the integration of scholarly corpora like PubMed into the SPOKE biomedical knowledge graph.
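
For readers unfamiliar with the "tropical semiring" mentioned in the abstract: it replaces ordinary multiplication with addition and ordinary addition with taking the minimum, which turns shortest-path search into something that looks like matrix multiplication. The Python sketch below illustrates that idea on a tiny, made-up four-concept graph. It is not the COAST algorithm, which is a communication-optimized distributed GPU code; the function names and the graph are hypothetical, and edge weights are assumed to be non-negative.

```python
import math

INF = math.inf  # marks "no direct edge"


def min_plus(a, b):
    """Tropical (min-plus) matrix product: c[i][j] = min over k of a[i][k] + b[k][j]."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apsp(weights):
    """All-pairs shortest paths by repeated min-plus squaring.

    weights[i][j] is the edge weight from i to j (INF if absent, 0 on the
    diagonal). After each squaring, dist covers paths with twice as many hops.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    hops = 1
    while hops < n - 1:
        dist = min_plus(dist, dist)
        hops *= 2
    return dist


# Hypothetical toy graph: four concepts, undirected weighted links.
graph = [
    [0,   3,   INF, 7],
    [3,   0,   1,   INF],
    [INF, 1,   0,   2],
    [7,   INF, 2,   0],
]

shortest = apsp(graph)
print(shortest[0][3])  # 6: the route 0 -> 1 -> 2 -> 3 beats the direct edge of weight 7
```

Because the inner loop has the same shape as a dense matrix multiply, this formulation maps naturally onto hardware built for matrix math, which is one reason an APSP workload can be measured in flops at all.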

Per the SC22 schedule, this team includes researchers from AMD, the Georgia Institute of Technology, ORNL and the University of California, San Francisco.


“The ability to establish paths between any pair of biomedical concepts with the richness of PubMed in a reasonable time has the potential to revolutionize biomedical research and apply national research funds more effectively,” said Ramakrishnan Kannan, group leader for discrete algorithms at ORNL, in an email to HPCwire. “The comparison of knowledge encoded within SPOKE, which is largely human-curated, against concept relationships that might be mined automatically from a scholarly database like PubMed will result in faster and automated integration of biomedical information at scale.”

According to the team, this project is “the first exascale graph AI demonstration” to run at over one exaflops. “This first demonstration of exascale computation speed will transform the way we currently conduct search in complex heterogeneous knowledge graphs like SPOKE,” the research team told HPCwire. “Specifically, it will enable a new class of algorithms to be implemented in graphs of unprecedented size and complexity. This will greatly improve the quality of biomedical research inquiry, and accelerate the time to patient diagnosis and care like never before.”

Four top-ten supercomputers enable plasma simulations

The second project to use Frontier: Pushing the Frontier in the Design of Laser-Based Electron Accelerators with Groundbreaking Mesh-Refined Particle-In-Cell Simulations on Exascale-Class Supercomputers. Though the title of the paper — which revolved around kinetic plasma simulations — winks at its use of Frontier, the team actually used four supercomputers: Frontier, Fugaku (RIKEN), Summit and Perlmutter (NERSC), meaning that this one paper used four of the top seven supercomputers on the most recent Top500 list. In an email to HPCwire, Jean-Luc Vay — a senior scientist at Lawrence Berkeley National Lab — outlined the science runs of the research, which were conducted on Frontier (up to 8,192 nodes), Fugaku (up to ~93,000 nodes) and Summit (up to 4,096 nodes).

The Perlmutter supercomputer.

Abstract: We present a first-of-kind mesh-refined (MR) massively parallel Particle-In-Cell (PIC) code for kinetic plasma simulations optimized on the Frontier, Fugaku, Summit, and Perlmutter supercomputers. Major innovations, implemented in the WarpX PIC code, include: (i) a three level parallelization strategy that demonstrated performance portability and scaling on millions of A64FX cores and tens of thousands of AMD and Nvidia GPUs (ii) a groundbreaking mesh refinement capability that provides between 1.5x to 4x savings in computing requirements on the science case reported in this paper, (iii) an efficient load balancing strategy between multiple MR levels. The MR PIC code enabled 3D simulations of laser-matter interactions on Frontier, Fugaku, and Summit, which have so far been out of the reach of standard codes. These simulations helped remove a major limitation of compact laser-based electron accelerators, which are promising candidates for next generation high-energy physics experiments and ultra-high dose rate FLASH radiotherapy.

Per the SC22 schedule, this team includes researchers from Arm, Atos, CEA-Université Paris-Saclay, ENSTA Paris, GENCI, Lawrence Berkeley National Lab and RIKEN.


“Plasma accelerator technologies have the potential to provide particle accelerators that are much more compact than existing ones, opening the door to exciting novel applications in science, industry, security and health,” Vay explained. “Exploiting the most powerful supercomputers in the world to boost the research to make these complex machines a reality is so stimulating to all of us.”

“It is thrilling for the entire team to be selected as finalist of the Gordon Bell Prize, even for the one of us (Axel Huebl), for whom it is ‘déjà vu’ as he was already a finalist in 2013 with another (PIConGPU) team,” Vay added. “It is the vindication of years of hard work from the U.S. DOE Exascale Computing Project participants and longstanding collaborators from CEA Saclay in France, coupled to the more recent hard work with colleagues from various labs and private companies in France (Genci, Arm, Atos) and RIKEN in Japan.”

Geostatistics get a boost from Shaheen-2 and Fugaku

The exascale-enabled research only constitutes half the list. Another finalist paper — Reshaping Geostatistical Modeling and Prediction for Extreme-Scale Environmental Applications — used Shaheen-2 as well as Fugaku.

The Shaheen-2 supercomputer.

Abstract: We extend the capability of space-time geostatistical modeling using algebraic approximations, illustrating application-expected accuracy worthy of double precision from majority low-precision computations and low-rank matrix approximations. We exploit the mathematical structure of the dense covariance matrix whose inverse action and determinant are repeatedly required in Gaussian log-likelihood optimization. Geostatistics augments first-principles modeling approaches for the prediction of environmental phenomena given the availability of measurements at a large number of locations; however, traditional Cholesky-based approaches grow cubically in complexity, gating practical extension to continental and global datasets now available. We combine the linear algebraic contributions of mixed-precision and low-rank computations within a tile-based Cholesky solver with on-demand casting of precisions and dynamic runtime support from PaRSEC to orchestrate tasks and data movement. Our adaptive approach scales on various systems and leverages the Fujitsu A64FX nodes of Fugaku to achieve up to 12X performance speedup against the highly optimized dense Cholesky implementation.
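
The "inverse action and determinant" the abstract refers to are the two expensive pieces of a Gaussian log-likelihood, and a Cholesky factorization supplies both at once. The following Python sketch shows that textbook dense approach on a handful of points. It is plain double precision with no tiling, low-rank compression or mixed precision, so it illustrates the baseline the team is improving on rather than their method. The exponential covariance function, its parameters and the sample data are assumptions chosen for illustration.

```python
import math


def cholesky(a):
    """Dense Cholesky factor L (lower triangular) with a = L * L^T. Cost grows as n^3."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        l[j][j] = math.sqrt(a[j][j] - sum(l[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


def forward_solve(l, b):
    """Solve L * y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i])
    return y


def gaussian_loglik(cov, z):
    """Log-likelihood of zero-mean Gaussian data z with covariance matrix cov."""
    n = len(z)
    l = cholesky(cov)
    y = forward_solve(l, z)
    quad = sum(v * v for v in y)                              # z^T cov^{-1} z
    logdet = 2.0 * sum(math.log(l[i][i]) for i in range(n))   # log det(cov)
    return -0.5 * (quad + logdet + n * math.log(2.0 * math.pi))


def exp_cov(points, variance, length_scale):
    """Assumed exponential covariance between 1D measurement locations."""
    return [[variance * math.exp(-abs(p - q) / length_scale) for q in points]
            for p in points]


# Hypothetical measurements at four locations along a line.
locations = [0.0, 1.0, 2.5, 4.0]
values = [0.3, 0.1, -0.4, 0.2]
loglik = gaussian_loglik(exp_cov(locations, variance=1.0, length_scale=2.0), values)
print(f"log-likelihood: {loglik:.4f}")
```

An optimizer fitting the covariance parameters calls a function like `gaussian_loglik` many times, each time refactoring the full matrix, which is why the cubic cost of the factorization dominates at continental or global scale.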

Per the SC22 schedule, this team includes researchers from KAUST, ORNL and the University of Tennessee. Perhaps notably, the team also includes Jack Dongarra, one of SC22’s keynote speakers.


“For our exploratory science runs, and to demonstrate the acceptable accuracy of our algorithmic variations on Cholesky factorization and further manipulation of massive covariance matrices, we used Shaheen-2 at KAUST,” explained David Keyes, director of the Extreme Computing Research Center at KAUST, in an email to HPCwire. “Shaheen-2 has only 6,192 nodes, so we applied to use Fugaku at RIKEN to scale further and were generously considered by RIKEN. Fugaku has 158,976 nodes, about 25 times more than Shaheen-2, and each node has 48 cores, 1.5 times more than a Shaheen-2 node. However, each Fugaku node is equipped with only 32GB of memory, one-quarter as much as Shaheen-2’s 128GB per node, thus only one-sixth as much per core, which required us to make software adaptations.”

“Entering the Gordon Bell competition was exciting for all of the team members, especially the students and postdocs,” Keyes said. “It provided an opportunity to run on the world’s second ranked computer. The required algorithmic adaptations to architecture led to improvements in our tools that will be useful at all scales. More importantly, the nomination created excitement with the statistics community since 2022 appears to be the first time after 35 years of the prize that any significant spatial statistics computation, environmental or otherwise, has thus advanced.”
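
The node, core and memory comparisons Keyes cites are easy to verify with a few lines of Python. All figures below come from his quote except Shaheen-2's 32 cores per node, which is inferred here from his statement that Fugaku's 48 cores are 1.5 times as many.

```python
# Figures quoted by Keyes; Shaheen-2's cores per node is inferred (48 / 1.5).
fugaku = {"nodes": 158_976, "cores_per_node": 48, "mem_gb_per_node": 32}
shaheen2 = {"nodes": 6_192, "cores_per_node": 32, "mem_gb_per_node": 128}


def mem_per_core(system):
    """Memory available to each core, in GB."""
    return system["mem_gb_per_node"] / system["cores_per_node"]


node_ratio = fugaku["nodes"] / shaheen2["nodes"]
core_ratio = fugaku["cores_per_node"] / shaheen2["cores_per_node"]
mem_node_ratio = fugaku["mem_gb_per_node"] / shaheen2["mem_gb_per_node"]
mem_core_ratio = mem_per_core(fugaku) / mem_per_core(shaheen2)

print(f"nodes:           {node_ratio:.1f}x")
print(f"cores per node:  {core_ratio:.2f}x")
print(f"memory per node: {mem_node_ratio:.2f}x")
print(f"memory per core: {mem_core_ratio:.3f}x")
```

The ratios come out to roughly 25.7 times the nodes, 1.5 times the cores per node, one-quarter the memory per node and one-sixth the memory per core (about 0.67GB versus 4GB), matching the quote and explaining why the team had to rework its software for Fugaku.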

Simulating earthquakes with Fugaku

The last of Fugaku’s three appearances on the finalist list comes courtesy of Extreme Scale Earthquake Simulation with Uncertainty Quantification, which used the second-ranked system to advance scientific understanding of earthquakes and fields with similar dynamics.

The Fugaku supercomputer.

Abstract: We develop a stochastic finite element method with ultra-large degrees of freedom that discretize probabilistic and physical spaces using unstructured second-order tetrahedral elements with double precision using a mixed-precision implicit iterative solver that scales to the full Fugaku system and enables fast Uncertainty Quantification (UQ). The developed solver designed to attain high performance on a variety of CPU/GPU-based supercomputers enabled solving 37 trillion degrees-of-freedom problem with 19.8 percent peak FP64 performance on full Fugaku (89.8 PFLOPS) with 87.7 percent weak scaling efficiency, corresponding to 224-fold speedup over the state of the art solver running on full Summit. This method, which has shown its effectiveness via solving huge (32-trillion degrees-of-freedom) practical problems, is expected to be a breakthrough in damage mitigation, and is expected to facilitate the scientific understanding of earthquake phenomena and have a ripple effect on other fields that similarly require UQ.

Per the SC22 schedule, this team includes researchers from Fujitsu, the Japan Agency for Marine-Earth Science and Technology, RIKEN and the University of Tokyo.


“We are very happy to be selected as finalists,” wrote Tsuyoshi Ichimura, a professor with the Earthquake Research Institute at the University of Tokyo, in an email to HPCwire. “We believe that this has a great impact in showing that capability computing can contribute to an unprecedented Uncertainty Quantification (UQ).”

Leveraging Summit to search proteins

Last, but certainly not least: Extreme-Scale Many-against-Many Protein Similarity Search, which used the Summit supercomputer to perform protein similarity calculations across hundreds of millions of proteins in just a few hours.

The Summit supercomputer.

Abstract: Similarity search is one of the most fundamental computations that are regularly performed on ever-increasing protein datasets. Scalability is of paramount importance for uncovering novel phenomena that occur at very large scales. We unleash the power of over 20,000 GPUs on the Summit system to perform all-vs-all protein similarity search on one of the largest publicly available datasets with 405 million proteins, in less than 3.5 hours, cutting the time-to-solution for many use cases from weeks. The variability of protein sequence lengths, as well as the sparsity of the space of pairwise comparisons, make this a challenging problem in distributed memory. Due to the need to construct and maintain a data structure holding indices to all other sequences, this application has a huge memory footprint that makes it hard to scale the problem sizes. We overcome this memory limitation by innovative matrix-based blocking techniques, without introducing additional load imbalance.

Per the SC22 schedule, this team includes researchers from Indiana University, the Institute for Fundamental Biomedical Research, the Department of Energy’s Joint Genome Institute, Lawrence Berkeley National Lab, Microsoft, NERSC and the University of California, Berkeley.


In an email to HPCwire, the team stressed the importance of this research area to critical fields. “Many-against-many sequence search is the backbone of biological sequence analysis used in drug discovery, healthcare, bioenergy, and environmental studies,” they wrote. “Our work is perhaps the first [Gordon Bell] finalist for a biological sequence analysis problem, which is surprising because sequence analysis is a perfect supercomputing application due to its data and compute intensive nature.”

“Our pipeline, PASTIS, performs a novel application of sparse matrices to narrow down the search space and to avoid quadratic number of sequence comparisons. Sparse matrix computations are much harder to map efficiently to modern supercomputing hardware, especially to GPU-equipped supercomputers such as the Summit system we have used in this work. Our approach cuts back the turnaround time from days to minutes in discovering similar sequences in huge protein datasets to complete the subsequent analytical steps in bioinformatics and allow for exploratory analysis of data sets under different parameter settings.”
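
To make the sparse-matrix idea concrete, here is a minimal Python sketch of one common way to avoid comparing every sequence against every other: index sequences by the short substrings (k-mers) they contain, then only consider pairs that share at least one k-mer. The index plays the role of a sparse sequences-by-k-mers matrix A, and the shared-k-mer counts are the nonzeros of A multiplied by its transpose. This is an illustration of the general technique, not PASTIS itself; the function names, the toy sequences and the choice of k are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_pairs(sequences, k, min_shared=1):
    """Pairs of sequence indices sharing at least min_shared k-mers.

    Only pairs that co-occur under some k-mer are ever touched, so unrelated
    sequences are never compared.
    """
    # Sparse matrix A, stored column by column: k-mer -> sequences containing it.
    index = defaultdict(list)
    for sid, seq in enumerate(sequences):
        for kmer in kmers(seq, k):
            index[kmer].append(sid)

    # Nonzeros of A * A^T: shared[(i, j)] = number of k-mers common to i and j.
    shared = defaultdict(int)
    for sids in index.values():
        for i, j in combinations(sids, 2):
            shared[(i, j)] += 1

    return {pair: count for pair, count in shared.items() if count >= min_shared}


# Hypothetical toy "proteins": the first two overlap heavily, the third is unrelated.
proteins = ["MKTAYIAK", "KTAYIAKQ", "GGGSGGGS"]
pairs = candidate_pairs(proteins, k=3)
print(pairs)  # {(0, 1): 5}
```

Only the surviving candidate pairs would then go on to an expensive alignment step. In this toy case that is one pair out of three possible; across hundreds of millions of proteins, pruning of this kind is what keeps the number of comparisons far below quadratic.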

What’s next

That’s all of them. For those keeping score at home: three finalist teams used Fugaku; three used Summit; two used Frontier; and OceanLight, Perlmutter and Shaheen-2 were each used by one finalist team. We’re still watching for the reveal of the finalists for the Gordon Bell Special Prize for High Performance Computing-Based Covid-19 Research, which will be awarded for the third time at SC22. At SC22 itself — set to be held in Dallas from November 13-18 — the finalists for both Gordon Bell Prizes will present their research ahead of the award ceremony.
