In this regular feature, HPCwire highlights newly published research in the high-performance computing community and related domains. From parallel programming to exascale to quantum computing, the details are here.
The authors of this preprint paper propose that “leadership-class computing resources can be used to perform genome-scale protein structure prediction using state-of-the-art deep learning models, providing a wealth of new data for systems biology applications.” The authors go on to describe their efforts “to efficiently deploy the AlphaFold2 program, for full-proteome structure prediction, at scale on the Oak Ridge Leadership Computing Facility’s resources, including the Summit supercomputer.” The inferencing workload used nearly 4,000 total Summit node hours. Deployed in 2018, Summit spans 4,608 GPU-accelerated nodes and currently ranks number two on the Top500 list with 148.6 Linpack petaflops.
Authors: Mu Gao, Mark Coletti, Russell B. Davidson, Ryan Prout, Subil Abraham, Benjamin Hernandez and Ada Sedova
A team of Chinese researchers observes that the “computational complexity of obtaining the wave functions for accurately describing the quantum states increases exponentially with respect to particle number.” Addressing this challenge, they go on to present a “novel convolutional neural network for simulating the two-dimensional highly frustrated spin-1/2 J1-J2 Heisenberg model, [such that] the simulation is performed at an extreme scale system with low cost and high scalability.” With this research, the authors demonstrate the effectiveness of CNN-based representations of quantum states. Their computation harnessed 31 million cores of the new Sunway supercomputer, reported to be an exascale-class system with ~42 million SW26010Pro cores. The authors state that they believe the application should be able to scale across the entire system.
Authors: Mingfan Li, Junshi Chen, Qian Xiao, Fei Wang, Qingcai Jiang, Xuncheng Zhao, Rongfen Lin, Hong An, Xiao Liang and Lixin He
In this study, the authors from the department of electronics and microelectronics at the University of Mons in Belgium propose “analytical modeling for architecture and application behavior that can be used to estimate energy-optimal software configurations and provide knowledgeable hints to improve DVFS and DPM techniques for single-node high-performance computing applications.” Their results show that up to 70 percent of energy could be saved (in best-case scenarios) compared to the default Linux choice, with an average of 14 percent energy saved.
Authors: Vitor Ramos Gomes da Silva, Carlos Valderrama, Pierre Manneback and Samuel Xavier-de-Souza
In this paper, “Verified Tensor-Program Optimization via High-Level Scheduling Rewrites,” the authors present a new programming language for high-performance computing that addresses both speed and correctness. They’ve developed “a lightweight Coq framework for optimizing tensor kernels written in a pure, functional array language.” In their paper, they “demonstrate that not only is this system capable of deriving the optimizations of existing state-of-the-art languages like Halide and generating comparably performant code, it is also able to schedule a family of useful program transformations beyond what is reachable in Halide.”
Authors: Amanda Liu, Gilbert Louis Bernstein, Adam Chlipala and Jonathan Ragan-Kelley
Authored by two researchers from the Russian Academy of Sciences, this paper presents a “technology for scale-resolving simulations of turbulent flows in the problems of aerodynamics and aeroacoustics,” targeting a range of HPC platforms – from small clusters to exascale computers. The paper summarizes the advantages of a hybrid modeling approach that combines Reynolds-averaged Navier-Stokes (RANS) and large eddy simulation (LES) methods, which the authors describe as “widely recognized as the most efficient ones in terms of cost/accuracy ratio in many computational aerodynamics and aeroacoustics applications.” Other key technologies include “a numerical scheme for discretization in space, a parallel algorithm, and a portable software implementation for modern hybrid systems with extra massive parallelism.” With efficient parallel scaling on exaflops-class supercomputers, the authors suggest, it will be possible to tackle previously intractable mesh sizes, for example when modeling an entire aircraft.
Authors: Andrey V. Gorobets and Alexey P. Duben
A team of researchers from several Department of Energy laboratories investigates the performance of the Energy Exascale Earth System Model-MMF (E3SM-MMF) code on the Oak Ridge Leadership Computing Facility’s Summit supercomputer. “Hundreds of kernels in the roughly 10,000 lines of code in the E3SM-MMF CRM were ported to GPUs with OpenACC directives,” note the authors. “A high-resolution benchmark using 4,600 nodes on Summit demonstrates the computational capability of the GPU-enabled E3SM-MMF code in a full physics climate simulation,” they write. The research marks an important advance in incorporating key cloud effects into the climate model.
Authors: Matthew R. Norman, David A. Bader, Christopher Eldred, Walter M. Hannah, Benjamin R. Hillman, Christopher R. Jones, Jungmin M. Lee, L.R. Leung, Isaac Lyngaas, Kyle G. Pressel, Sarat Sreepathi, Mark A. Taylor and Xingqiu Yuan
The authors of this paper demonstrate how to conduct accelerated, artificial intelligence-driven gravitational wave detection at scale. Using the ThetaGPU supercomputer at the Argonne Leadership Computing Facility, they note that their “Nvidia TensorRT-optimized AI ensemble processed an entire month of advanced LIGO data (including Hanford and Livingston data streams) within 50 seconds.” This, they say, was a threefold speedup, achieved while retaining the same sensitivity as traditional AI models.
Authors: Pranshu Chaturvedi, Asad Khan, Minyang Tian, E. A. Huerta and Huihuo Zheng
Do you know about research that should be included in next month’s list? If so, send us an email at [email protected]. We look forward to hearing from you.