At ISC, the Fight Against COVID-19 Took the Stage – and Yes, Fugaku Was There

By Oliver Peckham

June 23, 2020

With over nine million infected and nearly half a million dead, the COVID-19 pandemic has seized the world’s attention for several months. It has also dominated the supercomputing sector, with COVID-related research receiving major allocations on nearly every research supercomputer in the world (and many industrial supercomputers). It’s not surprising, then, that at ISC 2020, the virtual conference opened, revealed the new Top500 list – and then got straight to COVID-19.

In the focus session, three speakers addressed how HPC is fighting back against the coronavirus: Satoshi Matsuoka of RIKEN, which just nabbed the top spot in the Top500 with its Fugaku system; Peter Coveney of the Centre of Excellence in Computational Biomedicine, which is working to make HPC and machine learning actionable in a rapid drug development workflow; and Rick Stevens of Argonne National Laboratory, where researchers are working with the COVID-19 HPC Consortium to comb through billions of molecules.

Satoshi Matsuoka highlights Fugaku’s role in COVID-19 research

Satoshi Matsuoka

Fugaku, the most powerful supercomputer in the world, is in business early thanks to COVID-19. The system – situated at Riken in Japan – was scheduled to launch in 2021. When the pandemic struck, plans changed. “It was decided by [Japan’s] Ministry of Education, Culture, Sports, Science and Technology that we will utilize [not only supercomputers that are already available] but also [deploy Fugaku early], almost a year ahead of schedule, to combat COVID-19,” explained Matsuoka, director of the Riken Center for Computational Science (R-CCS).

The Fugaku supercomputer

Fugaku’s showstopping 415 Linpack petaflops are close to triple the performance of the runner-up, Oak Ridge’s newly dethroned Summit system. At 158,976 nodes, Fugaku is the largest system ever created in terms of nodes, footprint and power consumption. The software, Matsuoka said, is “quite standard,” allowing for broad usability without much Fugaku-specific tweaking.

“They’re largely divided into two areas,” Matsuoka said of Fugaku’s COVID-19 workloads. “One is medical-pharma – so trying to see how the virus behaves, what are the effective drugs, especially how we can repurpose existing drugs and so forth and also how a vaccine is made. So these are molecular-level investigations of the virus and its countermeasures. The other is more macroscopic – so we’re trying to see how these viruses are transmitted and what are the mitigation measures and how it will impact society.”

Matsuoka highlighted several of the COVID-19 projects taking advantage of Fugaku’s early arrival. One Riken researcher, for instance, is studying conformational changes of the spike protein using a highly scalable molecular dynamics code. Another researcher is using fragment molecular orbital calculations to investigate the energy levels of the spike protein, scaling across hundreds of thousands of Fugaku’s CPUs. “On [Fugaku’s predecessor] the K computer,” Matsuoka said, “this calculation would have taken days, weeks, multiple weeks – on Fugaku, … they have been able to do this in just three hours.”

Other researchers are using Fugaku to run socially oriented simulations, such as modeling droplets in indoor spaces like trains or the effects of face masks and contact tracing applications on the virus’s spread, Matsuoka said – and, of course, there are more to come. “So if you have any good ideas,” he said, “go to the website and you can apply.”

A Riken-led simulation of virus droplets in train cabins. Image courtesy of Satoshi Matsuoka.

Peter Coveney describes a new, HPC- and AI-driven model for drug development

Peter Coveney

Coveney, the second speaker, runs the Centre of Excellence in Computational Biomedicine (CompBioMed), an initiative funded by the European Union that is currently redirecting its research efforts and computational research toward the study of COVID-19 and the development of drugs to treat it. Coveney (who also teaches at University College London) stressed the need to “invert the model [of drug development] as it currently exists” using advanced IT.

“The opportunities there are enormous,” Coveney said. “What we’re really trying to do is transform the approach to biomedicine, to be able to move it from a highly empirical approach … to putting a priority on the predictions that come out of computers.”

But to do that, he said, the computational results had to be actionably accurate – and perhaps even more difficult, they had to be quickly produced. Molecular screening, however – the crux of computational drug design, whereby compounds are fitted to targets on the virus’ proteins – is labor-intensive, time-consuming and expensive ($1 to $10 a compound, with billions of compounds to screen for COVID-19).
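To put that price range in perspective, the short sketch below multiplies it out. The library size of one billion compounds is an assumed round number chosen for illustration (the talk said only “billions”), and the function name is hypothetical.

```python
def screening_cost(n_compounds, cost_low=1.0, cost_high=10.0):
    """Return the (low, high) total cost in dollars of screening a library,
    given a per-compound cost range. Defaults follow the $1 to $10 figure."""
    return n_compounds * cost_low, n_compounds * cost_high


if __name__ == "__main__":
    # Assumption: a library of one billion compounds.
    low, high = screening_cost(1_000_000_000)
    print(f"${low / 1e9:.0f} billion to ${high / 1e9:.0f} billion")
```

Even at the low end of the range, a single pass over a library of that size would run to about a billion dollars, which is why cheaper methods are used to narrow the list first.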

Coveney outlined how CompBioMed worked with over 40 partners around the world to streamline the computational drug design pipeline. CompBioMed gained access to a wide range of supercomputers, from SuperMUC-NG (the most powerful supercomputer in the EU) to Piz Daint, Archer, Summit, Frontera, Theta and more. The researchers used a piece of middleware called RADICAL-Cybertools to run workflows across a large number of nodes on multiple machines.

With computing power in hand, CompBioMed focused on how to ensure “validation, verification and uncertainty quantification” (or “VVUQ”) in the pipeline. “This is designed in general to raise confidence in HPC simulation,” Coveney said.

To effectively leverage the computing power and ensure “VVUQ,” CompBioMed combined machine learning with molecular dynamics. Machine learning was used first to whittle down the near-infinite list of candidate molecules. “We have to do searches in a hurry,” Coveney said. “We want to use computationally very fast methods that are also cheap … to search huge libraries of molecules, to explore chemical space, to predict new molecules and so on.” 

The ensemble molecular dynamics process. Image courtesy of Peter Coveney.

Then, with the list whittled down, CompBioMed used molecular dynamics simulations – 20 to 30 of them at a time. As Coveney explained, a single molecular dynamics simulation could have a large number of errors. “But if you run many of them concurrently … we can run those on very large supercomputers all at the same time,” Coveney said. “Then we can make reliable predictions that get fed back to another stage of the machine learning.”
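The statistical idea behind running an ensemble can be sketched in a few lines. This is a toy illustration, not CompBioMed’s code: run_replica is a hypothetical stand-in that returns a random number rather than running a real simulation, and the “true” binding energy and noise level are invented values.

```python
import random
import statistics


def run_replica(seed, true_value=-8.0, noise=2.0):
    """Hypothetical stand-in for one molecular dynamics run.

    Returns a noisy binding free energy estimate (kcal/mol). The true
    value and noise level are made-up numbers for illustration only.
    """
    rng = random.Random(seed)
    return rng.gauss(true_value, noise)


def ensemble_estimate(n_replicas, base_seed=0):
    """Average independent replicas; return (mean, standard error of the mean)."""
    samples = [run_replica(base_seed + i) for i in range(n_replicas)]
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / n_replicas ** 0.5
    return mean, sem


if __name__ == "__main__":
    # 25 replicas sits inside the "20 to 30 at a time" range quoted above.
    mean, sem = ensemble_estimate(25)
    print(f"25 replicas: {mean:.2f} +/- {sem:.2f} kcal/mol")
```

Any single replica can land far from the underlying value, but the uncertainty of the ensemble mean shrinks roughly with the square root of the number of replicas. Because the replicas are independent, they can all run at the same time on a large machine.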

The best candidate compounds from this process are then submitted to medical research labs for further testing. “We are already discovering many tens to hundreds of potential compounds that can be investigated by our experimental colleagues,” Coveney said. “And indeed, that’s happening already.”

“We’re trying to change the way medicine is actually understood and applied,” Coveney concluded. “We want to make the subject more amenable to scientific investigation, that it should revolve around theory, modeling and simulation in addition to experimental research.”

Rick Stevens dives into the COVID-19 HPC Consortium and machine learning-enabled research

Rick Stevens

Finally, Stevens took the virtual stage. Stevens – associate laboratory director at Argonne National Laboratory – has been working closely with the COVID-19 HPC Consortium, a public-private effort to pool supercomputing resources for COVID-19 research. Currently, the effort has over 40 members, comprising some 483 petaflops of resources, 50,000 GPUs, 136,000 nodes and five million CPU cores. 

As Stevens explained, the projects being tackled by the consortium fall into three broad categories: first, basic science, including things like analyzing the virus’ structure, protein functions and virus evolution; second, therapeutics (“the largest group”), aiming to discover drug targets on the virus, design drugs and discover repurposable drugs; and finally, patient care – “things more related to optimizing the healthcare system or epidemiology.”

Stevens outlined some of the key work, especially where it intersected with Argonne. “If you’re gonna work on this problem, you need to understand the enemy,” Stevens said, describing how Argonne has used its Advanced Photon Source (APS) to identify new structures of COVID-19, which in turn produce new drug targets for simulations to examine. 

Like Coveney, Stevens highlighted the intersections of AI and supercomputing as viable pathways for processing massive amounts of compounds in a relatively short time frame. For instance, he said, researchers were using AI to reconcile models of proteins from various sources to produce even more accurate models. In the spring, Argonne also began assembling a large database – around 60 TB – containing descriptors, images and more for over four billion compounds, with the aim of producing massive datasets for machine learning applications.

“One of the strategies that we have is to use a combination of high-throughput virtual docking … to generate scores – generate them on thousands or millions of data points,” Stevens said, “but then use that data to train machine learning models and do inference on a much larger scale.” As in Coveney’s research, the most promising hits are then sent for wet lab screening. 
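The strategy Stevens describes (dock a sample, train on the scores, then run inference across the full library) can be sketched roughly as follows. This is a minimal illustration under heavy assumptions, not Argonne’s pipeline: dock is a hypothetical stand-in for an expensive docking run, each compound is reduced to one made-up numeric descriptor, and a least-squares line stands in for the machine learning model.

```python
import random


def dock(descriptor, rng):
    """Hypothetical stand-in for an expensive docking run (lower is better).

    The linear relationship and the noise level are invented for this sketch.
    """
    return -2.0 * descriptor - 1.0 + rng.gauss(0.0, 0.1)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def screen(library, n_docked, n_hits, seed=0):
    """Dock a random sample, fit a surrogate, then rank the whole library."""
    rng = random.Random(seed)
    docked = rng.sample(library, n_docked)      # expensive step, small subset
    scores = [dock(x, rng) for x in docked]
    slope, intercept = fit_line(docked, scores)
    # Cheap inference over every compound; the best (lowest) predictions first.
    ranked = sorted(library, key=lambda x: slope * x + intercept)
    return ranked[:n_hits]


if __name__ == "__main__":
    library = [i / 1000 for i in range(1000)]   # 1,000 toy "compounds"
    print(screen(library, n_docked=50, n_hits=3))
```

Only 50 of the 1,000 toy compounds are docked here; the surrogate scores the rest. In practice the model would be far richer than a straight line, and the top-ranked hits would go on to wet lab screening.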

Argonne’s pipeline for COVID-19 drug discovery. Image courtesy of Rick Stevens.

Stevens also discussed the use of machine learning to understand the “trajectories” of molecular dynamics simulations and the use of reinforcement learning to essentially build drug molecules from the ground up, adding to them iteratively to improve the docking score.

“One of the overall challenges here, of course, is that there’s over 10⁶⁰ possible drugs,” Stevens said, “and you can only test at the end of the day, in humans, a small fraction of these.” But now, with AI and supercomputing converging to create a new model of rapid drug design, that might be enough.
