Increasingly, SARS-CoV-2 is engaged in a vicious game of whack-a-mole with researchers, with new variants like Delta and Omicron wrong-footing medical professionals by reducing, or even eliminating, the efficacy of vaccines and therapeutics. Now, researchers at Oak Ridge National Laboratory (ORNL) are using the most powerful supercomputer in the country to identify “flexible” regions of the virus’ spike protein, which could prove to be viable targets for new therapeutics.
The researchers began with nanoscale molecular dynamics simulations of the spike proteins of four coronaviruses: SARS-CoV-2, SARS-CoV-1, MERS-CoV and HCoV-HKU1. With those simulations complete, the researchers applied a convolutional variational autoencoder (a kind of deep learning architecture) to compare SARS-CoV-2 with the other three coronaviruses.
“This deep learning technique transforms massive amounts of data into manageable amounts of data while ensuring everything remains intact and accurate,” said Debsindhu Bhowmik, a computational scientist at ORNL, in an interview with Oak Ridge’s Elizabeth Rosenthal. “We compressed the whole protein into a single dot that we can plot on a graph to see how the structure evolves over time.”
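For readers curious what compressing a simulation frame into "a single dot" can look like in practice, here is a minimal sketch of the encoder half of a convolutional variational autoencoder. It is not the ORNL team's model: the article does not describe their architecture, input representation or training, so the layer sizes, the random untrained weights, the made-up trajectory and every name below (`conv1d`, `linear`, `init_params`, `encode`) are illustrative assumptions.

```python
import math
import random


def conv1d(signal, kernel):
    """Valid 1-D cross-correlation, the 'convolution' used in deep learning."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def linear(values, weights, bias):
    """Dense layer: one output per row of weights."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, bias)
    ]


def init_params(frame_len, kernel_size, latent_dim, rng):
    """Random, untrained weights (a real model would learn these)."""
    hidden_len = frame_len - kernel_size + 1

    def matrix():
        return [
            [rng.uniform(-0.1, 0.1) for _ in range(hidden_len)]
            for _ in range(latent_dim)
        ]

    return {
        "kernel": [rng.uniform(-1.0, 1.0) for _ in range(kernel_size)],
        "w_mu": matrix(),
        "b_mu": [0.0] * latent_dim,
        "w_logvar": matrix(),
        "b_logvar": [0.0] * latent_dim,
    }


def encode(frame, params, rng):
    """Compress one frame (a flat list of floats) to a latent point."""
    hidden = [max(0.0, h) for h in conv1d(frame, params["kernel"])]  # ReLU
    mu = linear(hidden, params["w_mu"], params["b_mu"])
    log_var = linear(hidden, params["w_logvar"], params["b_logvar"])
    # Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, 1).
    z = [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]
    return mu, log_var, z


rng = random.Random(0)
params = init_params(frame_len=16, kernel_size=3, latent_dim=2, rng=rng)
# Made-up "trajectory": 5 frames of 16 random features each (an assumption).
trajectory = [[rng.random() for _ in range(16)] for _ in range(5)]
# Use the latent mean as each frame's position on a 2-D plot.
latent_points = [encode(frame, params, rng)[0] for frame in trajectory]
```

Each entry of `latent_points` is a pair of numbers, one dot per frame, so plotting them in order would trace how a structure moves through the compressed space over time. A full autoencoder would also include a decoder and would be trained to reconstruct its inputs, which is what makes the positions of the dots meaningful; both are omitted here for brevity.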

They were looking for, and found, evidence that SARS-CoV-2 shares flexible regions with the other coronaviruses; they suspected that disrupting such regions might destabilize the virus. “All of these coronaviruses have protomers that assemble to form a trimer, which means they have inherently flexible structures that can potentially be manipulated during assembly,” said Serena Chen, also a computational scientist at ORNL.
With that similarity established, the researchers homed in, identifying components called “beta sheets” that were crucial to the structural integrity of SARS-CoV-2. “We think these two regions are involved in helping the spike protein form the trimer,” Chen said. “Applying treatments to these regions could potentially prevent the virus from completing this process and infecting host cells.” The researchers now believe that interrupting the formation of the beta sheets may debilitate the virus.
These beta sheets are found in the S2 domain of the virus. Previous work, by contrast, had predominantly focused on the receptor binding domain (RBD), which is more directly connected to the virus’ mechanism of infecting human cells. “A better understanding of the spike protein could complement current Covid-19 vaccines by informing new treatments and providing insights into potential drug design,” Bhowmik said. “Studying the whole spike protein in detail allowed us to locate promising targets for medical approaches beyond those that have already been identified in RBD.”
The computational firepower for this work came in the form of Summit, which remains the United States’ top-ranked supercomputer at 148.6 Linpack petaflops. “At the beginning of this project, our back-of-the-envelope calculations revealed that we would need to run many simulations and generate an enormous amount of data to draw scientific conclusions,” said John Gounley, another of the computational scientists on the team. “Summit provided the immense compute power we needed to handle that workload.”
To learn more, read the reporting from ORNL’s Elizabeth Rosenthal here.