Accurate simulation of cancer-implicated proteins holds enormous promise for basic biomedical science and development of effective therapies, but the high computational cost required has long slowed progress. Recently a multi-institution research team developed a machine learning-based simulation for next-generation supercomputers capable of modeling protein interactions and mutations that play a role in many forms of cancer. Their work on simulating the RAS protein family will be published at SC19 and is a finalist for the Best Paper award.
RAS proteins are implicated in roughly one third of cancers, and researchers have long sought a more detailed understanding of how they interact with the cell’s lipid membranes and influence signaling pathways. One way to cut the number of simulations needed, and with it the computational cost, is to use machine learning (ML) to zoom in on areas of interest.
For its paper, the team simulated RAS together with eight of the most relevant lipids to investigate RAS dynamics and interactions. They simulated a 1-by-1 micrometer membrane patch with 300 different RAS proteins, analyzing the membranes to generate statistically relevant observations that can be tested experimentally at Frederick National Laboratory for Cancer Research.
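How the "areas of interest" are chosen is not spelled out in this article. As a rough illustration only, the sketch below assumes each candidate region of the coarse simulation has already been summarized as a small feature vector, and that "interesting" means "least similar to anything already simulated in detail." The function name, the feature vectors, and the greedy farthest-point rule are all hypothetical stand-ins, not the team's method.

```python
import math


def select_novel_regions(candidates, simulated, k):
    """Greedily pick up to k candidate regions that are farthest
    (in feature space) from everything already simulated or chosen.

    candidates: dict mapping a region id to its feature vector (hypothetical)
    simulated:  list of feature vectors for regions already run in detail
    """
    remaining = dict(candidates)
    reference = list(simulated)
    chosen = []

    for _ in range(min(k, len(remaining))):
        def novelty(region_id):
            # Distance to the nearest reference point; larger means more novel.
            if not reference:
                return math.inf
            return min(math.dist(remaining[region_id], r) for r in reference)

        best = max(remaining, key=novelty)
        chosen.append(best)
        # A chosen region becomes a reference so the next pick differs from it.
        reference.append(remaining.pop(best))

    return chosen


# Toy example with made-up two-dimensional features.
candidates = {
    "r0": (0.0, 0.0),
    "r1": (0.1, 0.0),
    "r2": (5.0, 5.0),
    "r3": (0.0, 4.0),
}
already_simulated = [(0.0, 0.1)]
print(select_novel_regions(candidates, already_simulated, 2))
```

In this toy case the two regions far from the already-simulated point are picked first, while the near-duplicates are skipped. That is the sense in which such a selector saves compute: detailed simulations go only where they are likely to show something new.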
“We decided that instead of doing what traditionally has been done with simulations — taking a model membrane with one or two lipids — that we’d try to make it realistic and model a biologically relevant membrane,” said LLNL computational biologist Helgi Ingólfsson, a technical lead on the project. “The goal is to characterize RAS aggregation, RAS-protein interactions and RAS-lipid interactions, observing what types of lipids dictate RAS behavior and orient on the membrane. We want to see if we can modulate RAS activity with different types of lipids or some kind of pharmaceutical, not to eliminate RAS activity but modulate it in different ways, like promoting the inactive states.”
The research stems from a pilot project in the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program, a collaboration between the Department of Energy’s (DOE) Office of Science, the National Nuclear Security Administration (NNSA) and National Cancer Institute (NCI) that is supported in part by the Cancer Moonshot. Researchers from Lawrence Livermore National Laboratory, Los Alamos National Laboratory, the National Cancer Institute, Frederick National Laboratory for Cancer Research (FNLCR), IBM, and other institutions participated.
An article describing the work is posted on the Lawrence Livermore National Laboratory website (lightly edited excerpt):
“The team began with a macro-model capable of simulating the impact of a lipid membrane on RAS proteins at long timescales and incorporated a machine learning algorithm to determine which lipid “patches” were interesting enough to model in more detail with a molecular-level micromodel. The result is a Massively parallel Multiscale Machine-Learned Modeling Infrastructure (MuMMI) that scales up efficiently on large, heterogeneous high performance computing machines like LLNL’s Sierra and ORNL’s Summit.
“For the microscale model, the team used a molecular dynamics code adopted for the coarse-grained Martini model. It was adapted for GPUs to run on Sierra, making it likely the only general molecular dynamics code to run completely on GPUs, the researchers said. The work stretched the limits of the early access Sierra system, as each “patch,” representing an area of about 30 by 30 nanometers, contained about 140,000 coarse-grain beads and thousands of individual lipids.
“While the system was still in its unclassified environment, the team ran nearly 120,000 simulations on Sierra, taking 5.6 million GPU hours of compute time and generating a massive 320 terabytes of data. The number of simulations was “staggering,” researchers said, adding that the largest number of Martini simulations done at one time was only in the thousands prior to this project.”
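For a sense of scale, the totals quoted above can be turned into rough per-simulation averages. This is only a back-of-the-envelope sketch: it treats "nearly 120,000" as exactly 120,000 and assumes decimal units (1 TB = 1,000 GB), neither of which the article states.

```python
# Figures quoted in the excerpt (the simulation count is rounded up from "nearly 120,000").
simulations = 120_000
gpu_hours = 5.6e6
data_tb = 320

# Simple averages; actual simulations likely varied in length and output size.
gpu_hours_per_sim = gpu_hours / simulations
gb_per_sim = data_tb * 1000 / simulations  # assumes 1 TB = 1,000 GB

print(f"{gpu_hours_per_sim:.1f} GPU hours per simulation")
print(f"{gb_per_sim:.2f} GB per simulation")
```

On these assumptions the averages come out at roughly 47 GPU hours and a little under 3 GB of output per simulation.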
LLNL computer scientist Francesco Di Natale is the paper’s lead author and will present it at the conference.
Link to LLNL article: https://www.llnl.gov/news/lab-leads-effort-model-proteins-tied-cancer
Feature Art Caption: Lawrence Livermore National Laboratory researchers, along with scientists from Los Alamos National Laboratory, the National Cancer Institute and other institutions, are using machine learning as a virtual magnifying glass to study interesting regions of RAS protein/lipid simulations in higher detail. Credit: Tim Carpenter/LLNL