Simulating large biomolecules has long been challenging. Now, researchers from Los Alamos National Laboratory, RIKEN Center for Computational Science in Japan, the New Mexico Consortium, and New York University have successfully created the first billion-atom simulation of an entire gene using a new approach they devised that reduces computational costs for such large simulations.
Their work was published last week in the Journal of Computational Chemistry (“Scaling molecular dynamics beyond 100,000 processor cores for large‐scale biophysical simulations”), and an account was posted yesterday on the LANL web site. The study focused on simulating how DNA expands and contracts, a first step in showing how that activity helps control gene expression.
“It is important to understand DNA at this level of detail because we want to understand precisely how genes turn on and off,” said Karissa Sanbonmatsu, a structural biologist at Los Alamos and author of the paper, in the article on LANL’s web site. “Knowing how this happens could unlock the secrets to how many diseases occur.” It’s worth noting there is enough DNA in the human body to wrap around the Earth 2.5 million times, so it has to be compacted in a very precise and organized way.
New approaches to using accelerated systems (GPUs) and special-purpose supercomputers (Anton 1 & 2) have sped up and expanded biomolecular simulation capabilities in recent years, but those capabilities are still limited. “As a result of recent developments in hardware and software, large systems consisting of 64 and 100 million atoms can be simulated with MD,” write the researchers in their paper.
The latest work, run on LANL’s Trinity supercomputer, uses novel approaches to reduce the computational cost of running such large simulations. “For small systems, the real-space nonbonded interaction becomes the main computational challenge. Conversely, the main bottleneck moves to evaluation of reciprocal space nonbonded interactions as we increase the number of computational processes or increase the target system size,” report the researchers. “We have developed the GENESIS MD software to overcome current size limitations of MD.”
This description is from their paper’s abstract:
“A novel algorithm was implemented for nonbonded interactions to increase single instruction multiple data (SIMD) performance, reducing memory usage for ultra large systems. Memory usage for neighbor searches in real-space nonbonded interactions was reduced by approximately 80%, leading to significant speedup. Using experimental data describing physical 3D chromatin interactions, we constructed the first atomistic model of an entire gene locus (GATA4). Taken together, these developments enabled the first billion-atom simulation of an intact biomolecular complex, achieving scaling to 65,000 processes (130,000 processor cores) with 1 ns/day performance.”
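For readers unfamiliar with the neighbor searches mentioned in the abstract, here is a minimal Python sketch of a textbook cell-list search for atom pairs within a cutoff distance in a cubic periodic box. It is not the GENESIS algorithm and does not reproduce its memory optimizations; the function names, the box size, the cutoff, and the random test coordinates are all assumptions made for illustration.

```python
import itertools
import math
import random
from collections import defaultdict


def min_image_dist2(a, b, box):
    """Squared distance between a and b using the minimum-image convention."""
    d2 = 0.0
    for u, v in zip(a, b):
        d = u - v
        d -= box * round(d / box)  # wrap the separation into [-box/2, box/2]
        d2 += d * d
    return d2


def cell_list_pairs(coords, box, cutoff):
    """Return the set of pairs (i, j), i < j, closer than cutoff.

    Assumes a cubic periodic box with box >= 2 * cutoff (illustrative only).
    """
    n_cells = max(1, int(box // cutoff))  # cells per side
    cell_len = box / n_cells              # cell edge is never shorter than cutoff
    cells = defaultdict(list)
    for idx, point in enumerate(coords):
        key = tuple(math.floor(c / cell_len) % n_cells for c in point)
        cells[key].append(idx)

    cutoff2 = cutoff * cutoff
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        # Only the 27 surrounding cells can hold atoms within the cutoff.
        for dx, dy, dz in itertools.product((-1, 0, 1), repeat=3):
            key = ((cx + dx) % n_cells, (cy + dy) % n_cells, (cz + dz) % n_cells)
            for i in members:
                for j in cells.get(key, ()):
                    if i < j and min_image_dist2(coords[i], coords[j], box) < cutoff2:
                        pairs.add((i, j))
    return pairs


def brute_force_pairs(coords, box, cutoff):
    """Reference O(N^2) search, used only to check the cell-list result."""
    cutoff2 = cutoff * cutoff
    return {
        (i, j)
        for i, j in itertools.combinations(range(len(coords)), 2)
        if min_image_dist2(coords[i], coords[j], box) < cutoff2
    }


if __name__ == "__main__":
    rng = random.Random(0)
    box, cutoff = 10.0, 2.5  # arbitrary illustrative values
    atoms = [tuple(rng.uniform(0.0, box) for _ in range(3)) for _ in range(300)]
    fast = cell_list_pairs(atoms, box, cutoff)
    print(len(fast), fast == brute_force_pairs(atoms, box, cutoff))
```

The cell list limits each atom's distance checks to nearby cells instead of every other atom, which is why this kind of search scales roughly linearly with system size. The pair data it produces still grows with the number of atoms, which suggests why the memory used by neighbor searches matters at the billion-atom scale.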
“Right now, we were able to model an entire gene with the help of the Trinity supercomputer at Los Alamos,” said Anna Lappala, a polymer physicist at Los Alamos and also an author on the paper. “In the future, we’ll be able to make use of exascale supercomputers, which will give us a chance to model the full genome.”
Link to Journal of Computational Chemistry paper: https://onlinelibrary.wiley.com/doi/full/10.1002/jcc.25840
Feature image: Los Alamos National Laboratory