Feb. 27, 2020 — GROMACS, one of the most widely used HPC applications, has received a major upgrade with the release of GROMACS 2020. The new version includes exciting new performance improvements resulting from a long-term collaboration between NVIDIA and the core GROMACS developers.
As a simulation package for biomolecular systems, GROMACS evolves particles using the Newtonian equations of motion. Forces dictate movement: for example, two positively charged ions repel each other. Calculating forces is the most expensive part of the simulation because all pairs of particles can potentially interact, and simulations involve many particles.
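To make the idea concrete, here is a minimal sketch of two positively charged ions evolving under Newton's equations in one dimension. It is an illustration only, not GROMACS code: the function names `coulomb_force` and `simulate`, the reduced units (Coulomb constant, charges and masses all set to 1), and the choice of a velocity Verlet integrator are assumptions made for this example.

```python
def coulomb_force(x1, x2, q1, q2, k=1.0):
    """Force on particle 1 due to particle 2 along a 1D axis (reduced units)."""
    r = x1 - x2
    # Like charges give a force pointing away from the other particle.
    return k * q1 * q2 * r / abs(r) ** 3


def simulate(steps=100, dt=0.01):
    """Evolve two unit-mass, unit-charge ions with velocity Verlet (hypothetical setup)."""
    x = [-0.5, 0.5]   # initial positions
    v = [0.0, 0.0]    # initial velocities
    q = [1.0, 1.0]    # both ions positively charged
    m = [1.0, 1.0]    # masses

    f1 = coulomb_force(x[0], x[1], q[0], q[1])
    f = [f1, -f1]     # Newton's third law: equal and opposite forces

    for _ in range(steps):
        # Half-step velocity update, then full-step position update.
        for i in range(2):
            v[i] += 0.5 * dt * f[i] / m[i]
            x[i] += dt * v[i]
        # Recompute forces at the new positions: the expensive part in a real simulation.
        f1 = coulomb_force(x[0], x[1], q[0], q[1])
        f = [f1, -f1]
        # Second half-step velocity update with the new forces.
        for i in range(2):
            v[i] += 0.5 * dt * f[i] / m[i]
    return x, v
```

Calling `simulate()` returns the final positions and velocities; the ions end up farther apart than they started. With N particles, the force step would naively loop over every pair, which is why force calculation dominates the cost.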
GROMACS provides functionality to account for a wide range of different types of force calculations. For most simulations, the three most important classes (in terms of computational expense) are:
- Non-bonded short-range forces: Particles within a certain cutoff range are considered to interact directly.
- Particle Mesh Ewald (PME) long-range forces: For larger distances, the forces are modeled through a scheme where Fourier transforms are used to perform calculations in Fourier space. This is much cheaper than calculating all interactions directly in real space.
- Bonded short-range forces: Also required due to specific behavior of bonds between particles, for example, the harmonic potential when two covalently bonded atoms are stretched.
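Two of these classes can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not how GROMACS implements them: `pairs_within_cutoff` and `harmonic_bond_force` are hypothetical helper names, the pair search is a brute-force scan over all pairs, and PME is left out because it requires a full Fourier-space solver.

```python
import math
from itertools import combinations


def pairs_within_cutoff(positions, cutoff):
    """Return index pairs (i, j) whose separation is below the cutoff.

    Brute-force scan over all pairs, for illustration only.
    """
    return [
        (i, j)
        for (i, pi), (j, pj) in combinations(enumerate(positions), 2)
        if math.dist(pi, pj) < cutoff
    ]


def harmonic_bond_force(r, r0, k):
    """Signed restoring force along the bond for V(r) = 0.5 * k * (r - r0)**2.

    Negative when the bond is stretched (r > r0), positive when compressed.
    """
    return -k * (r - r0)


# Example usage with made-up coordinates and parameters (arbitrary units).
positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
close_pairs = pairs_within_cutoff(positions, cutoff=1.0)
stretched = harmonic_bond_force(r=1.2, r0=1.0, k=100.0)
```

Only pairs inside the cutoff contribute to the direct non-bonded sum, and a stretched bond produces a force that pulls the two atoms back toward the equilibrium length `r0`.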
In previous GROMACS releases, GPU acceleration was already supported for these force classes. The most recent addition was GPU bonded forces in the 2019 series, developed through a previous collaboration between NVIDIA and the core GROMACS developers.
However, there was still a problem. The force calculations have become so fast on modern GPUs that other parts of the simulation now account for a significant share of the computational expense, especially when you want to use multiple GPUs for a single simulation.
This post describes the new performance features available in the 2020 version that address this issue. For many typical simulations, the whole timestep can now run on the GPU, avoiding CPU and PCIe bottlenecks. Inter-GPU communication operations can now operate directly between GPU memory spaces. The results from this work demonstrate large performance improvements.
Visit the NVIDIA blog article to read the rest.
Source: Alan Gray, Nvidia Corp.