When it comes to mitigating infectious disease outbreaks such as Ebola, time is of the essence. Researchers at Virginia Bioinformatics Institute (VBI) rely on rapid-response agent-based modeling to help public health organizations determine what steps to take in the face of deadly outbreaks. Based on factors such as demographics, family structures, travel patterns and other activities, the models shed light on how a disease is progressing regionally and globally.
VBI is one of the foremost research institutions using agent-based simulations to model biological systems, from cells to people, cities and countries. To handle both data- and compute-intensive models, VBI depends on its 2,500-core compute cluster and 1 petabyte of high-performance storage from DataDirect Networks (DDN).
DDN’s GRIDScaler GPFS parallel file system is another component that enables researchers and scientists at VBI to perform rapid, accurate Ebola outbreak modeling for the U.S. Department of Defense’s Defense Threat Reduction Agency (DTRA). As VBI computational epidemiologist Caitlin Rivers asserts, “Decision makers can’t wait for the outbreak to be over in order to make their decisions.”
Rivers describes how her team was able to provide in-depth analysis for the DoD rapid-response agency in just 48 hours, helping it decide on the best locations for emergency treatment units in Sierra Leone, Guinea and Liberia.
“We received a call on Friday that the Department of Defense wanted some insight into where they should place hospital units,” says Rivers. “We were able to do simulations to optimize the amount of time that any individual would have to travel in order to reach a hospital unit. The data storage and the technology component is really critical to being able to provide that level of detail in the simulations. With Ebola, each sick person on average infects two other people and over time that’s exponential growth so a single infected person infects two people, then four people, and so on. That information had to be transmitted back to the DoD by Monday morning because a plane was about to leave with the supplies to build these units.”
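The doubling Rivers describes can be illustrated with a few lines of Python. This is only a minimal sketch, not VBI's agent-based model: it assumes a fixed average of two new infections per case, an unlimited susceptible population and no interventions, and the function names are hypothetical.

```python
from itertools import accumulate


def new_cases_per_generation(r0, generations, initial_cases=1):
    """Expected new cases in each generation, starting with generation 0.

    Assumes every case infects exactly `r0` others on average and that
    nothing (immunity, treatment, isolation) slows transmission.
    """
    return [initial_cases * r0 ** g for g in range(generations + 1)]


def cumulative_cases(r0, generations, initial_cases=1):
    """Running total of cases up to and including each generation."""
    return list(accumulate(new_cases_per_generation(r0, generations, initial_cases)))


if __name__ == "__main__":
    # One infected person, two new infections per case, five generations.
    print(new_cases_per_generation(2, 5))  # [1, 2, 4, 8, 16, 32]
    print(cumulative_cases(2, 5))          # [1, 3, 7, 15, 31, 63]
```

Even under these simplified assumptions, the total roughly doubles with every generation, which is why a turnaround of days rather than weeks matters to decision makers.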
The simulations employed anonymized “synthetic data” drawn from actual census, social, transit and telecommunications data patterns. Researchers used internally developed HPC modeling tools, including EpiFast, EpiSimdemics, and Indemics, as well as the open-source data analysis tools pandas and Python.
Beyond speed, scalability is a key factor for VBI researchers engaged in infectious disease analysis. Data is growing along multiple dimensions, and VBI is experiencing nearly 100 percent data growth year-over-year. VBI has already expanded its DDN storage system from 300 terabytes to just over 1 petabyte and plans to add more capacity.
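To see what growth of roughly 100 percent per year implies for capacity planning, the arithmetic can be sketched as follows. The starting point of 1 petabyte and the growth rate come from the figures above. The three-year horizon, the assumption that the rate stays constant and the function name are illustrative only, not a VBI or DDN forecast.

```python
def project_storage(current_pb, annual_growth, years):
    """Projected data volume in petabytes for year 0 through `years`.

    `annual_growth` is a fraction: 1.0 means 100 percent growth per year.
    Assumes the growth rate stays constant, which real workloads rarely do.
    """
    return [current_pb * (1 + annual_growth) ** y for y in range(years + 1)]


if __name__ == "__main__":
    # Start at 1 PB and double every year for three years.
    for year, volume in enumerate(project_storage(1.0, 1.0, 3)):
        print(f"year {year}: {volume:.1f} PB")
```

At that rate the data volume doubles annually, so three years of sustained growth would mean eight times the current footprint.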
“The variety of data we gather as part of our modeling process drives the incredible amount of detail within our models as well as the output of each model,” states Kevin Shinpaugh, PhD, director of IT and High Performance Computing at VBI. “With DDN storage, we’re confident we can scale data storage to address both current and future modeling demands while expediting accurate responses during an emerging crisis.”