Although we’ve yet to settle on a term for it, the convergence of HPC and a new generation of big data technologies is set to transform science. The compute-plus-data mantra reaches all the way to the White House, with President Obama’s National Strategic Computing Initiative calling for useful exascale computing and sophisticated data capabilities in service of the nation’s overarching goals of security, innovation and competitiveness.
A champion of this paradigm, the National Science Foundation has been directing its resources toward providing the infrastructure and tools necessary to advance data-driven science at multiple scales.
The NSF is making $2.42 million available for a unique facility at the University of Michigan, where a new computing resource, called ConFlux, will enable supercomputer simulations to interface with large datasets while running. According to a bold statement from the University of Michigan, “this capability will close a gap in the U.S. research computing infrastructure and place U-M at the forefront of the emerging field of data-driven physics.”
The university will provide an additional $1.04 million toward the project, which will begin with the construction of the new Center for Data-Driven Computational Physics. Among the fields that will be supported are aerodynamics, climate science, cosmology, materials science and cardiovascular research.
ConFlux will enhance traditional physics-based computer models with big data techniques. The design strategy calls for specialized supercomputing nodes matched to the needs of data-intensive operations. Enabling technologies include next-generation processors, GPUs, large memories, ultra-fast interconnects, and three petabytes of hard drive storage.
“Big data is typically associated with web analytics, social networks and online advertising. ConFlux will be a unique facility specifically designed for physical modeling using massive volumes of data,” said Barzan Mozafari, U-M assistant professor of computer science and engineering, who will oversee the implementation of the new computing technology.
When very complex, physics-based computer models involve scales beyond the reach of modern computing, approximations must be employed. Researchers have developed sophisticated techniques to manage the tradeoffs, but accuracy is still sacrificed. ConFlux will use machine learning algorithms to create more reliable models, trained on a mix of results from scaled-down models and observational and experimental data.
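To make the idea concrete, here is a minimal sketch of that kind of discrepancy learning in Python: a machine learning model is trained to correct the output of a cheap, approximate simulation using higher-fidelity reference data. The function names and synthetic data are illustrative assumptions, not the ConFlux team’s actual workflow.

```python
# Illustrative sketch only (not ConFlux code): learn the discrepancy between a
# cheap approximate model and higher-fidelity "truth" data, then use the learned
# correction to improve predictions at new inputs.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def coarse_model(x):
    """Cheap, approximate physics model: misses a nonlinear term (assumed)."""
    return 2.0 * x

def true_system(x):
    """Stand-in for experiments or a high-resolution simulation (assumed)."""
    return 2.0 * x + 0.5 * np.sin(3.0 * x)

# Inputs where high-fidelity data are available.
x_train = rng.uniform(0.0, 2.0 * np.pi, size=(200, 1))
error = true_system(x_train) - coarse_model(x_train)   # model-form error

# Train a regressor to predict the coarse model's error.
correction = RandomForestRegressor(n_estimators=100, random_state=0)
correction.fit(x_train, error.ravel())

# Corrected prediction at new points: cheap model plus learned correction.
x_new = np.linspace(0.0, 2.0 * np.pi, 50).reshape(-1, 1)
y_pred = coarse_model(x_new).ravel() + correction.predict(x_new)
print("max error of corrected model:",
      np.max(np.abs(y_pred - true_system(x_new).ravel())))
```

The design choice here is the one the article describes: keep the physics-based model as the backbone and let the data-driven component account only for what the approximation misses.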
Five main studies are included in the grant. All five deal with issues of scale and will be the first to use the new system.
Cardiovascular disease: Noninvasive imaging such as MRI and CT scans could enable doctors to deduce the stiffness of a patient’s arteries, a strong predictor of diseases such as hypertension. By combining the scan results with a physical model of blood flow, doctors could estimate artery stiffness within an hour of the scan. The study is led by Alberto Figueroa, the Edward B. Diethrich M.D. Research Professor of Biomedical Engineering and Vascular Surgery.
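As a rough illustration of that kind of model-data fusion, the sketch below fits a single compliance parameter (inversely related to stiffness) of a toy two-element Windkessel blood-flow model to a synthetic “measured” pressure waveform. All quantities and the model itself are simplified assumptions for illustration; the actual study will use far more detailed physiological models and imaging data.

```python
# Illustrative inverse problem: tune a stiffness-related parameter of a simple
# blood-flow model so its predicted pressure matches measured data.
import numpy as np
from scipy.optimize import least_squares

R = 1.0                                   # peripheral resistance (arbitrary units)
t = np.linspace(0.0, 1.0, 100)            # one cardiac cycle (assumed)
flow = np.maximum(np.sin(2.0 * np.pi * t), 0.0)   # idealized inflow waveform

def simulate_pressure(compliance):
    """Integrate dP/dt = (Q - P/R) / C with forward Euler."""
    p = np.zeros_like(t)
    p[0] = 1.0
    dt = t[1] - t[0]
    for i in range(1, len(t)):
        dpdt = (flow[i - 1] - p[i - 1] / R) / compliance
        p[i] = p[i - 1] + dt * dpdt
    return p

# Synthetic "measurement" generated with a known compliance, plus noise.
true_compliance = 0.4
measured = simulate_pressure(true_compliance) \
    + 0.01 * np.random.default_rng(1).normal(size=t.size)

# Estimate compliance by fitting the model output to the measured waveform.
fit = least_squares(lambda c: simulate_pressure(c[0]) - measured,
                    x0=[1.0], bounds=(1e-3, 10.0))
print("estimated compliance:", fit.x[0])
```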
Turbulence: When a flow of air or water breaks up into swirls and eddies, the pure physics equations become too complex to solve. But more accurate turbulence simulation would speed up the development of more efficient airplane designs. It would also improve weather forecasting, climate science and other fields that involve the flow of liquids or gases. Duraisamy leads this project.
Clouds, rainfall and climate: Clouds play a central role in whether the atmosphere retains or releases heat. Wind, temperature, land use and particulates such as smoke, pollen and air pollution all affect cloud formation and precipitation. Derek Posselt, associate professor of climate and space sciences and engineering, and his team plan to use computer models to determine how clouds and precipitation respond to changes in the climate in particular regions and seasons.
Dark matter and dark energy: Dark matter and dark energy are estimated to make up about 96 percent of the universe. Galaxies should trace the invisible structure of dark matter that stretches across the universe, but the formation of galaxies plays by additional rules—it’s not as simple as connecting the dots. Simulations of galaxy formation, informed by data from large galaxy-mapping studies, should better represent the roles of dark matter and dark energy in the history of the universe. August Evrard and Christopher Miller, professors of physics and astronomy, lead this study.
Material property prediction: Materials scientists would like to predict a material’s properties based on its chemical composition and structure, but supercomputers aren’t powerful enough to scale atom-level interactions up to bulk qualities such as strength, brittleness or chemical stability. An effort led by Krishna Garikipati and Vikram Gavini, professor and associate professor of mechanical engineering, respectively, will combine existing theories with data on material structure and properties.
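Purely as an illustration of blending physics-derived descriptors with data, the hypothetical sketch below fits a regression model that maps two made-up material descriptors to a bulk property. The features, dataset and model are all assumptions; the real effort will involve far more sophisticated theory and much richer data.

```python
# Illustrative sketch: regress a bulk property on simple, hypothetical
# material descriptors (alloying fraction, lattice spacing).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)

# Hypothetical dataset: [alloying fraction, lattice spacing] -> yield strength.
X = rng.uniform([0.0, 2.5], [0.5, 4.0], size=(300, 2))
y = 120.0 + 400.0 * X[:, 0] - 30.0 * (X[:, 1] - 3.0) ** 2 + rng.normal(0, 5, 300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = Ridge(alpha=1.0).fit(X_train, y_train)
print("held-out R^2:", model.score(X_test, y_test))
```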