The United States is arguably in the midst of a health care crisis, but there is hope on the horizon and it involves learning how to make sense of big data. Over at Communications of the ACM, Oak Ridge National Laboratory (ORNL) shares how it is helping the health care industry benefit from patient data using the power of graph computing.
About four years ago, researchers at the lab saw an opportunity to apply their data science and computing expertise to health care. The project is unusual in that it draws on three different high-performance computing architectures. The multicore Cray XK7 supercomputer Titan, currently the second-most powerful computer in the world, is being used to simulate outcomes of interventions. Apollo, the in-memory graph-computing Urika appliance also built by Cray, is enabling actionable pattern discovery. And cloud computing machines with distributed storage are providing further analysis.
One challenge of the American health care system is the tendency for data to end up in silos, which by their nature are not easily joined. Another is the patient data itself: it comes in a variety of formats, both structured and unstructured, and often in massive volumes.
“ORNL computing experts found a better approach in scalable graph computing,” notes the author of the ACM article, “which allows for detailed analysis and discovery of relationships hiding in large quantities of data. By organizing health care data into relationship graphs (linked structures of interacting entities), researchers were able to mine and understand complex patterns of relationships and behaviors in health care delivery.”
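To make the idea of a relationship graph concrete, here is a minimal sketch in Python. It is not ORNL's code or the Urika appliance's interface; the record types, the names (`referrals`, `prescriptions`, `dr_a`, `drug_x`) and the helper functions are all invented for illustration. It assumes that records from two separate silos can each be reduced to pairs of related entities, merges them into one graph, and then asks which entities sit within a few hops of a given provider.

```python
from collections import defaultdict

# Hypothetical records from two separate "silos"; every name is invented.
referrals = [("dr_a", "dr_b"), ("dr_b", "dr_c")]          # provider -> provider
prescriptions = [("dr_a", "drug_x"), ("dr_c", "drug_x")]  # provider -> drug


def build_graph(*edge_lists):
    """Merge several edge lists into one undirected adjacency map."""
    graph = defaultdict(set)
    for edges in edge_lists:
        for src, dst in edges:
            graph[src].add(dst)
            graph[dst].add(src)
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first search: entities reachable in at most max_hops edges."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Expand one level, keeping only entities not visited before.
        frontier = {n for node in frontier for n in graph[node]} - seen
        seen |= frontier
    return seen - {start}


graph = build_graph(referrals, prescriptions)
print(sorted(within_hops(graph, "dr_a", 1)))
print(sorted(within_hops(graph, "dr_a", 2)))
```

Once the silos are expressed as edges, joining them is just a matter of adding them to the same graph, and an indirect relationship (here, two providers linked through a shared drug) shows up as a short path rather than a multi-table join.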
The team drew on publicly available datasets from several sources, including The Cancer Genome Atlas, clinicaltrials.gov, Semantic MEDLINE, openFDA, DocGraph and the National Plan and Provider Enumeration System, as well as data from clinical partnerships.
Graph computing was highly effective at finding waste and fraudulent activity within the federal health care system. In one case, it uncovered a type of identity fraud in which a health care provider created multiple identities to bill patients. In another example, the system was able to predict which providers were the riskiest, based on their associations.
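The two examples above can be sketched in a few lines of Python. This is an illustration only, not the method described in the ACM article, which does not spell out its algorithms. The sketch assumes that duplicate identities betray themselves through shared attributes such as a phone number or address, and that "risk by association" can be approximated as the share of a provider's neighbors that are already flagged. All identifiers, attributes and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical provider records; every identifier and attribute is invented.
providers = {
    "id_1": {"phone": "555-0100", "address": "1 Main St"},
    "id_2": {"phone": "555-0100", "address": "9 Oak Ave"},
    "id_3": {"phone": "555-0199", "address": "9 Oak Ave"},
    "id_4": {"phone": "555-0142", "address": "3 Elm Rd"},
}


def linked_identities(records):
    """Group provider IDs connected through any shared attribute value."""
    parent = {pid: pid for pid in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_owner = {}
    for pid, attrs in records.items():
        for key, value in attrs.items():
            # Link this ID to the first ID seen with the same attribute value.
            owner = first_owner.setdefault((key, value), pid)
            parent[find(pid)] = find(owner)

    groups = defaultdict(set)
    for pid in records:
        groups[find(pid)].add(pid)
    # Only groups with more than one ID are candidate duplicate identities.
    return [group for group in groups.values() if len(group) > 1]


def association_risk(graph, flagged):
    """Fraction of each node's neighbors that are already flagged."""
    return {
        node: len(neighbors & flagged) / len(neighbors)
        for node, neighbors in graph.items()
        if neighbors
    }


# Hypothetical association graph between providers.
associations = {"p1": {"p2", "p3"}, "p2": {"p1"}, "p3": {"p1"}, "p4": set()}

print(linked_identities(providers))
print(association_risk(associations, flagged={"p2"}))
```

In the toy data, `id_1` and `id_3` share nothing directly, yet both connect to `id_2`, so all three land in one group. That transitive linking is what makes a graph view useful here; a real system would need far more careful matching and scoring than this.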
Finding ways to extract meaningful information from patient data is a critical step toward making health care effective, efficient, affordable and sustainable. The ORNL project shows how health care data can be sliced and diced in unusual ways for pattern discovery and predictive modeling, highlighting areas that work well and those that require attention. ORNL’s Health Data Sciences Institute (HDSI), the group behind the effort, anticipates that these methods and tools will be beneficial for other partners as well, in fields ranging from genomics to electronic health records to health-sensor data.
The original article is available at Communications of the ACM.