The effort to attack cancer with HPC resources has been growing for years. Indeed, it’s accurate to say the sequencing of the human genome was as much a tour de force of HPC as of the new DNA sequencers. Back in June, Department of Energy Secretary Ernest Moniz blogged on the effort (“Supercomputers are key to the Cancer Moonshot”), and in a few weeks the opening panel at SC16 will focus on precision medicine.
Among the panelists is Warren Kibbe, director of the Center for Biomedical Informatics and Information Technology (CBIIT) at the National Cancer Institute (NCI), which he helped establish. Steve Conway of IDC will moderate the panel on Monday evening (November 14, 2016), titled “HPC and Precision Medicine: Researchers Are on the Brink of Finding Cures to Cancer and Other Deadly Diseases within This Generation but Only with the Power of HPC.”
Part of the NCI effort involves changing the way diverse, geographically dispersed researchers work. NCI has three Cancer Genomics Cloud Pilots, begun two years ago and recently extended for another year, centered at the Broad Institute, the Institute for Systems Biology (ISB), and Seven Bridges Genomics. The cloud pilots work in conjunction with NCI’s Genomic Data Commons (GDC) initiative, a data sharing platform that promotes precision medicine in oncology. “It is not just a database or a tool; it is an expandable knowledge network supporting the import and standardization of genomic and clinical data from cancer research programs,” according to NCI.
The GDC contains NCI-generated data from some of the largest and most comprehensive cancer genomic datasets, including The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Therapies (TARGET). For the first time, these datasets have been harmonized using a common set of bioinformatics pipelines, so that the data can be directly compared.
Below is an excerpt from NCI’s description of the Cancer Genomics Cloud Pilots:
- “The traditional model for analyzing genomic data involves individual researchers downloading data stored at a variety of locations, adding their own data, attempting to harmonize the data, and then computing over these data on local hardware. While this model has been successful for many years, it has become unsustainable given the enormous growth of biomedical data due to the prevalent use of next-generation sequencing technology in large scientific programs. The size of the data makes access and analysis difficult for anyone but the best-resourced institutions, in terms of both storage and computing capability…
- “Key design principles for the CGC Pilots include: APIs for secure tool and data access, usability for biologists and clinicians as well as bioinformaticists and application developers, scalability, sustainability, extensibility to new data types without major refactoring, and open source, non-viral software licenses.”
“All three CGC Pilots have chosen to implement their systems through commercial cloud providers – AWS and Google – and are collaborating on adopting common standards. Beyond these commonalities, the three project teams have distinct system designs, data presentation, and analysis resources to serve the cancer research community.”
Moniz’s June blog post, though focused on supercomputers, captures the role of HPC in medical research: “Supercomputers are key to the Cancer Moonshot. These exceptionally high-powered machines have the potential to greatly accelerate the development of cancer therapies by finding patterns in massive datasets too large for human analysis. Supercomputers can help us better understand the complexity of cancer development, identify novel and effective treatments, and help elucidate patterns in vast and complex data sets that advance our understanding of cancer.”
The panel at SC16, which will tackle precision medicine broadly, should be fascinating. Joining Kibbe on the panel are Mitchell Cohen, UCSF and University of Colorado; Marti Head, Senior Director, GlaxoSmithKline; Dimitri Kusnezov, Chief Scientist and Senior Advisor to the Secretary, Department of Energy, NNSA; and Steve Scott, Chief Technology Officer, Cray.