HPCwire

Since 1986 - Covering the Fastest Computers
in the World and the People Who Run Them

Carnegie Mellon to Develop Machine Learning Tools to Analyze Genomes


Ion Torrent sponsors multi-university research effort

PITTSBURGH, Pa., Jan. 10 — Scientists at Carnegie Mellon University say advanced computational tools will be the key to a new research project that, if successful, could enable doctors to routinely use information extracted from a patient’s DNA to diagnose and guide treatment of diseases.

Ion Torrent, a unit of Life Technologies Corporation (NASDAQ: LIFE), is sponsoring the project during its first year, and more funding is expected to come through federal grants and other sources.  Robert F. Murphy, director of the Lane Center for Computational Biology in Carnegie Mellon’s School of Computer Science, will lead a multidisciplinary team of researchers that will collaborate with scientists at the Baylor College of Medicine and Yale University.

The ultimate dream, Murphy said, is to develop what Ion Torrent Founder and CEO Jonathan M. Rothberg dubbed “doctor in a box” software. Doctor-in-a-box would take a patient’s DNA sequence and use it to diagnose disease, identify a patient’s susceptibility to disease, and predict which therapies might be most effective or cause the fewest side effects. The size and complexity of the human genome, which was first sequenced in its entirety in 2003, have stymied efforts to date to create such software.

“There’s just way too much information for doctors to make sense of it all,” Murphy said. But new machine learning tools, statistically driven software that can detect associations within mountains of data, may soon be able to translate the genetic and other hereditary information encoded in the human genome into a form that is clinically relevant to doctors and patients, he added. His team isn’t the first to use machine learning to analyze whole genomes, but it will employ some unique software developed at Carnegie Mellon.

The Lane Center includes a number of faculty who are leaders in aspects of the problem, including Eric Xing, Ziv Bar-Joseph, Kathryn Roeder, Russell Schwartz and Seyoung Kim.

The team’s software will be trained specifically to analyze the type of whole-genome sequence data produced by Ion Torrent’s unique sequencing technology, which is ideal for clinical applications because it is designed to sequence the entire human genome in a day for just $1,000. Up to now, routine clinical use of whole genome sequencing has been impractical because it’s taken weeks to complete at a cost of about $10,000. Now that Ion Torrent can reduce the time and expense, the next step is creating a tool to enable doctors to easily integrate whole genome sequencing into medical practice, Rothberg said.

“The promise of ‘doctor-in-a-box’ is that by using artificial intelligence, like we’ve seen with IBM’s ‘Watson’ computer, we will be able to associate the variations in the human genome with the vast amount of information we have about human health,” said Rothberg (E’85). “The work the Carnegie Mellon team is undertaking opens up the possibility that practicing physicians will be able to diagnose disease, identify disease susceptibility and guide therapy selection as easily as they can now use Apple’s Siri on the iPhone.”

“It’s an enormous undertaking,” Murphy agreed, “but we are creating a framework that will allow us to tackle this problem one piece at a time and to do so at a scale that makes sense when all of those pieces are put together.”

The sheer size of the problem necessitates collaboration with other groups trying to understand the genome, so Murphy said the team intends to make its software available as open source.

During the first year, researchers will focus on identifying the genomic features associated with a single disease or patient population, which has yet to be selected. Researchers at Baylor’s Human Genome Sequencing Center and Yale’s Center for Genome Analysis will perform the whole genome sequencing of patients and provide longitudinal medical records, such as disease treatments and outcomes and results of clinical tests.

These data, scrubbed of patient-identifying details, will be analyzed by the Carnegie Mellon researchers, who include biologists, statisticians and computational biologists, as well as other computer scientists. Machine learning programs will tease out the relationships between the genomic data and the clinical outcomes for each of the anonymous patients, while incorporating information from biomedical literature regarding gene and protein expression and disease pathways.

This analysis will yield models based on personal genome sequences that can be used to predict disease susceptibility and treatment responsiveness, as well as choose preventive therapies.

To provide impetus to the research program, Rothberg will sponsor an “Analyzing the $1,000 Genome” Conference to be held at Carnegie Mellon sometime in the summer or fall of 2012. The scientific conference will highlight outstanding work on computational analysis of genome sequences and foster discussion of new directions and strategies for extending this research.

In addition to Murphy, the research program leadership includes Jaime Carbonell, director of CMU’s Language Technologies Institute; Tom Mitchell, director of CMU’s Machine Learning Department; Richard Gibbs, director of Baylor’s sequencing center; and Shrikant Mane, director of Yale’s genome center.

Rothberg also established the Rothberg Research Awards in Human Brain Imaging at Carnegie Mellon to support the university’s faculty and students in creatively pushing research boundaries in how the brain thinks, learns and ages.

About Carnegie Mellon University

Carnegie Mellon (www.cmu.edu) is a private, internationally ranked research university with programs in areas ranging from science, technology and business, to public policy, the humanities and the arts. More than 11,000 students in the university’s seven schools and colleges benefit from a small student-to-faculty ratio and an education characterized by its focus on creating and implementing solutions for real problems, interdisciplinary collaboration and innovation. A global university, Carnegie Mellon has its main campus in Pittsburgh, Pa., campuses in California’s Silicon Valley and Qatar, and programs in Asia, Australia, Europe and Mexico. The university is in the midst of a $1 billion fundraising campaign, titled “Inspire Innovation: The Campaign for Carnegie Mellon University,” which aims to build its endowment, support faculty, students and innovative research, and enhance the physical campus with equipment and facility improvements.

-----

Source: Carnegie Mellon University
