The HPC community has been following IBM’s Watson technology since a semi-personified version of the analytics machine became a winning Jeopardy contestant in 2011. Since then, Watson has been SaaS-ified, cloud-enabled, and sent to medical school. Most recently the technology popped up in a tool called KnIT (Knowledge Integration Toolkit). Developed by IBM in partnership with Baylor College of Medicine, this prototype system scours the available scientific literature on a given topic to find hidden relationships in the data.
KnIT helped Baylor researchers identify six new proteins to target for cancer research. Considering that in the last 30 years, scientists have uncovered 28 protein targets, the fact that the Baylor team found half a dozen in a month is an impressive feat.
It’s not that humans couldn’t do what KnIT or Watson does; it’s just that machines can do it so much faster. In a just-published paper, the researchers conclude that society is better at amassing new data than at analyzing what it already has. “This leads to deep inefficiencies in translating research into progress for humanity,” they write.
Consider the sheer number of papers that are published: about 1.5 million each year, growing by about 5 percent annually. The KnIT system employs Watson technology to mine for previously unseen connections in these massive text archives. It then creates graph-based visualizations and suggests hypotheses to help the researchers identify promising targets.
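To make the idea of mining text for unseen connections concrete, here is a toy sketch in Python. It is not the Watson or KnIT method, which the article does not detail. It assumes a simple shared-neighbor heuristic: entities mentioned in the same abstract are linked, and an entity that is not linked to the target directly but shares a neighbor with it is flagged as a candidate. The abstracts, the entity names (KinaseA, KinaseB, KinaseC), and both function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy abstracts and entity names, invented for illustration only.
ABSTRACTS = [
    "KinaseA phosphorylates p53 after DNA damage",
    "KinaseA and KinaseB share a conserved domain",
    "KinaseB activity rises in tumor cells",
    "KinaseC regulates glucose uptake",
]
ENTITIES = ["p53", "KinaseA", "KinaseB", "KinaseC"]


def cooccurrence_graph(abstracts, entities):
    """Count how often each pair of entities appears in the same abstract."""
    edges = Counter()
    for text in abstracts:
        tokens = set(text.split())
        present = sorted(e for e in entities if e in tokens)
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def indirect_candidates(edges, target):
    """List entities with no direct link to target that share a neighbor with it."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    direct = neighbors.get(target, set())
    candidates = set()
    for n in direct:
        candidates |= neighbors.get(n, set())
    return sorted(candidates - direct - {target})


if __name__ == "__main__":
    graph = cooccurrence_graph(ABSTRACTS, ENTITIES)
    print(indirect_candidates(graph, "p53"))
```

In this toy data, KinaseB never appears alongside p53 but does appear alongside KinaseA, which does, so it surfaces as a suggestion for a researcher to examine. A real system would need entity recognition, far richer text features, and some way to rank candidates.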
For this study, KnIT analyzed millions of papers that mentioned p53, a tumor suppressor associated with half of all cancers. The Baylor team was interested in a class of enzymes, called kinases, that can interact with p53 by switching it on and off. KnIT was tasked with searching for undiscovered p53 kinases, which could provide pathways to new cancer drugs.
The study, which first employed retrospective analysis to demonstrate the accuracy of the approach, identified six new kinases implicated in p53 activity.
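The article does not spell out how the retrospective analysis was run. One common way to do this kind of check is to restrict the system to literature published up to a cutoff year, rank candidates from that older material alone, and then count how many top-ranked candidates were confirmed afterwards. The sketch below assumes that protocol; the record layout, the candidate names, the cutoff, and the function names are all hypothetical.

```python
def split_by_cutoff(papers, cutoff_year):
    """Separate papers into those published up to the cutoff and those after it."""
    before = [p for p in papers if p["year"] <= cutoff_year]
    after = [p for p in papers if p["year"] > cutoff_year]
    return before, after


def precision_at_k(ranked, confirmed_later, k):
    """Fraction of the top-k ranked candidates that were confirmed after the cutoff."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for c in top if c in confirmed_later) / len(top)


if __name__ == "__main__":
    # Hypothetical records, invented for illustration only.
    papers = [
        {"year": 2001, "title": "older paper"},
        {"year": 2009, "title": "newer paper"},
    ]
    before, after = split_by_cutoff(papers, cutoff_year=2003)
    print(len(before), len(after))

    # Hypothetical ranking built only from pre-cutoff papers.
    ranked = ["KinaseB", "KinaseD", "KinaseC"]
    confirmed_later = {"KinaseB", "KinaseC"}
    print(precision_at_k(ranked, confirmed_later, k=2))
```

A high score on such a test suggests, but does not prove, that new suggestions drawn from the full literature are worth following up in the lab.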
In addition to expanding its use for cancer research, the research team is considering how the tool can be applied to other areas of biology, such as personalized medicine. There may also be a future for KnIT in other scientific domains, like physics, although mining equations rather than text would significantly up the challenge factor.
The researchers are clear that the Watson-based tool is not replacing scientists; rather, by pointing out the interesting places to look, it is helping to accelerate the discovery process.