February 14, 2011

Open Source Key to Jeopardy Supercomputer Battle

Nicole Hemsoth

Ahead of tonight’s much-anticipated event, in which IBM’s Watson supercomputer takes on human champions Ken Jennings and Brad Rutter, the Apache Software Foundation shed light on the role of its open source software, Hadoop, which gives order to Watson’s “brain.”

Hadoop is most frequently discussed in HPC in the Cloud because of its relevance to cloud computing; however, it is worth noting that the software is also playing a key role in tonight’s championship, serving to “enable Watson to access, sort, and process data in a massively parallel system (90+ server cluster/2,880 processor cores, 16 terabytes of RAM/4 terabytes of disk storage).”

Hadoop is open source software that supports complex parallel processing tasks by providing the software framework that lets distributed, data-intensive applications work across thousands of nodes and petabytes of data. As the IBM supercomputer battles its human opponents on Jeopardy, powerful software is required to manage the complex demands of sorting through data.

Apache hopes to show that open source software is capable of handling intense workloads and, like IBM, is glad for the opportunity to expose mainstream audiences to concepts that might be just on the edge of their notice, including supercomputer capabilities and open source packages they might never have heard of, even if their business could benefit from one or both.

During the Jeopardy competition, IBM’s supercomputer, Watson, will be humming away at 80 trillion operations per second, instantly accessing vast libraries of information and applying millions of logic rules to arrive at correct responses to questions posed in natural speech.

Jim Jagielski, President of the Apache Software Foundation, remarked that the Jeopardy showdown demonstrates the power of open source software products, noting that “the success and influence of Watson clearly shows that open source in general, and specifically open source software developed and released by the Apache Software Foundation, is deeply entwined in all layers and aspects of technology…” and that it is “part of computing and information technology DNA, forming complete or integral solutions to advanced problems.”

