Yahoo! Inc., a leading global Internet company, today announced that it will be the first in the industry to launch an open source program aimed at advancing the research and development of systems software for distributed computing. Yahoo’s program is intended to leverage its leadership in Hadoop, an open source distributed computing sub-project of the Apache Software Foundation, to enable researchers to modify and evaluate the systems software running on a 4,000-processor supercomputer provided by Yahoo. Unlike other companies and traditional supercomputing centers, which focus on providing users with computers for running applications and for coursework, Yahoo’s program focuses on pushing the boundaries of large-scale systems software research.
Currently, academic researchers lack the hardware and software infrastructure to support Internet-scale systems software research. To date, Yahoo has been the primary contributor to Hadoop, an open source distributed file system and parallel execution environment that enables its users to process massive amounts of data. Hadoop has been adopted by many groups and is the software of choice for supporting university coursework in Internet-scale computing. Researchers have been eager to collaborate with Yahoo and tap the company’s technical leadership in Hadoop-related systems software research and development.
As a key part of the program, Yahoo intends to make Hadoop available in a supercomputing-class datacenter to the academic community for systems software research. Yahoo’s supercomputing cluster, called M45 after one of the best-known open star clusters, has approximately 4,000 processors, 3 TB of memory, 1.5 petabytes of disk storage, and a peak performance of more than 27 teraflops, placing it among the 50 fastest supercomputers in the world.
M45 is expected to run the latest version of Hadoop and other state-of-the-art, Yahoo-supported open source distributed computing software, such as the Pig parallel programming language developed by Yahoo Research, the central advanced research organization of Yahoo! Inc.
Carnegie Mellon University will be the first institution to take advantage of Yahoo’s M45. Leading systems software researchers Garth Gibson and Greg Ganger, both professors at Carnegie Mellon, will instrument the system and evaluate its performance. Simultaneously, Carnegie Mellon computer science professors Jamie Callan and Christos Faloutsos, academic leaders in text and Web mining, will solve challenging information retrieval and large-scale graph problems on the cluster. Carnegie Mellon faculty members Alexei Efros, Noah Smith, and Stephan Vogel will also use the cluster to tackle large-scale computer graphics, natural language processing, and machine translation problems, respectively. In the future, Yahoo plans to make M45 available to researchers from other universities for open, collaborative research.
“Hadoop has become an important computing environment for data-intensive applications and Yahoo is playing a leading role in its development. We are excited about collaborating with Yahoo on systems software research, helping to advance the state-of-the-art, and creating new research possibilities in this critical area,” said Randall E. Bryant, dean of the School of Computer Science at Carnegie Mellon. “We look forward to working with Yahoo and jointly contributing back to the open source community.”
“Yahoo is dedicated to working with leading universities to solve some of the most critical computing challenges facing our industry,” said Ron Brachman, vice president and head of Yahoo academic relations. “Launching this program and M45 is a significant milestone in creating a global, collaborative research community working to advance the new sciences of the Internet. This milestone is a key element of Yahoo’s growing Academic Relations effort.”
Yahoo! Inc. is a leading global Internet brand and one of the most trafficked Internet destinations worldwide. Yahoo is focused on powering its communities of users, advertisers, publishers, and developers by creating indispensable experiences built on trust. Yahoo is headquartered in Sunnyvale, Calif. For more information, visit pressroom.yahoo.com or the company’s blog, Yodel Anecdotal.