HPCwire

Since 1986 - Covering the Fastest Computers
in the World and the People Who Run Them


Yahoo Partners with Top Universities for Cloud Research


University of California at Berkeley, Cornell University and the University of Massachusetts at Amherst join Carnegie Mellon University to take advantage of Yahoo's cloud computing resources

SUNNYVALE, Calif., April 9 -- Yahoo! Inc., a leading global Internet company, today announced it has expanded its partnerships with top U.S. universities to advance cloud computing research. The University of California at Berkeley, Cornell University and the University of Massachusetts at Amherst will join Carnegie Mellon University in using Yahoo's cloud computing cluster to conduct large-scale systems software research and explore new applications that analyze Internet-scale data sets, ranging from voting records to online news sources.

To date, academic researchers have had limited access to Internet-scale supercomputers for conducting systems and applications research. To help overcome this obstacle, Yahoo is granting these four universities access to the Yahoo cloud computing cluster. The Yahoo cluster, also known as M45, has been operational since November 2007 and has been in use by Carnegie Mellon. The cluster has approximately 4,000 processor cores and 1.5 petabytes of disk storage.

"We have been using the Yahoo cluster for more than a year now and have made significant progress in a number of key research areas, resulting in the publication of more than two dozen academic papers," said Randal E. Bryant, dean of the School of Computer Science at Carnegie Mellon. "Our researchers were able to extract and process documents from the Web in a way that was not possible before, changing the way we think about research problems. We were also able to conduct research over a corpus of 200 million Web pages, processing two orders of magnitude more data. We conducted systems software research, comparing, for example, the performance of the Hadoop file system and other parallel file systems. The simultaneous access to applications and systems software has been a real benefit and we look forward to our continued partnership with Yahoo and joint contributions to the cloud computing community."

Yahoo's M45 cluster runs Hadoop, an open source distributed file system and parallel execution environment that enables its users to process massive amounts of data. Apache Hadoop is an open source project of the Apache Software Foundation, to which Yahoo engineers have been the primary contributors to date.

"Hadoop powers many of our most broadly used and complex systems at Yahoo, from Web search to optimizing content for the home page," said Shelton Shugar, senior vice president of cloud computing at Yahoo. "Continuing to invest in the open source community and in technologies like Hadoop is an important element in our efforts to drive breakthroughs in Internet-scale computing and ultimately to continually improve the quality of the consumer experience of Yahoo. By partnering with these top educational institutions to share our M45 cluster and our technical expertise, we hope to further key insights into the next generation of systems software research and development."

"We are very excited about the new research partnership with Yahoo," said Shankar Sastry, dean of the College of Engineering at the University of California, Berkeley. "Access to the cluster is a first step in helping us analyze the vast amounts of societal-scale information available on the Web, such as voting records, online news sources and polling data. The Yahoo cluster will also enable us to conduct computationally intensive econometrics research, combining economic theory with statistics to analyze and test large-scale economic relationships."

"Our partnership with Yahoo will enable us to attack problems ranging from wildlife preservation and biodiversity, to balancing socio-economic needs and the environment, to large-scale deployment and management of renewable energy sources," said Bob Constable, dean of the faculty of Computing and Information Science at Cornell University. "We recently established the Institute of Computational Sustainability at Cornell to focus on computational problems in these areas, and Yahoo's cluster will help us solve large-scale optimization and machine learning problems to find better ways to manage our natural resources."

"Our vision is to improve upon current technology through the processing of large data sets," said Jim Kurose, dean of the College of Natural Sciences and Mathematics at the University of Massachusetts, Amherst. "Yahoo's supercomputing cluster will enable us to do data-intensive research on a large set of scanned books drawn from the Internet Archive's million-book collection. The latter includes 8.5 terabytes of text and half a petabyte of scanned images. Research on such large datasets would not be possible without the use of clusters like the one Yahoo is offering us access to."

The partnership with these universities is the next step in expanding Yahoo's leadership in supporting cloud computing research. In July 2008, Yahoo joined forces with HP, Intel, the University of Illinois at Urbana-Champaign, the Infocomm Development Authority (IDA) in Singapore, and the Karlsruhe Institute of Technology (KIT) in Germany to create Open Cirrus, a global, multi-data center, open source testbed for advancing cloud computing research and education. The partnership with Illinois also includes the National Science Foundation, creating a cloud computing cluster that is available across the NSF academic community. The international partnership promotes open collaboration among industry, academia and governments by removing the financial and logistical barriers to research in data-intensive, Internet-scale computing. Because the Yahoo M45 cluster is part of the Open Cirrus cloud computing testbed, the above universities will also gain access to, and become part of, the Open Cirrus community.

"Yahoo is dedicated to working with leading universities to solve some of the most critical computing challenges facing our industry," said Ron Brachman, vice president and head of Yahoo Academic Relations. "The ability to access and analyze massive data sets is becoming increasingly crucial to the advancement of Internet-related computer science and cross-disciplinary research. By expanding our university-facing cloud computing program to partner with more universities, we hope to catalyze data-intensive computing research, furthering our commitment to the global, collaborative research community advancing the new sciences of the Internet."

About Yahoo!

Yahoo! Inc. (Nasdaq:YHOO) is a leading global Internet brand and one of the most trafficked Internet destinations worldwide. Yahoo! is focused on powering its communities of users, advertisers, publishers, and developers by creating indispensable experiences built on trust. Yahoo! is headquartered in Sunnyvale, Calif. For more information, visit pressroom.yahoo.com or the company's blog, Yodel Anecdotal.

-----

Source: Yahoo! Inc.



