In the world of distributed computing, the big story of the past week has to be the joint Google-IBM project aimed at teaching Web-scale parallel computing technologies to university students. The IT blogosphere has been inundated with entries on the subject, general-purpose IT magazines have picked up the story, and “cloud computing” has become buzzworthy almost overnight. Given how much heat this announcement is generating, I thought I had better get connected with one of the big players involved and find out just what Google and IBM hope to get out of this project.
With this in mind, I spoke with Dennis Quan, CTO of IBM’s high performance on-demand solutions group, who filled me in on the details. Quan said that IBM and Google have been collaborating on the project for almost a year now, with the main purpose of the virtual lab being to get university students on board with the massively scalable, parallel computing techniques driving today’s most popular Web 2.0 applications. Traditionally, computer science students wouldn’t be exposed to parallel programming until their graduate studies, if at all. “In order for us to enable the creation of these new applications,” he said, “there’s a paradigm shift that’s going on in the way people need to learn about how to develop software.”
Part of the reason for this lack of early-stage education (something Quan said has been improving over the past year or two) is that students at most universities have access to only limited compute resources. With this project, however, students will have access to hundreds of processors (a number expected to grow substantially, up to 1,600 processors) located within the datacenters of both Google and IBM. It is important, noted Quan, to be able to write applications with massive parallelism in mind because all of the key elements of Web 2.0 sites, from hosting videos to doing visitor analysis for advertisers, require a high degree of access to resources. With Google’s expertise on this side of things combined with IBM’s experience in high-volume transaction processing and large-scale computing infrastructures, Quan believes students working with the project will get the best of both worlds.
When it comes to seeing tangible results from this project, Quan feels that we can expect to see some unique applications of the technology both sooner and later. While doing a guest lecture at the University of Washington (a member in the project’s current pilot phase) in March, Quan said students who “really had no exposure to this kind of technology in the past … were extremely eager to play with it — beyond the requirement to get the homework done,” adding that students will apply their imaginations to the technology and come up with some new implementations. Already at the University of Washington, the technology has been applied to areas from biology to spam detection. “Of course,” Quan predicted, “as they bring those skills into the workplace, we’re going to see applications becoming a lot more scalable.”
And, boy, will there be demand for individuals coming out of school with the ability to write to and work with these Web-scale architectures. Just as the workforce was woefully short of employees knowledgeable in grid computing a few years ago (and, for all I know, still is), employees with skills in what is being called “cloud computing” presently represent, in Quan’s estimation, an “almost insignificant fraction of the workforce.” Because the vast majority of future applications likely will require these skills, it is critical that graduates leave college with them. “Right now,” said Quan, “they may be specialty skills; in the future, they will just be mainstream requirement skills.”
Obviously, companies like Google and IBM stand to benefit from employees who come pre-loaded with the skills necessary to develop applications for Web 2.0 and beyond, but Quan said IBM is also hoping to use this project to learn how to apply the technology more broadly in the general computing and enterprise transaction processing spaces.
Interestingly, the official announcement (where you can read about the project’s technical details, such as how the students will be leveraging Hadoop, the open source implementation of Google’s MapReduce and Google File System) never mentions cloud computing, but that hasn’t stopped the term from appearing in the headlines of dozens of blog entries and IT articles. Quan describes cloud computing as being a complement to grid computing, focusing more on scalable, on-demand, Web 2.0-type applications, although the architecture certainly could handle a lot of the batch workloads traditionally associated with grid computing. Of course, our readers probably don’t need too much of an introduction to the concept …
Out in the greater world, you might have heard that Al Gore and the Intergovernmental Panel on Climate Change (IPCC) were awarded the Nobel Peace Prize last week. However, what you might not know is that grid computing, specifically the Earth System Grid (ESG) project, played a role in the work done by the IPCC. The ESG portal is used by scientists in 13 countries to publish and share climate simulation models and results. We’ll look to have more on the Nobel-ESG connection in the weeks to come.
As for this week’s issue, the following items should be of interest: “Gartner Identifies the Top 10 Strategic Technologies for 2008”; “From Aliens to Accelerators: @home Project Comes to the UK”; “INgrooves Adds Grid Capabilities to ONE Digital Platform”; “Nationwide 100 Gbps Internet2 Network Complete”; “Solix EDMS 4.0 Receives Validation on IBM Grid Platform”; “Univa UD’s Tuecke, Venkat to Lead Session at OGF21”; and “Oracle Proposes to Buy BEA Systems.”
-----
Comments about GRIDtoday are welcomed and encouraged. Write to me, Derrick Harris, at [email protected].