HPC, Clouds & Big Data Converge at ISC Cloud 2012 – Part Two
The HPC cloud space is still a work in progress, but judging from a set of European conferences that took place this September, there is also actual progress to speak of. With GlobusEUROPE and EGI’s Technical Forum in Prague, Sept. 17-21, and ISC Cloud taking place in Mannheim, Germany, Sept. 24-25, there was an abundance of topics to cover. This article continues our coverage of the power-packed ISC Cloud event (read Part One here).
ISC Cloud, now in its third year, attracted nearly 150 participants with about 40 percent from academia, research or government spheres and the remaining 60 percent from industry. The various HPC cloud stakeholders were well-represented, which led to informed and engaged discussions both inside and outside the main presentation hall.
On the morning of day two, participants returned to the Dorint Congress Hotel in Mannheim for a series of talks on research clouds. Several more interesting sessions were also on the day’s agenda, including progress reports from four major HPC ISVs, a vendor showdown and three audience-selected Birds of a Feather sessions.
Moderated by Josh Simons of VMware, the panel on research clouds addressed whether “cloud computing is suitable for scientific computations and big data processing.” The academic and research community is seriously looking to cloud as a way to facilitate science, and there are numerous cloud testbeds currently underway. Helix Nebula is probably the most visible, but there are many smaller efforts across the US, Europe and beyond, working toward identifying use cases, gathering requirements, recording outcomes and establishing metrics.
In a talk called “Moving Beyond IT Outsourcing – Can Clouds Transform Science?” Manish Parashar of Rutgers University explored how the practice of science is being revolutionized by big compute and big data. While there are the obvious cloud candidates (loosely-coupled, nicely-parallel workloads with modest I/O requirements), Parashar is seeking to extend the boundary of cloud-suitability by using new application formulations and delivery models as well as hybrid usage models that combine HPC cloud and grid resources. He cited the CometCloud autonomic cloud-computing project as an example of such an integrated hybrid cloud infrastructure that is supporting science in an era of data-explosion. Parashar closed by calling on the community to combine the key strengths of HPC, grid and cloud in order to provide all these complementary benefits to users.
Next at the podium, David Wallom of the Oxford eResearch Centre delivered his talk on “Supporting research with flexible computational resources.” Wallom is part of the National Grid Service Agile Deployments Environments project, which is identifying use cases and gathering requirements toward the creation of a set of cloud services that are EC2 compatible and open source. Wallom echoed Parashar’s sentiments in saying “Cloud is part of the ecosystem, not the ecosystem,” but also concluded that “utilization of virtual infrastructure is the only scalable method to support [the] large number of disparate user communities.”
In his talk “Clouds and Security at Rutherford Appleton Laboratory,” Jens Jensen of Rutherford Appleton Laboratory asked, “Can we trust the cloud with our data?” and noted that cloud’s lack of single sign-on is a “big deal.” He pointed to the work of StratusLab and the ability to build a marketplace for virtual machine images as positive steps. In his conclusion, Jensen stressed the importance of hybrid cloud and the need for more interoperation between clouds.
Frédéric Desprez, chief senior research scientist at INRIA, rounded out the research cloud segment by discussing “DIET, a scalable platform for clusters, grids and clouds.” Desprez placed the current technology in its historical context, noting that Internet computing and storage have evolved from isolated nodes into what are now called clouds. He recalled papers that were written about distributed computing going back to the 60s and 70s and brought up the point that there have been many incarnations of grids and clouds.
Next on the agenda was the ISV panel, “Engineering Clouds – Commercial Software in the Cloud.” In half-hour blocks, representatives from CD-adapco, ANSYS, SIMULIA and ESI gave product overviews and were frank in discussing the balancing act that is cloud licensing: meeting user needs without cannibalizing main revenue streams. While the major software vendors are sometimes blamed for being too slow to embrace the cloud and for contributing to the “licensing roadblock,” they are operating under the usual business mandate to drive profit. Alternative (cloud-based) licensing paradigms are potentially disruptive to the business model, but they also have the potential to generate new revenue. Slow or not, all of these vendors have some kind of cloud licensing model in play – yet another data point for HPC cloud’s growing relevance.
There’s still somewhat of a chicken-and-egg problem when it comes to enabling software in the cloud through new licensing models, with the ISVs pointing to lack of user interest in cloud and the users pointing to lack of ISV support. That’s why events of this nature are so important to sparking discussion that in turn enables forward movement.
Taking this idea a step further, bringing key participants to the table to work out requirements and negotiate obstacles is the key mission behind the Uber Cloud Experiment, run by Wolfgang Gentzsch and Burak Yenier. The project brings together all the necessary stakeholders in order to deliver HPC resources to the underserved small-to-medium enterprise community, the so-called “missing middle.” About three months into the experiment the founders released a half-time report, which was covered in HPCwire last week. Current participants have voted overwhelmingly to extend the experiment and it’s been announced that Part Two will run from mid-November to mid-February.
Also included in the vendor panel (although removed thematically from the software licensing subject) was VMware’s resident HPC expert Josh Simons, delivering a talk on “HPC Performance in the Cloud: Status and Future Prospects.” Simons proposed bringing the benefits of cloud computing (in his VMware worldview, a virtualized cloud) to a wider range of HPC applications, arguing that the common picture of virtualization as a substantial layer between the application and the infrastructure is incorrect. Furthermore, when it comes to performance slowdowns caused by virtualization, the commonly held assumptions and working figures are outdated, Simons noted.
This year’s ISC Cloud agenda was driven by a topic trifecta: the confluence of HPC, cloud and big data. The vendor showdown, planned and moderated by Intersect360 Research analyst Addison Snell, provided a forum to explore these themes and to showcase vendors in a non-traditional, hopefully more interesting, way. Each representative was allowed to introduce their company using only two PowerPoint slides, after which Snell asked a series of questions, with each of the three judges assigning a point to the best response. There were seven questions in all and three judges, leaving a total of 21 points up for grabs.
The idea that digital technology can confer advantages such as innovation and economic competitiveness was a main thrust of both this panel and the entire conference. The democratizing effect promises increased computational power to groups that have traditionally been underserved, and it also lowers the barrier to entry, bringing digital tools to a brand-new community of users. Viewed through the lens of this kind of paradigm-changing potential, the HPC/cloud/big data trio could be the rising tide that lifts all boats.
Asked whether HPC was becoming easier, Bright Computing’s Matthijs van Leeuwen made the point that it had better be, or we’re not doing our job very well. SGI’s Tony DeVarco added that making products easier to use, for example by creating a portal with a drop-down menu, can be the key to increased adoption. Perhaps the most contentious comment of the show came from Mellanox rep Eli Karpilovski, who stated that “big data is not a big deal in HPC because it’s been known and used for many years.” Snell disagreed and pegged the comment as dismissive of the many companies coming to big data for the first time. “They have applications that are different from the HPC applications that have been solved before,” he added.
All in all, it was a very close race between Team Donner (composed of HP’s Philippe Trautmann, IBM’s Chris Porter, Adaptec’s Alfred Berger, Samsung’s Peyman Blumstengel, and SGI’s Tony DeVarco) and Team Blitzen (with Intel’s Ullrich Becker-Lemgau, Bright Computing’s Matthijs van Leeuwen, Mellanox’s Eli Karpilovski, T-Systems’ Raik Dittrich, and Bull’s Olivier David). The contest was judged by Rolf Sperber of Alcatel-Lucent, Harald Kornmayer of DHBW Mannheim and yours truly representing HPC in the Cloud. Before the final question was asked, the score was Donner 10, Blitzen 8. Blitzen needed all three of the final points to take the win, but Team Donner’s response to a question about big data was more popular with the judges, earning Donner the win.
The vendor showdown signaled the end of the main conference session, but there was still one more item on the agenda, the BoFs. The session topics were selected on-the-fly this year, allowing participants to suggest and vote on the topics that mattered most to them. While security came up as a key concern during the two-day conference, the single biggest recurring theme was data movement – a subject which applies as well to big data as it does to cloud. So it was little surprise that “Data Transfer in/out of Clouds” was the most popular BoF topic, followed closely by “HPC Cloud Reference Architectures” and “Applications and software in the cloud.” An outline of each group’s findings is available here.
Sometimes it’s difficult to tell if the HPC cloud space is making progress. We’re well versed in cloud’s sweet spot for embarrassingly parallel workloads and its limitations for latency- and throughput-sensitive applications, but as we learned at the event, people are actively pushing to expand these boundaries. Community gatherings like ISC Cloud provide a point of perspective and an opportunity to gauge forward momentum. Comparing this year to last, it seems we may finally be done with the lengthy “definition” phase – making way for the testing and adoption stretch. The community has come to a consensus that “cloud” (in a broad sense) has value for HPC and is putting more of its resources into understanding and mining that value.