This week I have been reaching out to some members of the HPC and cloud community to get their take on what is in store for the arena in the coming years. I have been particularly interested in hearing about what challenges they perceive now and what will happen in the future to help us overcome those issues.
Today’s entry contains some feedback from Ignacio M. Llorente, who holds a Ph.D. in Computer Science (UCM) and an Executive MBA (IE Business School). He is a Full Professor in Computer Architecture and Technology and the Head of the Distributed Systems Architecture Research Group at Complutense University of Madrid.
Dr. Llorente is no stranger to HPC and cloud: he has 18 years of experience in research and development of advanced distributed computing and virtualization technologies, architecture of large-scale distributed infrastructures and resource provisioning platforms, and management of international projects and initiatives on Grid and Cloud Computing. His current research interests are mainly in the area of Infrastructure-as-a-Service (IaaS) Cloud Computing. He co-leads the research and development of the OpenNebula Toolkit for Cloud Computing and coordinates the Activity on Management of Virtual Execution Environments in the RESERVOIR Project, the main EU-funded research initiative in virtualized infrastructures and cloud computing. He founded and co-chaired the Open Grid Forum Working Group on Open Cloud Computing Interface, and participates in the European Cloud Computing Group of Experts.
Below are some of Dr. Llorente’s thoughts on the present and future of the alliance between cloud and HPC, given in response to a series of open-ended questions on the topic.
What impact will cloud have on HPC over the next 5 years?
Cloud technologies and services, especially virtualization, will optimize and simplify the operation of High Performance Computing infrastructures. Using private Cloud technologies for resource provisioning would enhance failover and redundancy solutions, and permit machine migration for flexible load balancing and energy efficiency. Virtualization of the site infrastructure would also allow the dynamic provisioning of worker nodes to address the demands of different user communities, and the execution of virtualized HPC clusters that are elastic in both the number of nodes and the capacity of each node. Additionally, Hybrid Cloud technologies would support “elastic” sites that can expand the computing resources available in the local Cloud by drawing on remote Cloud providers to meet peak demands.
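To make the "elastic site" idea a little more concrete, here is a minimal sketch of the placement decision it implies: fill the local Cloud first, then send any overflow to a remote provider up to some limit. Everything in it (the `Placement` type, the `place_workers` function and its parameters) is a hypothetical illustration, not part of OpenNebula or any other toolkit, and a real scheduler would also weigh cost, data locality and interconnect requirements.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Hypothetical result type: how many worker nodes go where."""
    local_nodes: int
    remote_nodes: int


def place_workers(requested: int, local_free: int, remote_limit: int) -> Placement:
    """Fill local capacity first, then burst the overflow to a remote provider.

    Assumption: capacity is counted in whole, interchangeable worker nodes.
    Requests beyond local_free + remote_limit are simply not placed.
    """
    if requested < 0 or local_free < 0 or remote_limit < 0:
        raise ValueError("all arguments must be non-negative")
    local = min(requested, local_free)
    remote = min(requested - local, remote_limit)
    return Placement(local_nodes=local, remote_nodes=remote)


if __name__ == "__main__":
    # Normal load fits locally; a peak spills over to the remote provider.
    print(place_workers(requested=10, local_free=16, remote_limit=8))
    print(place_workers(requested=20, local_free=16, remote_limit=8))
```

The point of the sketch is only that the remote Cloud is used for the part of a peak the local site cannot absorb, which is what keeps the site "elastic" without over-provisioning local hardware.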
What is the single greatest barrier to cloud adoption in HPC?
Virtualization overhead. Although virtualization provides several benefits as a vehicle for resource provisioning, it also introduces additional challenges related to management and processing overhead. Memory- and CPU-bound applications running virtualized in HPC environments perform at near-native speed. However, I/O-intensive or latency-sensitive applications still tend to have some overhead compared to running on raw physical hardware. I expect an intensive research effort in the coming years to overcome the current performance limitations of virtualization platforms. Hypervisor support for specialized communication transports such as InfiniBand, Quadrics and Myrinet is another related issue to be addressed.
This barrier has already been broken by research centers such as CERN, which has used OpenNebula to deploy as many as 7,500 VMs on thousands of cores; see http://lists.opennebula.org/pipermail/users-opennebula.org/2010-April/001886.html
Is this hype or a genuine paradigm shift for HPC?
It is a paradigm shift for HPC. Cloud computing will not only optimize the operation of HPC platforms but also enable the on-demand provision of resources to define virtualized HPC platforms. The cloud computing model would also allow existing computing centres to offer infrastructure for computing on demand, creating HPC Clouds where high performance infrastructure as a service could be offered to external users interested in dynamically scaling their local infrastructure or temporarily having a platform for testing, training or development. Existing commercial clouds offer access to loosely coupled clusters without data locality management, and so do not provide an efficient framework for tightly coupled HPC applications. These new HPC Clouds will offer platforms with HPC devices whose configurations, capabilities and capacities could be customized by users.
See, for example, the new HPC Cloud operated by SARA (https://grid.sara.nl/wiki/index.php/Using_the_HPC_Cloud/betaevaluation), which runs the widely used OpenNebula open-source toolkit for cloud computing to orchestrate the complexity of high performance large-scale distributed infrastructures.
What are HPC customers most worried about in terms of cloud adoption and what is most appealing for cloud consumers in the realm of HPC?
In my view, the adoption of cloud computing and virtualization to optimize HPC infrastructures or to create HPC Clouds should be fully transparent to end users of the HPC service.