On Wednesday, Platform Computing released Platform ISF Adaptive Cluster, a private cloud management product that can dynamically reprovision the operating environment of HPC clusters. The company also announced a new “cloud bursting” feature for ISF that allows applications running on local machines to transparently grab CPU cycles on external clouds. The new offerings are designed to maximize high performance computing resources via the cloud paradigm, and signal Platform’s further embrace of the cloud management business.
Martin Harris, Platform’s director of product management, says the goal of ISF Adaptive Cluster is to pair workload awareness (via Platform LSF, Symphony, or a third-party workload manager) with available cluster resources, while ensuring that service level agreements are met. According to Harris, many Platform customers today end up deploying multiple HPC clusters, one for each application, in order to guarantee a 100 percent service level. The problem with that arrangement, he says, is that cluster utilization tends to top out at about 40 to 50 percent. By dynamically creating virtual clusters for different application environments, Platform believes customers will be able to consolidate server infrastructure and drive utilization up toward 80 to 90 percent. “The whole idea of what we’re providing is a private cloud for HPC,” says Harris.
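To see why consolidation can lift utilization, consider a rough back-of-the-envelope sketch. The demand figures below are invented purely for illustration and are not Platform or customer data; the point is only that clusters sized for each application's own peak sit idle whenever that application is quiet, while a shared pool needs to cover only the combined peak.

```python
def utilization(demand_series, capacity):
    """Average fraction of capacity in use over the whole series."""
    return sum(demand_series) / (len(demand_series) * capacity)

# Hypothetical node demand for three applications over four time slots.
# These numbers are made up for illustration only.
demands = {
    "app_a": [100, 20, 20, 40],
    "app_b": [20, 100, 40, 20],
    "app_c": [40, 20, 100, 20],
}

# Combined demand in each time slot across all applications.
total_demand = [sum(slot) for slot in zip(*demands.values())]

# Dedicated clusters: each one is sized to its own application's peak.
dedicated_capacity = sum(max(series) for series in demands.values())
dedicated_util = utilization(total_demand, dedicated_capacity)

# Shared pool: sized to the peak of the combined demand.
shared_capacity = max(total_demand)
shared_util = utilization(total_demand, shared_capacity)

print(f"dedicated: {dedicated_capacity} nodes, {dedicated_util:.0%} utilized")
print(f"shared:    {shared_capacity} nodes, {shared_util:.0%} utilized")
```

With these made-up inputs, the dedicated clusters need 300 nodes and average 45 percent utilization, while the shared pool needs 160 nodes and averages about 84 percent. Real workloads whose peaks coincide would see a smaller gain.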
The way this works is that ISF Adaptive Cluster provisions an application’s software environment on some number of server nodes at runtime based on workload demands. From the users’ point of view, this cloud management scheme allows them to “play Tetris with their workloads across the available resource pools,” says Harris, which certainly sounds more entertaining than “dynamic provisioning.”
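The “Tetris” idea can be sketched as a simple placement pass. This is not how ISF Adaptive Cluster is implemented; the function name, the data shapes and the two-pass policy are all assumptions made for illustration. The sketch keeps idle nodes that already carry an environment still in demand, then reprovisions leftover nodes for whatever demand remains.

```python
def plan_reprovisioning(idle_nodes, pending):
    """Hypothetical planner, for illustration only.

    idle_nodes: dict of node name -> environment currently installed.
    pending:    dict of environment -> number of nodes requested.
    Returns a dict of node name -> environment the node should run.
    """
    plan = {}
    remaining = dict(pending)
    spare = []

    # First pass: reuse nodes whose current environment is still needed,
    # since these require no reprovisioning at all.
    for node, env in idle_nodes.items():
        if remaining.get(env, 0) > 0:
            plan[node] = env
            remaining[env] -= 1
        else:
            spare.append(node)

    # Second pass: reprovision spare nodes to cover unmet demand.
    for env, count in remaining.items():
        while count > 0 and spare:
            plan[spare.pop(0)] = env
            count -= 1

    return plan


idle_nodes = {"n1": "linux", "n2": "linux", "n3": "windows", "n4": "linux"}
pending = {"windows": 2, "linux": 1}
plan = plan_reprovisioning(idle_nodes, pending)
print(plan)
```

In this example, n1 stays on Linux, n3 stays on Windows, n2 is switched to Windows to cover the second Windows request, and n4 is left idle. A production scheduler would also weigh priorities, reprovisioning time and service level targets.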
If both Linux and Windows applications are in the mix, a dual-boot server setup is commonly employed to provision the OS, along with the rest of the application stack. Platform has entered into a collaboration with Microsoft to integrate this OS duality into the new ISF offering. The general idea here is to give applications the same environment they would get if they ran on statically-configured clusters, so as not to detract from performance — obviously a big consideration for HPC codes.
Alternatively, the application and operating environment can be wrapped in a virtualized container, a la VMware or some other virtual machine maker, and be executed on whatever native OS is in place. Performance is likely to suffer in this model, which is why running HPC applications in VMs is still relatively rare. But when you need the extra capacity or don’t have the dual-boot setup, even slow cycles are preferable to none at all.
One case where VMs are useful for performance codes is harvesting cycles from local non-HPC servers. Harris says a Platform customer in Europe has encapsulated Platform LSF and Symphony workloads into VMware and Citrix VMs in order to grab unused cycles from their Virtual Desktop Infrastructure (VDI) server farms at night, when presumably most desktop users are asleep. But the most common VM case for HPC is when users want to “burst” into a public cloud, for example, to run some slice of an application as a virtualized image on Amazon EC2.
The cloud bursting feature just added to the ISF core technology will make this process more transparent and open the door to hybrid clouds for Platform customers. The impetus behind this feature is to help customers whose budgets prevent them from adding in-house compute capacity, allowing them to tap into third-party clouds for peak loads. Harris says that over the past year, Platform has seen demand for renting cycles on external clouds rise, and is observing this trend across all HPC verticals. One of its EDA customers is cloud bursting to Amazon EC2 to get extra capacity for its electronics design work.
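The basic bursting decision can be sketched in a few lines. This is a simplified illustration, not Platform's policy engine: the function name, the slot counts and the cap on rented capacity are all hypothetical. Local capacity is used first, overflow goes to the external cloud up to a budgeted limit, and anything beyond that waits in the queue.

```python
def split_for_burst(requested_slots, local_free_slots, max_cloud_slots):
    """Hypothetical bursting policy, for illustration only.

    Fill free local slots first, send overflow to an external cloud up to
    a budgeted cap, and leave the rest queued.
    """
    local = min(requested_slots, local_free_slots)
    overflow = requested_slots - local
    cloud = min(overflow, max_cloud_slots)
    queued = overflow - cloud
    return {"local": local, "cloud": cloud, "queued": queued}


# A peak request of 500 slots against 320 free local slots and a cap of
# 100 rented cloud slots (all figures invented for this example).
placement = split_for_burst(500, 320, 100)
print(placement)
```

Here 320 slots run locally, 100 burst to the external cloud and 80 remain queued. When the request fits within local capacity, nothing is sent to the cloud.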
Although EC2 currently seems to be the platform of choice for cloud bursting, Platform is also engaged with IBM, HP and other hosting providers. And based on Platform’s newfound partnership with Microsoft, an Azure integration may not be too far off.
Platform is not alone in extending its core HPC management technologies into the cloud computing realm. Univa UD has embraced the cloud delivery model as well, and has evolved its product line accordingly. Its UniCloud offering was one of the first commercial products to support private and hybrid cloud models for HPC. Adaptive Computing offers a similar capability with its Moab Adaptive HPC Suite, but for the time being has refrained from branding it as a cloud solution. Other vendors with cluster and grid management offerings are likely to join the fray as HPC cloud adoption gains momentum.