As high-performance computing (HPC) continues to expand into a widening array of fields, it’s important to explore all avenues for workload execution, including cloud-based platforms. For engineers looking to leverage high-performance computing, the accessibility of a cloud-based approach is a powerful draw, but there are costs that may not be readily apparent, according to a recent article from Desktop Engineering.
Engineers tend to shy away from terms like TCO (total cost of ownership) and ROI (return on investment) as they are applied to technology solutions. What most engineers want, according to the article’s author Frank J. Ohlhorst, are the resources that enable them to complete their work in an efficient and accurate manner.
So what’s a company to do? There are two main choices, which boil down to renting or buying: lease a cloud-based HPC-as-a-Service solution or purchase an in-house system. There are also variations on these themes, such as rented machines that you control yourself (à la Penguin Computing) and managed hosting of systems that you own outright (located either on- or off-premises).
Ohlhorst writes that “Making sense of the value of HPC solutions provided by cloud service providers can be a complex undertaking. After all, the billing mechanisms can be multifaceted, including charges for provisioning, CPU time, support services and so on. Yet, at first glance, the initial costs can be quite attractive.”
Establishing a solution’s total cost is essential for making meaningful comparisons. Generally speaking, the cloud is a good choice for projects that require temporary increases in computing power. For steady, predictable workloads, owning your own HPC system will be cheaper in the long run.
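The rent-versus-buy comparison can be framed as a simple break-even calculation. The sketch below is purely illustrative; every figure in it (cloud rate, purchase price, lifetime, overhead) is a hypothetical placeholder, not real vendor pricing.

```python
# Illustrative break-even sketch: at what yearly usage does owning beat renting?
# All numbers below are hypothetical assumptions for illustration only.

def annual_cloud_cost(core_hours: float, rate_per_core_hour: float) -> float:
    """Pay-as-you-go cost for the core-hours actually consumed."""
    return core_hours * rate_per_core_hour

def annual_inhouse_cost(capex: float, lifetime_years: float,
                        opex_per_year: float) -> float:
    """Amortized purchase price plus power, cooling, and admin overhead."""
    return capex / lifetime_years + opex_per_year

# Hypothetical numbers for a small in-house cluster
cloud_rate = 0.05          # $ per core-hour (assumed)
capex = 200_000.0          # purchase price (assumed)
lifetime = 4.0             # years of useful life (assumed)
opex = 30_000.0            # yearly power/cooling/staffing share (assumed)

inhouse = annual_inhouse_cost(capex, lifetime, opex)
break_even_hours = inhouse / cloud_rate   # usage level where costs are equal

print(f"In-house costs ${inhouse:,.0f}/yr; the cloud is cheaper below "
      f"{break_even_hours:,.0f} core-hours/yr")
```

Below the break-even usage level, pay-as-you-go wins; above it, the amortized in-house system wins, which is exactly the temporary-versus-steady distinction made above.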
Another approach is to institute a mixed solution that leverages both in-house and rented infrastructure. The in-house cluster or supercomputer can be the primary workhorse, while the HPC cloud takes care of additional workloads. Through careful usage monitoring, an optimum balance can be achieved. Businesses may also want to consider migrating non-HPC workloads to the cloud to reduce their overall operational expense.
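The mixed approach amounts to a simple routing policy: keep jobs on the in-house system while capacity remains, and burst the overflow to the cloud. The sketch below is a minimal, hypothetical illustration of that policy, not any real scheduler's API.

```python
# Minimal sketch of a hybrid "burst to cloud" policy (hypothetical):
# greedily fill in-house capacity, sending the remainder to the cloud.

def route_jobs(jobs_core_hours, inhouse_capacity):
    """Assign each job to the in-house cluster if it still fits; else to cloud."""
    inhouse, cloud = [], []
    used = 0.0
    for job in jobs_core_hours:
        if used + job <= inhouse_capacity:
            inhouse.append(job)
            used += job
        else:
            cloud.append(job)
    return inhouse, cloud

# Example: 100 core-hours of local capacity against a spiky batch of jobs
local, burst = route_jobs([40, 30, 50, 20], 100)
print("in-house:", local, "cloud burst:", burst)
```

In practice the monitoring data mentioned above would feed the capacity figure, so the split adapts as demand rises and falls.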
Determining value is challenging because the HPC and cloud markets are in a continual state of flux: hardware and services grow more powerful, and prices continue to drop. Ohlhorst notes that IT staffing levels, in-house expertise and administrative chores will also need to be factored into the decision.
The article does an excellent job of laying out the many variables in this decision-making process. Amid the hype are real benefits that extend the ROI of HPC operations, including scalability, elasticity, simplified self-service, and baked-in show-back and chargeback mechanisms. But there are also risks and other drawbacks, such as the potential for vendor lock-in, the expense and/or time involved in moving data in and out of the cloud, and various security, compliance and SLA constraints.
And then there is an even bigger risk that was not addressed: what if the cloud service provider goes out of business? Does this scenario pose less of a risk for answer-oriented HPC workloads than for mission-critical business applications that are intended to run indefinitely?