Cloudy Costs for Amazon’s New HPC Offering
Amazon has been open about its HPC instance pricing model, but to get more in-depth answers, I put a few questions to Deepak Singh, Business Development Manager at Amazon Web Services, about how the pricing works in a practical context. His detailed answers might help shed some light on that age-old question: buy a cluster, rent one, or automatically spin up nodes as needed in a pay-as-you-go fashion?
Full story at HPC in the Cloud
Unless you’ve been living in a cave (and I am well aware that some of you find that environment most suitable for your personal productivity), you’ve probably already been pinged with the news that Amazon has proffered an EC2 instance specifically for the needs of HPC users. If you need a moment to blink, dazed by the light of day, refresh here via an article on the announcement.
Now that the initial surprise this news brought at the beginning of the week has worn off, we can start to see what it means in a broader context. Accordingly, serious questions are mounting about the most critical issue outside of performance: pricing.
There are no Cluster Compute Instance use cases yet from which to examine how the model plays out across different applications or usage scenarios (cloud bursting versus continuous utilization, for example). In their absence, there is rampant speculation built on hypothetical scenarios that yield some strange, scary arithmetic, scary because the costs still look pretty high for sustained use.
While the numbers involved in purchasing a cluster outright are enough to set any small startup research shop shaking, the solution so far has been either to send the embarrassingly parallel work to EC2, which generally accommodates such workloads, or to turn to any number of firms offering clusters on a rental basis. The benefit of EC2 has been cost, for bursting needs in particular. The benefit of the latter (rentable HPC, which is not to be confused with “cloud,” by the way, no matter what the marketing managers tell you) has been the support that comes with having specialized HPC clusters run by specialists.
So even though it’s still easy to contend that in the long run, this announcement was a kick in the behind for the ecosystem of HPC cloud vendors and users alike, it still comes down to a cost question — specifically, one that revolves around buying, renting, or clouding. And there are certainly differences between those last two.
Amazon on the Dollars and Sense Behind Cluster Compute Instances
Amazon was open about the pricing to begin with, but to get more in-depth answers, I put a few questions to Deepak Singh, Business Development Manager at Amazon Web Services, about how the pricing works in a practical context. His detailed answers might help shed some light on that age-old question: buy a cluster, rent one, or automatically spin up nodes as needed in a pay-as-you-go fashion? They did shed light on the question, but it is still hazy how pricing will play out for individual users with wildly varying needs.
The Cluster Compute Instance (CCI) model itself is not difficult to understand, especially if one is already familiar with EC2 instance cost evaluation. It’s almost surprisingly uncomplicated on the surface, which alone prompted some speculation. Like any other EC2 instance, a CCI runs on demand and is billed by the hour, at $1.60 per instance hour. Fair enough? I asked Deepak Singh to put this in practical context.
“Let’s say we have a 32-node cluster which is running a simulation task for 20 hours a week. Using the on-demand pricing, the customer would pay $1,024 per run (not counting any additional storage costs). If the customer needs to do this run twice a month, they would end up paying $2,048 per month for their EC2 usage (or < $25K/yr).”
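Singh's on-demand arithmetic is easy to check. The following is only a minimal sketch in Python: the function and variable names are made up for illustration, and it assumes (as the quote does) that storage and data transfer are excluded and that every node is billed for every hour of the run.

```python
from decimal import Decimal

# On-demand rate quoted in the article, in USD per instance-hour.
ON_DEMAND_RATE = Decimal("1.60")


def on_demand_cost(nodes: int, hours: int, rate: Decimal = ON_DEMAND_RATE) -> Decimal:
    """Cost of one run: each node is billed for each hour it is up."""
    return nodes * hours * rate


# Singh's scenario: a 32-node cluster running for 20 hours, twice a month.
per_run = on_demand_cost(nodes=32, hours=20)
per_month = per_run * 2
per_year = per_month * 12

print(f"per run:   ${per_run}")
print(f"per month: ${per_month}")
print(f"per year:  ${per_year}")
```

The yearly figure comes to $24,576, which is where the "< $25K/yr" in the quote comes from. Changing `nodes`, `hours`, or the number of runs per month shows how quickly the bill scales once usage stops being occasional.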
Singh added that this is an attractive model for the many HPC use cases driven by bursting rather than sustained use. That is especially true when an application needs to run once in a single strong flurry, where the alternative is the queue issue: waiting in line for access to a shared resource with no guarantee about when the job will run. It is also a viable alternative to “acquiring the hardware, which means not only the capital costs of purchasing the hardware and the facilities in which the hardware is run, but also the operational cost of running the hardware 24-7 when you might need it for a fraction of the time.”
He also noted that there are often use cases in HPC that turn this on its head a bit, which is where the “package deal” comes into play. Such use cases involve users who have highly utilized systems, either for “a single use case, or in aggregate,” and for these users, the Reserved Instance model for EC2 would be the best solution. “With Reserved Instance a customer pays a low upfront price, which (a) provides a significantly lower hourly rate, and (b) guarantees availability of instances. It is important to note that purchasing an RI does not oblige a customer to pay anything more. It simply assures them they will have capacity and gives them access to a lower usage rate when they do run an instance.”
Singh breaks this model down with the following example:
With Amazon EC2 Cluster Compute Instances, to reserve the same 32-node cluster described above, the customer would buy either thirty-two 1-year reserved instances (at $4,290 each) or thirty-two 3-year reserved instances (at $6,590 each) depending on how long they anticipated needing access to the cluster. In the case of a 3-year reserved instance, this would cost the customer $210,880 for all 32 Reserved Instances and entitle them to run up to 32 nodes at $0.56/hour compared to the On-Demand rate of $1.60/hour. If the customer used the cluster 24×7 for the whole 3-year term, their total cost would be $210,880 + (32 x 365 x 3 x 24 x $0.56) = $681,817.60 over the three years. This equates to an effective hourly rate of $0.81/hour or about $0.10/core/hour. Of course, if the user ended up not needing their cluster all of the time, their total cost would be lower but their effective hourly rate would be higher. For example, if they bought the same 3-year reserved instances and only used them 70 percent of the time, their total cost would be $210,880 + (32 x 365 x 3 x 24 x $0.56 x 0.70) = $540,536.32. This is an effective hourly rate of $0.92/hour or $0.115/core/hour.
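The Reserved Instance example can be reproduced the same way. This is a rough sketch, not Amazon's calculator: the names are hypothetical, the eight cores per instance is an assumption inferred from the article's own $0.81 to roughly $0.10 conversion, and the break-even function at the end is an extra calculation that does not appear in Singh's example.

```python
from decimal import Decimal

NODES = 32
UPFRONT_3YR = Decimal("6590")    # USD per 3-year Reserved Instance (from the article)
RESERVED_RATE = Decimal("0.56")  # USD per instance-hour once reserved
ON_DEMAND_RATE = Decimal("1.60") # USD per instance-hour on demand
TERM_HOURS = 365 * 3 * 24        # hours in the 3-year term
CORES_PER_INSTANCE = 8           # assumption: implied by $0.81/hour ~ $0.10/core/hour


def reserved_total(utilization: Decimal) -> Decimal:
    """Upfront fees for all nodes plus hourly charges for the hours actually used."""
    used_hours = NODES * TERM_HOURS * utilization
    return NODES * UPFRONT_3YR + used_hours * RESERVED_RATE


def effective_rate(utilization: Decimal) -> Decimal:
    """Total cost divided by the instance-hours actually used."""
    used_hours = NODES * TERM_HOURS * utilization
    return reserved_total(utilization) / used_hours


def break_even_hours() -> Decimal:
    """Hours per instance over the term at which reserving costs the same as on-demand.

    Not part of Singh's example; derived from the rates quoted in the article.
    """
    return UPFRONT_3YR / (ON_DEMAND_RATE - RESERVED_RATE)


for utilization in (Decimal("1"), Decimal("0.70")):
    total = reserved_total(utilization)
    rate = effective_rate(utilization)
    print(f"utilization {utilization}: total ${total:.2f}, "
          f"${rate:.2f}/hour, ${rate / CORES_PER_INSTANCE:.3f}/core/hour")

print(f"break-even: {break_even_hours():.0f} hours per instance")
```

Run as written, this matches the $681,817.60 and $540,536.32 totals above. On these rates the break-even lands at a little over 6,300 hours per instance, or roughly a quarter of the three-year term: below that level of use, on-demand pricing is the cheaper option, and above it the reservation pays for itself.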
When is CCI Better Than Buying A Cluster?
According to Deepak Singh, the AWS economics center is a good resource for understanding the comparative costs of owning versus renting a cluster. He stated that “most people forget that running a cluster involves a lot more than the cost of the servers. Networking in HPC is expensive, and power too, even more so than servers for other tasks, so we believe that both for ad hoc clusters and dedicated clusters for your organization, Cluster Compute Instances provide a significant value proposition. Other cost-bearing functions people need to understand include utilization, redundancy, supply chain, data center efficiency and personnel.”
That’s What Amazon Says, But…
The pricing issue is causing some consternation because no one has yet seen a real use case showing how, in an application-specific context, the costs play out over the course of a typical year, whether for bursting needs or for continuous use.
Some, including Miha Ahronovitz, have voiced this concern: “I don’t think from reading the announcement that many people realize the high costs involved in placing an HPC data center on AWS.” His assessment of the pricing structures makes it clear that what seems like attractive pricing might, in the long run, end up costing far more than researchers can contend with. While they avoid the up-front costs of their own cluster, the amount spent over the long term adds up significantly, even if an organization uses one of Amazon’s package deals.
Penguin and SGI charge more per core-hour but include support, which means that when the dust settles around this announcement, we will probably see ramped-up efforts to make HPC users (especially first-time ones) see that Amazon is offering what amounts to an unsupported platform. With HPC as a Service, by contrast, there is support and a dedicated engineering staff who can tune the system at will to optimize application performance.
The race to engage those first-time HPC users is now on more than ever, because Amazon is a recognizable brand and its offering looks appealing to those who aren’t already embedded in the space. Without a doubt, as Wolfgang Gentzsch noted this week in so many words, only time will tell to what degree this is substance and progress or something that is limited in its capability. While it’s a great start for HPC in the cloud as a concept becoming more of a reality, these next several months will prove interesting indeed.
The HPC as a Service folks should make sure that their pricing is communicated very clearly and, furthermore, that their support and expertise in this space are what make their offerings the most appealing, especially for those entry-level users who have a host of good choices for avoiding an up-front cluster cost but don’t know where to start.
Because when we get right down to it, it’s the economics of cloud that’s the elephant in this room. Especially for HPC folks — newcomers and (no offense intended) old-timers alike.