Why the Cloud is Ideal for HPC
Cloud computing has emerged as a model to address a broad range of computing needs and promises to solve all but world peace. The idea of utility or on-demand computing is hardly new but the business models and technology have matured sufficiently to propel the concept firmly back into the limelight. High Performance Computing (HPC) is where most progressive businesses should be focusing their Big Data and Big Compute efforts.
Unfortunately, some companies pigeonhole HPC as scientists geeking out with massive supercomputers. However, a far more significant market with growing computational needs is evolving outside of traditional HPC. Big Data and Big Compute is an excellent example, where the opportunity for deep analytical insight is an extremely attractive proposition to a wide range of vertical markets. This new segment of the market is often referred to as the “Missing Middle,” and disruptive technology solutions such as those offered by GreenButton put supercomputing power within progressive companies’ reach.
With effective job management as demonstrated below, thousands of HPC activities can be managed by companies such as GreenButton through innovative technology and simple-to-use web interfaces.
When we talk about HPC in the cloud, it’s important to remember that cloud computing is ultimately a business model, not a technology. However, there is a common set of technical capabilities (e.g. virtualization) that are realized in the cloud, and these, as well as the business model, provide certain benefits and challenges to HPC applications.
CFOs, Pay Attention to This!
Cost: For spiky workloads, the cost savings in the cloud can be significant. What’s more, the cost is an operational expense (OpEx) rather than a capital expense (CapEx), which is often more palatable for businesses because costs can be attributed to a particular project.
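To make the cost argument concrete, here is a back-of-the-envelope sketch comparing an owned cluster sized for the peak with on-demand capacity paid by the core-hour. Every figure in it (prices, core counts, hours) is a made-up assumption for illustration, not a vendor quote, and the model ignores things like data transfer and storage charges.

```python
HOURS_PER_YEAR = 8760


def owned_cost_per_year(peak_cores, capex_per_core, lifetime_years, opex_per_core_year):
    """Yearly cost of owning enough hardware to cover the peak demand."""
    return peak_cores * (capex_per_core / lifetime_years + opex_per_core_year)


def cloud_cost_per_year(core_hours_used, price_per_core_hour):
    """Yearly cost when paying only for the core-hours actually consumed."""
    return core_hours_used * price_per_core_hour


def break_even_utilization(capex_per_core, lifetime_years, opex_per_core_year,
                           price_per_core_hour):
    """Fraction of the year a core must be busy before owning becomes cheaper."""
    yearly_owned_per_core = capex_per_core / lifetime_years + opex_per_core_year
    return yearly_owned_per_core / (price_per_core_hour * HOURS_PER_YEAR)


# Hypothetical spiky workload: 1000 cores needed, but only for 500 hours a year.
peak_cores = 1000
busy_hours = 500
owned = owned_cost_per_year(peak_cores, capex_per_core=300.0,
                            lifetime_years=3, opex_per_core_year=100.0)
cloud = cloud_cost_per_year(peak_cores * busy_hours, price_per_core_hour=0.10)
threshold = break_even_utilization(300.0, 3, 100.0, 0.10)
```

With these assumed inputs the owned cluster costs several times more per year than the on-demand option, because the hardware sits idle most of the time. The `threshold` value is the utilization above which the comparison flips in favor of owning.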
Ease of use: The cloud can make dynamic provisioning of specified workloads very easy. The ability to have OS/Software configurations particular to a workload is a key advantage.
Speed of deployment: The ability to rapidly provision new environments/clusters in minutes is incredibly valuable to many businesses.
Scalability: Elastically scaling out to meet increased capacity demands is a powerful concept. The public cloud promises “infinite” scale. The reality is somewhat different: there are some real limits even in the cloud. But the computing capacity that you can get from large providers such as AWS and Azure is far greater than what most customers can provision with internal hardware. Some companies, such as GreenButton, work with a variety of cloud providers such as Windows Azure, HP Cloud Services, OpenStack, Amazon Web Services, and VMware for global access to resources.
Resiliency: The ability to snapshot workloads as they are running can allow for check-pointing of MPI workloads. Combine this with active monitoring and the ability to dynamically move a guest VM from one physical host to another, and your workloads can keep running even in the face of hardware failure.
Portability: The ability to move a workload from one cloud platform to another on the fly without any application changes presents powerful options: bursting from a private cloud out to a public cloud, High Availability where a workload is run on multiple clouds simultaneously, scaling across multiple clouds to meet extremely high resource requirements, or taking advantage of shifts in the spot pricing market.
Security: This remains a significant barrier to adoption today, but the issue is primarily one of trust and perception rather than real limitations of the cloud platforms. One could argue that in some cases your data is safer in the hands of Amazon or Microsoft than in your own data center. That said, data isn’t sufficiently secure by default, so some effort commensurate with the sensitivity and risks needs to be applied. For example:
- Encryption at rest of cloud-bound data.
- Limiting the time window that data is resident in the cloud.
- Anonymizing data. A great example is running risk models for the financial services sector where sensitive customer data can easily be stripped out prior to sending to the cloud.
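As a minimal sketch of the anonymization step, the snippet below drops sensitive fields from a record and replaces the customer identifier with a keyed pseudonym before the record leaves the building. The field names and the `anonymize` helper are hypothetical, chosen only for illustration; a real risk-modelling pipeline would have its own schema and its own review of which fields count as sensitive.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the actual schema.
SENSITIVE_FIELDS = {"name", "email", "account_number"}


def anonymize(record, secret_key, id_field="customer_id"):
    """Return a copy of `record` without sensitive fields.

    The identifier is replaced by an HMAC-SHA256 pseudonym so that results
    coming back from the cloud can still be matched to the original record
    by whoever holds `secret_key` (which should stay on premises).
    """
    cleaned = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    if id_field in cleaned:
        message = str(cleaned[id_field]).encode("utf-8")
        cleaned[id_field] = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return cleaned


record = {
    "customer_id": 1042,
    "name": "Jane Example",
    "email": "jane@example.com",
    "account_number": "00-1234",
    "exposure": 250000.0,
    "rating": "BBB",
}
safe_record = anonymize(record, secret_key=b"keep-this-key-on-premises")
```

Only the numeric inputs the model needs (`exposure`, `rating` here) and the pseudonymous identifier are sent to the cloud.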
Performance: There is no single answer to the question of performance. In general the cloud offers massive performance gains (and is therefore generally a positive), but this depends on the workload in question and presents some challenges today.
Some workloads are embarrassingly parallel, i.e. they scale in a linear fashion, and these scale extremely well to the cloud. Even many MPI workloads scale perfectly well on cloud infrastructure.
However, I/O-bound MPI processes will often run into performance challenges due to their heavy demands on network infrastructure or sensitivity to latency. Many traditional HPC applications are tuned for very low-latency InfiniBand interconnects and take advantage of RDMA technology. These applications just won’t scale on the 10 GigE networks within the cloud. This will change as cloud providers roll out InfiniBand or RDMA over Ethernet, but for the time being it remains an issue.
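A toy model helps show why latency matters so much for tightly coupled jobs. It assumes each time step has a fixed amount of compute that divides evenly across nodes, plus a fixed number of sequential message exchanges that each cost one network latency. It ignores bandwidth, overlap of communication with computation, and everything else a real MPI code does, and the latency figures are illustrative orders of magnitude rather than measurements.

```python
def step_time(compute_seconds, nodes, messages_per_step, latency_seconds):
    """Time for one step: compute split across nodes plus latency-bound messaging."""
    return compute_seconds / nodes + messages_per_step * latency_seconds


def speedup(compute_seconds, nodes, messages_per_step, latency_seconds):
    """Speedup relative to a single node that needs no network messages."""
    return compute_seconds / step_time(
        compute_seconds, nodes, messages_per_step, latency_seconds
    )


# Assumed workload: 10 ms of compute per step, 100 dependent messages per step.
COMPUTE = 0.01
MESSAGES = 100
LOW_LATENCY = 2e-6    # assumption: a few microseconds, low-latency interconnect
HIGH_LATENCY = 50e-6  # assumption: tens of microseconds, commodity Ethernet

low = speedup(COMPUTE, 64, MESSAGES, LOW_LATENCY)
high = speedup(COMPUTE, 64, MESSAGES, HIGH_LATENCY)
```

Under these assumptions the same 64-node job gets a useful speedup on the low-latency network and barely any on the higher-latency one, because the messaging term dominates once the per-node compute becomes small. An embarrassingly parallel job corresponds to `messages_per_step=0`, where the speedup equals the node count.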
Other challenges arise because certain cloud platforms intentionally distribute your deployed instances across the data center to increase availability. This can negatively affect performance through increased latency. But this is increasingly being rectified with increased control over physical placement of VMs, e.g. AWS Placement Groups.
I’m not going to dwell on the overhead of virtualization as there is a lot of material on the web covering this topic. I will say that modern virtualization technologies have such a small overhead on CPU performance today that it is effectively negligible. The I/O hit in some cases can be more noticeable but this depends on the characteristics of the workload. Josh Simons of VMware has posted extensively about this so check out his posts at http://cto.vmware.com/author/joshsimons/
Management: One of the challenges when spreading workloads across more than one platform is management of the workloads and resources being utilized. Being able to consolidate management within a single tool becomes critical for effective use of the cloud.
Data: Moving large datasets to the cloud still presents some challenges. In the Oil & Gas sector we physically ship 50TB+ to AWS where it undergoes weeks/months of processing, and the entire workflow lives in the cloud using visualization technology. RenderMan workloads also present challenges with large datasets (up to 1GB per frame). There are also technologies such as Aspera or GreenButton’s own CloudSync which optimize throughput over the internet.
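A quick calculation shows why physically shipping disks can still make sense for datasets of this size. The sketch below estimates transfer time from dataset size and sustained link speed; the link speeds are assumptions for illustration, and `efficiency` is a hypothetical knob for protocol overhead and contention.

```python
SECONDS_PER_DAY = 86400


def transfer_days(dataset_terabytes, link_megabits_per_second, efficiency=1.0):
    """Days needed to move a dataset over a link at a sustained rate.

    Uses decimal units: 1 TB = 1e12 bytes, 1 Mbit/s = 1e6 bits per second.
    """
    bits = dataset_terabytes * 1e12 * 8
    seconds = bits / (link_megabits_per_second * 1e6 * efficiency)
    return seconds / SECONDS_PER_DAY


# 50 TB over an assumed dedicated 1 Gbit/s link, and over 100 Mbit/s.
days_at_1g = transfer_days(50, 1000)
days_at_100m = transfer_days(50, 100)
```

Even with a fully saturated 1 Gbit/s link the upload takes several days, and at 100 Mbit/s it takes well over a month, before accounting for any real-world inefficiency.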
Managing Costs: There is obviously some level of fear when moving from a known and understood capital expenditure model to one of pay-by-the-drink where costs could spiral out of control. Trust me, this keeps many CFOs awake at night in a cold sweat. At GreenButton, we’ve addressed this by predicting job execution time and committing to users on runtime and cost. We also support cost monitoring and chargebacks down to the departmental or user level so the CFO never has to get any nasty surprises!
Cloud Lock-in: Different cloud vendors have different APIs and deployment mechanisms, so you may be concerned about being locked into a particular cloud and being unable to take advantage of improved pricing or services becoming available in other clouds. I’ve written about how to avoid cloud lock-in before so I won’t repeat it here!
One advantage of moving HPC workloads to on-demand virtualized infrastructure is that Enterprise customers can take advantage of internal hardware investment in the form of a private cloud. The private cloud obviously solves some of the issues around security and data transfers, at the cost of limited capacity. But throw in the ability to seamlessly burst to nominated public clouds and you have something pretty compelling indeed. Below is an example of how this can be implemented effectively.
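The bursting idea can be sketched as a simple placement rule: run a job on the private cloud while it has free cores, and send anything that does not fit to a public cloud. This is a deliberately naive first-fit policy in submission order, written only to illustrate the concept. It is not GreenButton’s scheduler, and the job names and core counts are invented.

```python
def place_jobs(jobs, private_free_cores):
    """Assign each (name, cores) job to 'private' if it fits, else 'public'.

    Jobs are considered in submission order; private capacity is consumed
    as jobs are placed and is never released in this simplified sketch.
    """
    placement = {}
    remaining = private_free_cores
    for name, cores in jobs:
        if cores <= remaining:
            placement[name] = "private"
            remaining -= cores
        else:
            placement[name] = "public"  # burst: no room left on premises
    return placement


jobs = [("risk-model", 64), ("render-batch", 128), ("nightly-report", 32)]
placement = place_jobs(jobs, private_free_cores=100)
```

Here the large `render-batch` job bursts to the public cloud while the two smaller jobs stay on internal hardware. A production policy would also weigh data locality, security constraints, and current public cloud pricing.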
Not only is the cloud an ideal platform for many HPC (and non-“HPC”) workloads today, but current limitations are constantly being whittled away by the platform providers themselves or by software vendors such as GreenButton. There is a common perception that HPC is so complex and expensive that ordinary businesses are not able to tap into the massive benefits and business value that can be obtained. With the advent of the cloud, HPC is accessible and affordable to the mass market for any type of application. Do your research to find the solutions that work best for you!
Dave Fellows, CTO, GreenButton™
Dave Fellows is the Chief Technology Officer of GreenButton™ Limited. Dave has extensive experience designing massively scalable PaaS applications in a variety of technology industries. He has a passion for the Cloud and High Performance Computing (HPC) and creating innovative technologies to bring unique and compelling solutions to GreenButton’s global customers.