Comprehensively Evaluating HPC Cloud Cost Benefits

By Ian Armas Foster

July 30, 2013

HP Labs partnered with the University of Illinois at Champaign-Urbana to comprehensively evaluate the feasibility of running high performance applications in the cloud. The research set out to answer many questions, including wondering how HPC applications fare in the cloud versus supercomputers (they used the Ranger and Taub machines for those tests), which applications were best suited for cloud deployment, and what the cost benefits were for certain organizations in maintaining their high performance needs in a cloud.

Below is a grid of all the platforms they used in testing their various applications. As one can see, the Ranger and Taub systems are there along with public and private cloud instances.

It is important to note the approach the research team took with setting up their cloud systems. While they could have built a dedicated instance that would perform closer to supercomputing standards, they figured that such an instance would be unlikely in the scenario of a mid-sized enterprise or startup looking to purchase on-demand HPC resources.

With that said, they still took steps to optimize the performance. “To get maximum performance from virtual machines, we avoided any sharing of physical cores between virtual cores. In case of cloud, most common deployment of multi-tenancy is not sharing individual physical cores, but rather done at the node, or even coarser level. This is even more true with increasing number of cores per server.”

They tested those cloud systems and the control supercomputers on a variety of applications, including Jacobi2D, used for scientific simulation and image processing, NAMD, a molecular dynamics application, ChaNGa, used for cosmology simulation, and the NQueens problem among others.

The graphs above show how well the various machines’ performance scaled relative to the various applications. The applications that reportedly found trouble scaling were those that were communication intensive. “IS is a communication intensive benchmark and involves data reshuffling and permutation operations for sorting. Sweep3Dalso exhibits poor weak scaling after 4–8 cores on cloud. Other communication intensive applications such as LU, NAMD and ChaNGa also stop scaling on private cloud around 32 cores,” the report noted.

In all instances except for the public cloud, the EP, Jacobi2D and NQueens applications scaled up to 256 cores, while the public cloud imposed performance penalties once more than four cores were used.

Once the performance drop off was established for clouds, a fact that was altogether not surprising, the next task was to determine exactly what kind of penalty was suffered, such that they could relate that to the cost of apportioning those systems in the process of determining if cloud is indeed a cost effective means of securing HPC resources.

To quantify the amount of variability on cloud and compare it with a supercomputer, we calculated the coefficient of variation (standard deviation/mean) for execution time of ChaNGa across 5 executions,” the report stated. According to the research team, the amount of variability increases as they scale up as a result of decrease in granularity. “For the case of 256 cores at public cloud, standard deviation is equal to half the mean, implying that on average, values are spread out between 0.5x mean and 1.5x mean resulting in low predictability of performance across runs. In contrast, private cloud shows less variability.”

Overall, latency and bandwidth on cloud ended up coming in a couple of orders of magnitude below that of their Ranger and Taub machines, as shown in the logarithmic graphs below.

These bandwidth and latency issues make it difficult on those aforementioned communication intensive applications, where obviously contact among cores and nodes to complete a problem is key.

Again, the researchers note that a dedicated public cloud instance would solve a great deal of these problems. However, such an instance would likely cost more and therefore become less feasible for the mid-sized companies and startups that would utilize it. The multi-tenancy cloud setup renders many high performance applications untenable. “The performance of many HPC applications is very sensitive to the interconnect, as we showed in our experimental evaluation. In particular low latency requirements are typical for the HPC applications that incur substantial communication. This is in contrast with the commodity Ethernet network (1Gbps today moving to 10Gbps) typically deployed in cloud infrastructure,” the report noted.

With that said, it is still prudent for those smallmedium companies to enlist cloud-based HPC services, as the cost analysis shows below.

Even the communication intensive applications work well up to a certain amount of cores, an amount of cores unlikely to be exceeded by a medium institution. “The ability to take advantage of a large variety of different architectures (with different interconnects, processor types, memory sizes, etc.) can result in better utilization at global scale, compared to the limited choices available in any individual organization,” the report argued. Below is a sample of what such an architecture that relies on just four-core cloud-based machines would look like.

The report does go on to say that dedicated instances would be advantageous to large institutions looking for burst capacity, a concept that has been discussed here.

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

SC22 Unveils ACM Gordon Bell Prize Finalists

August 12, 2022

Courtesy of the schedule for the SC22 conference, we now have our first glimpse at the finalists for this year’s coveted Gordon Bell Prize. The Gordon Bell Prize, of course, comes with an award of $10,000 courtesy of H Read more…

Q&A with ORNL’s Bronson Messer, an HPCwire Person to Watch in 2022

August 12, 2022

HPCwire presents our interview with Bronson Messer, distinguished scientist and director of Science at the Oak Ridge Leadership Computing Facility (OLCF), ORNL, and an HPCwire 2022 Person to Watch. Messer recaps ORNL's journey to exascale and sheds light on how all the pieces line up to support the all-important science. Also covered are the role... Read more…

TACC Simulations Probe the First Days of Stars, Black Holes

August 12, 2022

The stunning images produced by the James Webb Space Telescope and recent supercomputer-enabled black hole imaging efforts have brought the early days of the universe quite literally into sharp focus. Researchers from th Read more…

Google Program to Free Chips Boosts University Semiconductor Design

August 11, 2022

A Google-led program to design and manufacture chips for free is becoming popular among researchers and computer enthusiasts. The search giant's open silicon program is providing the tools for anyone to design chips, which then get manufactured. Google foots the entire bill, from a chip's conception to delivery of the final product in a user's hand. Google's... Read more…

Argonne Deploys Polaris Supercomputer for Science in Advance of Aurora

August 9, 2022

Argonne National Laboratory has made its newest supercomputer, Polaris, available for scientific research. The system, which ranked 14th on the most recent Top500 list, is serving as a testbed for the exascale Aurora system slated for delivery in the coming months. The HPE-built Polaris system (pictured in the header) consists of 560 nodes... Read more…

AWS Solution Channel

Shutterstock 1519171757

Running large-scale CFD fire simulations on AWS for Amazon.com

This post was contributed by Matt Broadfoot, Senior Fire Strategy Manager at Amazon Design and Construction, and Antonio Cennamo ProServe Customer Practice Manager, Colin Bridger Principal HPC GTM Specialist, Grigorios Pikoulas ProServe Strategic Program Leader, Neil Ashton Principal, Computational Engineering Product Strategy, Roberto Medar, ProServe HPC Consultant, Taiwo Abioye ProServe Security Consultant, Talib Mahouari ProServe Engagement Manager at AWS. Read more…

Microsoft/NVIDIA Solution Channel

Shutterstock 1689646429

Gain a Competitive Edge using Cloud-Based, GPU-Accelerated AI KYC Recommender Systems

Financial services organizations face increased competition for customers from technologies such as FinTechs, mobile banking applications, and online payment systems. To meet this challenge, it is important for organizations to have a deep understanding of their customers. Read more…

US CHIPS and Science Act Signed Into Law

August 9, 2022

Just a few days after it was passed in the Senate, the U.S. CHIPS and Science Act has been signed into law by President Biden. In a ceremony today, Biden signed and lauded the ambitious piece of legislation, which over the course of the legislative process broadened to include hundreds of billions in additional science and technology spending. He was flanked by Speaker... Read more…

Q&A with ORNL’s Bronson Messer, an HPCwire Person to Watch in 2022

August 12, 2022

HPCwire presents our interview with Bronson Messer, distinguished scientist and director of Science at the Oak Ridge Leadership Computing Facility (OLCF), ORNL, and an HPCwire 2022 Person to Watch. Messer recaps ORNL's journey to exascale and sheds light on how all the pieces line up to support the all-important science. Also covered are the role... Read more…

Google Program to Free Chips Boosts University Semiconductor Design

August 11, 2022

A Google-led program to design and manufacture chips for free is becoming popular among researchers and computer enthusiasts. The search giant's open silicon program is providing the tools for anyone to design chips, which then get manufactured. Google foots the entire bill, from a chip's conception to delivery of the final product in a user's hand. Google's... Read more…

Argonne Deploys Polaris Supercomputer for Science in Advance of Aurora

August 9, 2022

Argonne National Laboratory has made its newest supercomputer, Polaris, available for scientific research. The system, which ranked 14th on the most recent Top500 list, is serving as a testbed for the exascale Aurora system slated for delivery in the coming months. The HPE-built Polaris system (pictured in the header) consists of 560 nodes... Read more…

US CHIPS and Science Act Signed Into Law

August 9, 2022

Just a few days after it was passed in the Senate, the U.S. CHIPS and Science Act has been signed into law by President Biden. In a ceremony today, Biden signed and lauded the ambitious piece of legislation, which over the course of the legislative process broadened to include hundreds of billions in additional science and technology spending. He was flanked by Speaker... Read more…

12 Midwestern Universities Team to Boost Semiconductor Supply Chain

August 8, 2022

The combined stressors of Covid-19 and the invasion of Ukraine have sent every major nation scrambling to reinforce its mission-critical supply chains – including and in particular the semiconductor supply chain. In the U.S. – which, like much of the world, relies on Asia for its semiconductors – those efforts have taken shape through the recently... Read more…

Quantum Pioneer D-Wave Rings NYSE Bell, Begins Life as Public Company

August 8, 2022

D-Wave Systems, one of the early quantum computing pioneers, has completed its SPAC deal to go public. Its merger with DPCM Capital was completed last Friday, and today, D-Wave management rang the bell on the New York Stock Exchange. It is now trading under two ticker symbols – QBTS and QBTS WS (warrant shares), respectively. Welcome to the public... Read more…

Supercomputer Models Explosives Critical for Nuclear Weapons

August 6, 2022

Lawrence Livermore National Laboratory (LLNL) is one of the laboratories that operates under the auspices of the National Nuclear Security Administration (NNSA), which manages the United States’ stockpile of nuclear weapons. Amid major efforts to modernize that stockpile, LLNL has announced that researchers from its own Energetic Materials Center... Read more…

SEA Changes: How EuroHPC Is Preparing for Exascale

August 5, 2022

Back in June, the EuroHPC Joint Undertaking – which serves as the EU’s concerted supercomputing play – announced its first exascale system: JUPITER, set to be installed by the Jülich Supercomputing Centre (FZJ) in 2023. But EuroHPC has been preparing for the exascale era for a much longer time: eight months... Read more…

Nvidia R&D Chief on How AI is Improving Chip Design

April 18, 2022

Getting a glimpse into Nvidia’s R&D has become a regular feature of the spring GTC conference with Bill Dally, chief scientist and senior vice president of research, providing an overview of Nvidia’s R&D organization and a few details on current priorities. This year, Dally focused mostly on AI tools that Nvidia is both developing and using in-house to improve... Read more…

Royalty-free stock illustration ID: 1919750255

Intel Says UCIe to Outpace PCIe in Speed Race

May 11, 2022

Intel has shared more details on a new interconnect that is the foundation of the company’s long-term plan for x86, Arm and RISC-V architectures to co-exist in a single chip package. The semiconductor company is taking a modular approach to chip design with the option for customers to cram computing blocks such as CPUs, GPUs and AI accelerators inside a single chip package. Read more…

The Final Frontier: US Has Its First Exascale Supercomputer

May 30, 2022

In April 2018, the U.S. Department of Energy announced plans to procure a trio of exascale supercomputers at a total cost of up to $1.8 billion dollars. Over the ensuing four years, many announcements were made, many deadlines were missed, and a pandemic threw the world into disarray. Now, at long last, HPE and Oak Ridge National Laboratory (ORNL) have announced that the first of those... Read more…

US Senate Passes CHIPS Act Temperature Check, but Challenges Linger

July 19, 2022

The U.S. Senate on Tuesday passed a major hurdle that will open up close to $52 billion in grants for the semiconductor industry to boost manufacturing, supply chain and research and development. U.S. senators voted 64-34 in favor of advancing the CHIPS Act, which sets the stage for the final consideration... Read more…

Top500: Exascale Is Officially Here with Debut of Frontier

May 30, 2022

The 59th installment of the Top500 list, issued today from ISC 2022 in Hamburg, Germany, officially marks a new era in supercomputing with the debut of the first-ever exascale system on the list. Frontier, deployed at the Department of Energy’s Oak Ridge National Laboratory, achieved 1.102 exaflops in its fastest High Performance Linpack run, which was completed... Read more…

Newly-Observed Higgs Mode Holds Promise in Quantum Computing

June 8, 2022

The first-ever appearance of a previously undetectable quantum excitation known as the axial Higgs mode – exciting in its own right – also holds promise for developing and manipulating higher temperature quantum materials... Read more…

AMD’s MI300 APUs to Power Exascale El Capitan Supercomputer

June 21, 2022

Additional details of the architecture of the exascale El Capitan supercomputer were disclosed today by Lawrence Livermore National Laboratory’s (LLNL) Terri Read more…

PsiQuantum’s Path to 1 Million Qubits

April 21, 2022

PsiQuantum, founded in 2016 by four researchers with roots at Bristol University, Stanford University, and York University, is one of a few quantum computing startups that’s kept a moderately low PR profile. (That’s if you disregard the roughly $700 million in funding it has attracted.) The main reason is PsiQuantum has eschewed the clamorous public chase for... Read more…

Leading Solution Providers

Contributors

ISC 2022 Booth Video Tours

AMD
AWS
DDN
Dell
Intel
Lenovo
Microsoft
PENGUIN SOLUTIONS

Exclusive Inside Look at First US Exascale Supercomputer

July 1, 2022

HPCwire takes you inside the Frontier datacenter at DOE's Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tenn., for an interview with Frontier Project Direc Read more…

AMD Opens Up Chip Design to the Outside for Custom Future

June 15, 2022

AMD is getting personal with chips as it sets sail to make products more to the liking of its customers. The chipmaker detailed a modular chip future in which customers can mix and match non-AMD processors in a custom chip package. "We are focused on making it easier to implement chips with more flexibility," said Mark Papermaster, chief technology officer at AMD during the analyst day meeting late last week. Read more…

Intel Reiterates Plans to Merge CPU, GPU High-performance Chip Roadmaps

May 31, 2022

Intel reiterated it is well on its way to merging its roadmap of high-performance CPUs and GPUs as it shifts over to newer manufacturing processes and packaging technologies in the coming years. The company is merging the CPU and GPU lineups into a chip (codenamed Falcon Shores) which Intel has dubbed an XPU. Falcon Shores... Read more…

Nvidia, Intel to Power Atos-Built MareNostrum 5 Supercomputer

June 16, 2022

The long-troubled, hotly anticipated MareNostrum 5 supercomputer finally has a vendor: Atos, which will be supplying a system that includes both Nvidia and Inte Read more…

India Launches Petascale ‘PARAM Ganga’ Supercomputer

March 8, 2022

Just a couple of weeks ago, the Indian government promised that it had five HPC systems in the final stages of installation and would launch nine new supercomputers this year. Now, it appears to be making good on that promise: the country’s National Supercomputing Mission (NSM) has announced the deployment of “PARAM Ganga” petascale supercomputer at Indian Institute of Technology (IIT)... Read more…

Is Time Running Out for Compromise on America COMPETES/USICA Act?

June 22, 2022

You may recall that efforts proposed in 2020 to remake the National Science Foundation (Endless Frontier Act) have since expanded and morphed into two gigantic bills, the America COMPETES Act in the U.S. House of Representatives and the U.S. Innovation and Competition Act in the U.S. Senate. So far, efforts to reconcile the two pieces of legislation have snagged and recent reports... Read more…

AMD Lines Up Alternate Chips as It Eyes a ‘Post-exaflops’ Future

June 10, 2022

Close to a decade ago, AMD was in turmoil. The company was playing second fiddle to Intel in PCs and datacenters, and its road to profitability hinged mostly on Read more…

Exascale Watch: Aurora Installation Underway, Now Open for Reservations

May 10, 2022

Installation has begun on the Aurora supercomputer, Rick Stevens (associate director of Argonne National Laboratory) revealed today during the Intel Vision event keynote taking place in Dallas, Texas, and online. Joining Intel exec Raja Koduri on stage, Stevens confirmed that the Aurora build is underway – a major development for a system that is projected to deliver more... Read more…

  • arrow
  • Click Here for More Headlines
  • arrow
HPCwire