AWS Reveals Gaudi-based EC2 Instances Coming in 2021

By Todd R. Weiss

December 2, 2020

Amazon Web Services has a broad swath of new and bolstered services coming for customers in 2021, from the implementation of powerful Habana Gaudi AI hardware in Amazon EC2 instances for machine learning workloads to custom-designed AWS Trainium ML training chips built to cut cloud training costs.

The new products were announced on Tuesday (Dec. 1) by AWS CEO Andy Jassy, who opened the company’s ninth annual re:Invent conference with a keynote from Seattle that was delivered virtually for the first time because of the COVID-19 pandemic.

Also slated for arrival in 2021 are new Graviton2-equipped AWS instances, new AWS GP3 general purpose storage volumes, and on-premises ECS Anywhere (Elastic Container Service) and EKS Anywhere (Elastic Kubernetes Service) offerings that allow customers to run these Amazon services inside their own datacenters for the first time. Other upcoming products and service updates include Aurora Serverless v2 and new container support for Lambda.

[Figure: Gaudi processor high-level architecture]

The new Habana Gaudi-based Amazon EC2 instances will be offered in the first half of 2021, said Jassy, through a partnership between AWS and Intel, which acquired Habana Labs for $2 billion in 2019. The Gaudi accelerators promise 40 percent better price-performance than the best-performing GPU instances available today, according to AWS.

“It will work with all the main machine learning frameworks, PyTorch as well as TensorFlow,” said Jassy, adding that the hardware will help the company keep pushing the price-performance envelope in machine learning training. The Gaudi accelerators are designed for training deep learning models for workloads that include natural language processing, object detection, classification, recommendation and personalization.

Up to eight Habana Gaudi accelerators will power each EC2 ML instance, and a fully equipped instance can process about 12,000 images per second when training the ResNet-50 model on TensorFlow, according to Intel. Gaudi-based EC2 instances are designed to deliver increased performance and greater cost efficiency for customers, while allowing developers to build new training models or port existing ones from graphics processing units to Gaudi accelerators.
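For a rough sense of scale, the quoted throughput can be turned into a back-of-the-envelope estimate. The sketch below assumes that throughput divides evenly across the eight accelerators and that one epoch is the ImageNet-1k training set of 1,281,167 images. Neither assumption comes from AWS or Intel, and the function names are purely illustrative.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the article.
# Assumptions (not from AWS or Intel): throughput splits evenly across
# accelerators, and one epoch is the ImageNet-1k training set.

INSTANCE_IMAGES_PER_SEC = 12_000   # quoted (approximate) ResNet-50 throughput
ACCELERATORS_PER_INSTANCE = 8      # quoted maximum per EC2 instance
IMAGENET_TRAIN_IMAGES = 1_281_167  # ImageNet-1k training set size (assumption)


def per_accelerator_throughput(total_images_per_sec: float, accelerators: int) -> float:
    """Images per second per accelerator, assuming an even split."""
    if accelerators <= 0:
        raise ValueError("accelerators must be positive")
    return total_images_per_sec / accelerators


def epoch_seconds(dataset_size: int, images_per_sec: float) -> float:
    """Seconds needed to pass over the dataset once at a steady throughput."""
    if images_per_sec <= 0:
        raise ValueError("images_per_sec must be positive")
    return dataset_size / images_per_sec


if __name__ == "__main__":
    per_chip = per_accelerator_throughput(INSTANCE_IMAGES_PER_SEC, ACCELERATORS_PER_INSTANCE)
    seconds = epoch_seconds(IMAGENET_TRAIN_IMAGES, INSTANCE_IMAGES_PER_SEC)
    print(f"{per_chip:.0f} images/sec per accelerator")
    print(f"{seconds:.0f} seconds per epoch")
```

Under these assumptions each accelerator handles roughly 1,500 images per second. Real-world scaling is rarely perfectly linear, so this is an upper-bound style estimate, not a benchmark.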

Each Gaudi chip provides 32GB of HBM2 memory and implements 10 ports of standard 100 Gigabit Ethernet. Native RDMA over Converged Ethernet connects the chips within the server, and multiple Gaudi servers can be clustered using AWS Elastic Fabric Adapter (EFA) technology to enable scalable distributed training.
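The per-chip figures above imply some aggregate numbers for a fully populated instance. The sketch below is simple arithmetic, assuming eight accelerators per instance and raw line rate on every port. How the ports are divided between in-server links and external traffic is not stated here, so the bandwidth figure is a theoretical ceiling. The constant and function names are illustrative.

```python
# Aggregate memory and raw Ethernet bandwidth implied by the per-chip figures.
# Assumption: a fully populated eight-accelerator instance, with every port
# running at its nominal line rate (a theoretical ceiling, not a measurement).

HBM2_GB_PER_CHIP = 32      # quoted HBM2 capacity per Gaudi chip
PORTS_PER_CHIP = 10        # quoted number of Ethernet ports per chip
GBITS_PER_PORT = 100       # 100 Gigabit Ethernet
CHIPS_PER_INSTANCE = 8     # quoted maximum accelerators per instance


def instance_hbm_gb(chips: int = CHIPS_PER_INSTANCE) -> int:
    """Total HBM2 capacity across all accelerators, in gigabytes."""
    return chips * HBM2_GB_PER_CHIP


def chip_bandwidth_gbps() -> int:
    """Raw Ethernet bandwidth per chip, in gigabits per second."""
    return PORTS_PER_CHIP * GBITS_PER_PORT


def gbits_to_gbytes(gbits_per_sec: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbits_per_sec / 8


if __name__ == "__main__":
    print(f"HBM2 per instance: {instance_hbm_gb()} GB")
    print(f"Ethernet per chip: {chip_bandwidth_gbps()} Gb/s")
    print(f"Ethernet per chip: {gbits_to_gbytes(chip_bandwidth_gbps())} GB/s")
```

That works out to 256 GB of HBM2 per eight-chip instance and 1 terabit per second of raw Ethernet capacity per chip.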

Current-generation Gaudi chips are fabricated on TSMC’s 16nm process, and Habana plans for the follow-on Gaudi2 to use TSMC’s 7nm process. Intel’s Habana Labs also makes an inference-focused chip, called Goya.

AWS Trainium Chips

The company’s all-new AWS Trainium chips are machine learning chips that are custom-designed by AWS to deliver the most cost-effective training in the cloud, according to Jassy.

According to AWS, Trainium provides the highest performance with the most teraflops (TFLOPS) of compute power for ML in the cloud, while also enabling a broader set of ML applications. Trainium chips are optimized for deep learning training workloads for applications including image classification, semantic search, translation, voice recognition, natural language processing and recommendation engines.

“Trainium will be even more cost-effective than the Habana chip” and will support all the major frameworks, including TensorFlow, PyTorch and [Apache] MXNet, he said. “You’re going to use the same [AWS] Neuron SDK that our Inferentia customers use. So, if you use Inferentia for inference it will be easy to also get going on our machine learning chip Trainium. It’ll be available both as an EC2 instance as well as in [the AWS] SageMaker [ML service] in the second half of 2021.”

Karl Freund, senior analyst at Moor Insights and Strategy, called Trainium “a fitting bookend to Inferentia,” AWS’s inference chip that was revealed in 2018 and deployed last year.

“Supporting Trainium, Gaudi and Nvidia GPUs is a smart move,” Freund wrote for Forbes, “and it is consistent with AWS’s strategy of offering customers a variety of technologies to meet their specific needs.”

New Graviton2-Powered Instances

AWS will also debut new C6gn instances in the next couple of weeks. Designed for compute-heavy and network-heavy workloads, they are powered by Amazon’s Arm-based Graviton2 chips. The new instances will offer 100 gigabit-per-second networking, which promises to save customers money while increasing speeds, said Jassy.

Also coming soon are new AWS GP3 (General Purpose) volumes for AWS Elastic Block Store (EBS). GP3 volumes succeed the previous generation of GP2 volumes, which were introduced in 2014.

“The feedback that we’ve gotten the last year or two from customers is that we love GP2, but if we had a wish list, there’s a couple things that we’d like from you,” including lower costs per gigabyte and the ability to scale throughput or IOPS without also having to scale storage, said Jassy.

The AWS team worked on those requests, resulting in new GP3 volumes that have 20 percent lower costs per gigabyte with the ability to provision IOPS and throughput separately from storage, he said.

“The baseline performance if you do GP3 volumes is 3,000 IOPS and 125 megabytes per second, but you can burst that and scale that up to a peak of 1,000 megabytes per second, which is four times that of GP2,” said Jassy. “And you’ll see that customers will be able to run many more of their demanding workloads on GP3 that they even were running on GP2.”
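The GP3 figures quoted above can be captured in a small model that shows what provisioning performance separately from storage means in practice. This is a minimal sketch built only on the numbers in the article (3,000 baseline IOPS, 125 to 1,000 megabytes per second, 20 percent lower cost per gigabyte). The `Gp3Request` class and helper functions are hypothetical, not an AWS API, and the GP2 price passed in is a placeholder.

```python
from dataclasses import dataclass

# Figures quoted in the article.
GP3_BASELINE_IOPS = 3_000
GP3_BASELINE_MBPS = 125
GP3_PEAK_MBPS = 1_000
GP3_PRICE_RATIO = 0.80  # 20 percent lower cost per gigabyte than GP2


@dataclass(frozen=True)
class Gp3Request:
    """Hypothetical volume request: size, IOPS and throughput are independent."""
    size_gb: int
    iops: int = GP3_BASELINE_IOPS
    throughput_mbps: int = GP3_BASELINE_MBPS

    def __post_init__(self) -> None:
        # Validate only against the bounds stated in the article.
        if self.size_gb <= 0:
            raise ValueError("size_gb must be positive")
        if self.iops < GP3_BASELINE_IOPS:
            raise ValueError("iops cannot be below the 3,000 baseline")
        if not GP3_BASELINE_MBPS <= self.throughput_mbps <= GP3_PEAK_MBPS:
            raise ValueError("throughput must be between 125 and 1,000 MB/s")


def implied_gp2_peak_mbps() -> float:
    """GP2 peak throughput implied by 'four times that of GP2'."""
    return GP3_PEAK_MBPS / 4


def gp3_storage_cost(gp2_price_per_gb: float, size_gb: int) -> float:
    """Storage cost at a 20 percent discount to a given GP2 per-gigabyte price."""
    return gp2_price_per_gb * GP3_PRICE_RATIO * size_gb


if __name__ == "__main__":
    # Same capacity, different performance: storage size does not change.
    baseline = Gp3Request(size_gb=500)
    fast = Gp3Request(size_gb=500, throughput_mbps=1_000)
    print(baseline, fast, sep="\n")
    print(f"Implied GP2 peak: {implied_gp2_peak_mbps()} MB/s")
```

The two requests in the usage example have identical capacity but different throughput, which is the decoupling customers asked for. With GP2, by Jassy's account, higher performance meant scaling storage as well.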

ECS Anywhere and EKS Anywhere

AWS has offered its managed Elastic Container Service (ECS) and Elastic Kubernetes Service (EKS) to customers for several years. ECS has more than 100,000 active customers, and EKS accounts for billions of compute hours every week on AWS, according to Jassy. Amazon ECS is a fully managed container orchestration service, while Amazon EKS gives users a managed environment where they can start, run, and scale Kubernetes applications in the AWS cloud. Some customers, however, prefer to run these workloads on-premises, which they couldn’t do with the existing ECS and EKS services, Jassy added.

With many customers repeating those requests, AWS will offer new on-premises Amazon ECS Anywhere and EKS Anywhere services, giving customers the options they wanted, said Jassy.

“ECS Anywhere allows you to have all the same AWS style APIs and cluster configuration management pieces on-premises that you have in the cloud, so it makes it easy,” he said. “It works with all your on-premises infrastructure.”

Some EKS customers then asked for the same capabilities, leading AWS to create EKS Anywhere, which lets EKS customers run the service in their own datacenters, according to Jassy.

Some EKS customers were so excited about the coming services in 2021 that AWS is now making the EKS Kubernetes distribution open source so that customers can start using it now, he added. “It will be exactly the same as what we do with EKS. We’ll make all the same patches and updates so you can actually be starting to transition as you get ready for EKS Anywhere.”

Gartner analyst Arun Chandrasekaran said that while the vast majority of clients will continue to use the services through hybrid cloud deployments, the new services offer flexibility.

“The ECS Anywhere and EKS Anywhere products provide customers with a hybrid cloud option of running application containers in a consistent manner across on-premises and AWS public cloud,” he said. “While ECS offers more operational simplicity across a hybrid environment, the EKS offering extends Kubernetes into customer datacenters.”

HPCwire