AWS and NVIDIA Announce Strategic Collaboration to Offer New Supercomputing Infrastructure, Software and Services for Generative AI

November 29, 2023

Nov. 29, 2023 — Amazon Web Services, Inc. (AWS) and NVIDIA have announced an expansion of their strategic collaboration to deliver the most-advanced infrastructure, software and services to power customers’ generative artificial intelligence (AI) innovations.

The companies will bring together the best of NVIDIA and AWS technologies—from NVIDIA’s newest multi-node systems featuring next-generation GPUs, CPUs and AI software, to AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability—that are ideal for training foundation models and building generative AI applications.

The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state-of-the-art in these technologies.

As part of the expanded collaboration to supercharge generative AI across all industries:

  • AWS will be the first cloud provider to bring NVIDIA GH200 Grace Hopper Superchips with new multi-node NVLink technology to the cloud. The NVIDIA GH200 NVL32 multi-node platform connects 32 Grace Hopper Superchips with NVIDIA NVLink and NVSwitch technologies into one instance. The platform will be available on Amazon Elastic Compute Cloud (Amazon EC2) instances connected with Amazon’s powerful networking (EFA), supported by advanced virtualization (AWS Nitro System), and hyper-scale clustering (Amazon EC2 UltraClusters), enabling joint customers to scale to thousands of GH200 Superchips.
  • NVIDIA and AWS will collaborate to host NVIDIA DGX Cloud—NVIDIA’s AI-training-as-a-service—on AWS. It will be the first DGX Cloud featuring GH200 NVL32, providing developers the largest shared memory in a single instance. DGX Cloud on AWS will accelerate training of cutting-edge generative AI and large language models that can reach beyond 1 trillion parameters.
  • NVIDIA and AWS are partnering on Project Ceiba to design the world’s fastest GPU-powered AI supercomputer—an at-scale system with GH200 NVL32 and Amazon EFA interconnect hosted by AWS for NVIDIA’s own research and development team. This first-of-its-kind supercomputer—featuring 16,384 NVIDIA GH200 Superchips and capable of processing 65 exaflops of AI—will be used by NVIDIA to propel its next wave of generative AI innovation.
  • AWS will introduce three additional new Amazon EC2 instances: P5e instances, powered by NVIDIA H200 Tensor Core GPUs, for large-scale and cutting-edge generative AI and HPC workloads, and G6 and G6e instances, powered by NVIDIA L4 GPUs and NVIDIA L40S GPUs, respectively, for a wide set of applications such as AI fine-tuning, inference, graphics and video workloads. G6e instances are particularly suitable for developing 3D workflows, digital twins and other applications using NVIDIA Omniverse, a platform for connecting and building generative AI-enabled 3D applications.

“AWS and NVIDIA have collaborated for more than 13 years, beginning with the world’s first GPU cloud instance. Today, we offer the widest range of NVIDIA GPU solutions for workloads including graphics, gaming, high performance computing, machine learning, and now, generative AI,” said Adam Selipsky, CEO at AWS. “We continue to innovate with NVIDIA to make AWS the best place to run GPUs, combining next-gen NVIDIA Grace Hopper Superchips with AWS’s EFA powerful networking, EC2 UltraClusters’ hyper-scale clustering, and Nitro’s advanced virtualization capabilities.”

“Generative AI is transforming cloud workloads and putting accelerated computing at the foundation of diverse content generation,” said Jensen Huang, founder and CEO of NVIDIA. “Driven by a common mission to deliver cost-effective state-of-the-art generative AI to every customer, NVIDIA and AWS are collaborating across the entire computing stack, spanning AI infrastructure, acceleration libraries, foundation models, to generative AI services.”

New Amazon EC2 Instances Combine State-of-the-Art from NVIDIA and AWS

AWS will be the first cloud provider to offer NVIDIA GH200 Grace Hopper Superchips with multi-node NVLink technology. Each GH200 Superchip combines an Arm-based Grace CPU with an NVIDIA Hopper architecture GPU on the same module. A single Amazon EC2 instance with GH200 NVL32 can provide up to 20 TB of shared memory to power terabyte-scale workloads.

These instances will take advantage of AWS’s third-generation Elastic Fabric Adapter (EFA) interconnect, providing up to 400 Gbps per Superchip of low-latency, high-bandwidth networking throughput, enabling customers to scale to thousands of GH200 Superchips in EC2 UltraClusters.

AWS instances with GH200 NVL32 will provide customers on-demand access to supercomputer-class performance, which is critical for large-scale AI/ML workloads that need to be distributed across multiple nodes for complex generative AI workloads—spanning FMs, recommender systems, and vector databases.

NVIDIA GH200-powered EC2 instances will feature 4.5 TB of HBM3e memory—a 7.2x increase compared to current generation H100-powered EC2 P5d instances—allowing customers to run larger models, while improving training performance. Additionally, CPU-to-GPU memory interconnect provides up to 7x higher bandwidth than PCIe, enabling chip-to-chip communications that extend the total memory available for applications.

AWS instances with GH200 NVL32 will be the first AI infrastructure on AWS to feature liquid cooling to help ensure densely-packed server racks can efficiently operate at maximum performance.

EC2 instances with GH200 NVL32 will also benefit from the AWS Nitro System, the underlying platform for next-generation EC2 instances. The Nitro System offloads I/O for functions from the host CPU/GPU to specialized hardware to deliver more consistent performance, while its enhanced security protects customer code and data during processing.

AWS First to Host NVIDIA DGX Cloud Powered by Grace Hopper

AWS will team up with NVIDIA to host NVIDIA DGX Cloud powered by GH200 NVL32 NVLink infrastructure. NVIDIA DGX Cloud is an AI supercomputing service that gives enterprises fast access to multi-node supercomputing for training the most complex LLM and generative AI models, with integrated NVIDIA AI Enterprise software, and direct access to NVIDIA AI experts.

Massive Project Ceiba Supercomputer to Supercharge NVIDIA’s AI Development

The Project Ceiba supercomputer that AWS and NVIDIA are collaborating on will be integrated with AWS services, such as Amazon Virtual Private Cloud (VPC) encrypted networking and Amazon Elastic Block Store high-performance block storage, giving NVIDIA access to a comprehensive set of AWS capabilities.

NVIDIA will use the supercomputer for research and development to advance AI for LLMs, graphics and simulation, digital biology, robotics, self-driving cars, Earth-2 climate prediction and more.

NVIDIA and AWS Supercharge Generative AI, HPC, Design and Simulation

To power the development, training and inference of the largest LLMs, AWS P5e instances will feature NVIDIA’s latest H200 GPUs that offer 141 GB of HBM3e GPU memory, which is 1.8x larger and 1.4x faster than H100 GPUs. This boost in GPU memory, along with up to 3,200 Gbps of EFA networking enabled by the AWS Nitro System, will enable customers to continue to build, train and deploy their cutting-edge models on AWS.

To deliver cost-effective, energy-efficient solutions for video, AI and graphics workloads, AWS announced new Amazon EC2 G6e instances featuring NVIDIA L40S GPUs and G6 instances powered by L4 GPUs. The new offerings can help startups, enterprises and researchers meet their AI and high-fidelity graphics needs.

G6e instances are built to handle complex workloads such as generative AI and digital twin applications. Using NVIDIA Omniverse, photorealistic 3D simulations can be developed, contextualized and enhanced using real-time data from services such as AWS IoT TwinMaker, intelligent chatbots, assistants, search and summarization. Amazon Robotics and Amazon Fulfillment Centers will be able to integrate digital twins built with NVIDIA Omniverse and AWS IoT TwinMaker to optimize warehouse design and flow, train more intelligent robot assistants and improve deliveries to customers.

L40S GPUs deliver up to 1.45 petaflops of FP8 performance and feature Ray Tracing cores that offer up to 209 teraflops of ray-tracing performance. L4 GPUs featured in G6 instances will deliver a lower-cost, energy-efficient solution for deploying AI models for natural language processing, language translation, AI video and image analysis, speech recognition, and personalization. L40S GPUs also accelerate graphics workloads, such as creating and rendering real-time, cinematic-quality graphics and game streaming. All three instances will be available in the coming year.

NVIDIA Software on AWS Boosts Generative AI Development

In addition, NVIDIA announced software on AWS to boost generative AI development. NVIDIA NeMo Retriever microservice offers new tools to create highly accurate chatbots and summarization tools using accelerated semantic retrieval. NVIDIA BioNeMo, available on Amazon SageMaker now and coming to AWS on NVIDIA DGX Cloud, enables pharmaceutical companies to speed drug discovery by simplifying and accelerating the training of models using their own data.

NVIDIA software on AWS is helping Amazon bring new innovations to its services and operations. AWS is using the NVIDIA NeMo framework to train select next-generation Amazon Titan LLMs. Amazon Robotics has begun leveraging NVIDIA Omniverse Isaac to build digital twins for automating, optimizing and planning its autonomous warehouses in virtual environments before deploying them into the real world.

About NVIDIA

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

About Amazon Web Services

Since 2006, Amazon Web Services (AWS) has been the world’s most comprehensive and broadly adopted cloud. AWS has been continually expanding its services to support virtually any workload, and it now has more than 240 fully featured services for compute, storage, databases, networking, analytics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, virtual and augmented reality (VR and AR), media, and application development, deployment, and management from 102 Availability Zones within 32 geographic regions, with announced plans for 15 more Availability Zones and five more AWS Regions in Canada, Germany, Malaysia, New Zealand, and Thailand. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—trust AWS to power their infrastructure, become more agile, and lower costs.


Source: AWS

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industry updates delivered to you every week!

A Big Memory Nvidia GH200 Next to Your Desk: Closer Than You Think

February 22, 2024

Students of the microprocessor may recall that the original 8086/8088 processors did not have floating point units. The motherboard often had an extra socket for an optional 8087 math coprocessor. The math coprocessor ma Read more…

IonQ Reports Advance on Path to Networked Quantum Computing

February 22, 2024

IonQ reported reaching a milestone in its efforts to use entangled photon-ion connectivity to scale its quantum computers. IonQ’s quantum computers are based on trapped ions which feature long coherence times and qubit Read more…

Apple Rolls out Post Quantum Security for iOS

February 21, 2024

Think implementing so-called Post Quantum Cryptography (PQC) isn't important because quantum computers able to decrypt current RSA codes don’t yet exist? Not Apple. Today the consumer electronics giant started rolling Read more…

GenAI Having Major Impact on Data Culture, Survey Says

February 21, 2024

While 2023 was the year of GenAI, the adoption rates for GenAI did not match expectations. Most organizations are continuing to invest in GenAI but are yet to derive any substantial value from it. However, the GenAI hyp Read more…

QED-C Issues New Quantum Benchmarking Paper

February 20, 2024

The Quantum Economic Development Consortium last week released a new paper on benchmarking – Quantum Algorithm Exploration using Application-Oriented Performance Benchmarks – that builds on earlier work and is an eff Read more…

AWS Solution Channel

Shutterstock 2283618597

Deep-dive into Ansys Fluent performance on Ansys Gateway powered by AWS

Today, we’re going to deep-dive into the performance and associated cost of running computational fluid dynamics (CFD) simulations on AWS using Ansys Fluent through the Ansys Gateway powered by AWS (or just “Ansys Gateway” for the rest of this post). Read more…

Atom Computing Reports Advance in Scaling Up Neutral Atom Qubit Arrays

February 15, 2024

The scale-up challenge facing quantum computing (QC) is daunting and varied. It’s commonly held that 1 million qubits (or more) will be needed to deliver practical fault tolerant QC. It’s also a varied challenge beca Read more…

A Big Memory Nvidia GH200 Next to Your Desk: Closer Than You Think

February 22, 2024

Students of the microprocessor may recall that the original 8086/8088 processors did not have floating point units. The motherboard often had an extra socket fo Read more…

Apple Rolls out Post Quantum Security for iOS

February 21, 2024

Think implementing so-called Post Quantum Cryptography (PQC) isn't important because quantum computers able to decrypt current RSA codes don’t yet exist? Not Read more…

QED-C Issues New Quantum Benchmarking Paper

February 20, 2024

The Quantum Economic Development Consortium last week released a new paper on benchmarking – Quantum Algorithm Exploration using Application-Oriented Performa Read more…

The Pulse of HPC: Tracking 4.5 Million Heartbeats of 3D Coronary Flow

February 15, 2024

Working in Duke University's Randles Lab, Cyrus Tanade, a National Science Foundation graduate student fellow and Ph.D. candidate in biomedical engineering, is Read more…

It Doesn’t Get Much SWEETER: The Winter HPC Computing Festival in Corpus Christi

February 14, 2024

(Main Photo by Visit Corpus Christi CrowdRiff) Texas A&M University's High-Performance Research Computing (HPRC) team hosted the "SWEETER Winter Comput Read more…

Q-Roundup: Diraq’s War Chest, DARPA’s Bet on Topological Qubits, Citi/Classiq Explore Optimization, WEF’s Quantum Blueprint

February 13, 2024

Yesterday, Australian start-up Diraq added $15 million to its war chest (now $120 million) to build a fault tolerant computer based on quantum dots. Last week D Read more…

2024 Winter Classic: Razor Thin Margins in HPL/HPCG

February 12, 2024

The first task for the 11 teams in the 2024 Winter Classic student cluster competition was to run and optimize the LINPACK and HPCG benchmarks. As usual, the Read more…

2024 Winter Classic: We’re Back!

February 9, 2024

The fourth edition of the Winter Classic Invitational Student Cluster Competition is up and running. This year, we have 11 teams of eager students representin Read more…

CORNELL I-WAY DEMONSTRATION PITS PARASITE AGAINST VICTIM

October 6, 1995

Ithaca, NY --Visitors to this year's Supercomputing '95 (SC'95) conference will witness a life-and-death struggle between parasite and victim, using virtual Read more…

SGI POWERS VIRTUAL OPERATING ROOM USED IN SURGEON TRAINING

October 6, 1995

Surgery simulations to date have largely been created through the development of dedicated applications requiring considerable programming and computer graphi Read more…

U.S. Will Relax Export Restrictions on Supercomputers

October 6, 1995

New York, NY -- U.S. President Bill Clinton has announced that he will definitely relax restrictions on exports of high-performance computers, giving a boost Read more…

Dutch HPC Center Will Have 20 GFlop, 76-Node SP2 Online by 1996

October 6, 1995

Amsterdam, the Netherlands -- SARA, (Stichting Academisch Rekencentrum Amsterdam), Academic Computing Services of Amsterdam recently announced that it has pur Read more…

Cray Delivers J916 Compact Supercomputer to Solvay Chemical

October 6, 1995

Eagan, Minn. -- Cray Research Inc. has delivered a Cray J916 low-cost compact supercomputer and Cray's UniChem client/server computational chemistry software Read more…

NEC Laboratory Reviews First Year of Cooperative Projects

October 6, 1995

Sankt Augustin, Germany -- NEC C&C (Computers and Communication) Research Laboratory at the GMD Technopark has wrapped up its first year of operation. Read more…

Sun and Sybase Say SQL Server 11 Benchmarks at 4544.60 tpmC

October 6, 1995

Mountain View, Calif. -- Sun Microsystems, Inc. and Sybase, Inc. recently announced the first benchmark results for SQL Server 11. The result represents a n Read more…

New Study Says Parallel Processing Market Will Reach $14B in 1999

October 6, 1995

Mountain View, Calif. -- A study by the Palo Alto Management Group (PAMG) indicates the market for parallel processing systems will increase at more than 4 Read more…

Leading Solution Providers

Contributors

CORNELL I-WAY DEMONSTRATION PITS PARASITE AGAINST VICTIM

October 6, 1995

Ithaca, NY --Visitors to this year's Supercomputing '95 (SC'95) conference will witness a life-and-death struggle between parasite and victim, using virtual Read more…

SGI POWERS VIRTUAL OPERATING ROOM USED IN SURGEON TRAINING

October 6, 1995

Surgery simulations to date have largely been created through the development of dedicated applications requiring considerable programming and computer graphi Read more…

U.S. Will Relax Export Restrictions on Supercomputers

October 6, 1995

New York, NY -- U.S. President Bill Clinton has announced that he will definitely relax restrictions on exports of high-performance computers, giving a boost Read more…

Dutch HPC Center Will Have 20 GFlop, 76-Node SP2 Online by 1996

October 6, 1995

Amsterdam, the Netherlands -- SARA, (Stichting Academisch Rekencentrum Amsterdam), Academic Computing Services of Amsterdam recently announced that it has pur Read more…

Cray Delivers J916 Compact Supercomputer to Solvay Chemical

October 6, 1995

Eagan, Minn. -- Cray Research Inc. has delivered a Cray J916 low-cost compact supercomputer and Cray's UniChem client/server computational chemistry software Read more…

NEC Laboratory Reviews First Year of Cooperative Projects

October 6, 1995

Sankt Augustin, Germany -- NEC C&C (Computers and Communication) Research Laboratory at the GMD Technopark has wrapped up its first year of operation. Read more…

Sun and Sybase Say SQL Server 11 Benchmarks at 4544.60 tpmC

October 6, 1995

Mountain View, Calif. -- Sun Microsystems, Inc. and Sybase, Inc. recently announced the first benchmark results for SQL Server 11. The result represents a n Read more…

New Study Says Parallel Processing Market Will Reach $14B in 1999

October 6, 1995

Mountain View, Calif. -- A study by the Palo Alto Management Group (PAMG) indicates the market for parallel processing systems will increase at more than 4 Read more…

  • arrow
  • Click Here for More Headlines
  • arrow
HPCwire