Nvidia Expands Its Certified Server Models, Unveils DGX SuperPod Subscriptions

By Todd R. Weiss

June 2, 2021

Nvidia is busy this week at the virtual Computex 2021 Taipei technology show, announcing an expansion of its nascent Nvidia-certified server program, a range of new server models equipped with Nvidia BlueField DPUs, and the coming availability of its Base Command Platform, which will include a subscription option for its DGX SuperPods so customers can give them a try.

Under its expanded certified server program, which was initially unveiled in April at Nvidia’s own GTC21 conference, dozens of new servers are being certified to run the full suite of Nvidia AI Enterprise software, giving customers more options for demanding workloads in traditional datacenters or in hybrid cloud infrastructures.

Also announced were more new servers from partners using the company’s latest BlueField-2 data processing units, including from ASUS, Dell Technologies, GIGABYTE, QCT and Supermicro.

The Nvidia announcements also included the news that the Nvidia Base Command Platform, which was unveiled at GTC21 in April and is presently available only to early-access customers, will be offered jointly with NetApp as a premium monthly subscription combining Nvidia DGX SuperPod AI supercomputers with NetApp data management services.


Manuvir Das of Nvidia

The new products are part of the company’s ongoing democratization of AI, Manuvir Das, Nvidia’s head of enterprise computing, said during a May 27 briefing with reporters on the news.

“The work we are doing with the ecosystem is really to get it ready now to fully participate in this coming wave of the democratization of AI, where AI is utilized by every company on the planet rather than just the early adopters,” said Das. “That’s really the theme of what we’ve talked about at Computex.”

That democratization includes taking Nvidia’s software tools, libraries, frameworks and other pieces that the company has built and putting them all into what it is calling Nvidia AI Enterprise software, said Das.

Servers Certified to Run Nvidia AI Enterprise Software

That strategy is behind the company’s news that it is certifying its enterprise AI software suite on the latest wave of servers from partners including ASUS, Advantech, Altos, ASRock Rack, Dell Technologies, GIGABYTE, Hewlett Packard Enterprise, Lenovo, QCT and Supermicro. More than 50 servers are now certified. The certified server program is aimed at helping customers in industries such as healthcare, manufacturing, retail and financial services find the mainstream servers they require, according to the company.

The Nvidia-certified systems include certifications for running VMware vSphere, Nvidia Omniverse Enterprise for design collaboration and advanced simulation, and Red Hat OpenShift for AI development, as well as for Cloudera data engineering and machine learning.

The systems are available at a wide range of price and performance levels and can be equipped with a variety of Nvidia hardware, including A100, A40, A30 or A10 Tensor Core GPUs as well as BlueField-2 DPUs or ConnectX-6 adapters.
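Which of those accelerators a given certified system actually exposes can be checked at the software level through Nvidia’s NVML management library. The short Python sketch below is illustrative only and assumes an Nvidia driver plus the nvidia-ml-py (pynvml) bindings are installed; it simply lists the GPUs NVML can see, which on these systems would report names such as A100, A40, A30 or A10.

```python
# Illustrative sketch: enumerate Nvidia GPUs via NVML.
# Assumes the nvidia-ml-py package (pynvml) and an Nvidia driver are installed.
import pynvml

pynvml.nvmlInit()
try:
    count = pynvml.nvmlDeviceGetCount()
    print(f"NVML sees {count} Nvidia GPU(s)")
    for i in range(count):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):   # older pynvml versions return bytes
            name = name.decode()
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"  GPU {i}: {name}, {mem.total / 2**30:.0f} GiB memory")
finally:
    pynvml.nvmlShutdown()
```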

An earlier group of Nvidia-certified servers was unveiled in April at GTC21.

Nvidia further said it would facilitate expanded access to Arm CPUs in 2022 through partnerships with GIGABYTE and Wiwynn. These companies plan to offer new servers featuring Arm Neoverse-based CPUs as well as Nvidia Ampere architecture GPUs or BlueField DPUs (or both), according to Nvidia. These systems will be submitted for Nvidia certification when they come to market.

New BlueField-2 DPU-Equipped Servers

With this new round of BlueField-2 DPU-equipped servers, Nvidia is expanding the line to give customers more options to find just the right servers for their needs, according to the company. The servers are aimed at workloads including software-defined networking, software-defined storage and traditional enterprise applications, which can benefit from the DPU’s ability to accelerate, offload and isolate infrastructure workloads for networking, security and storage. The DPU-equipped servers can also benefit systems running VMware vSphere, Windows or hyperconverged infrastructure solutions for AI and machine learning applications, graphics-intensive workloads or traditional business applications.

Nvidia BlueField-2 DPU. Image courtesy: Nvidia

Nvidia’s BlueField DPUs – which can be thought of as next-generation SmartNICs – are designed to shift infrastructure tasks from the CPU to the DPU, which makes more server CPU cores available to run applications and increases server and datacenter efficiency, the company states.

The BlueField-2 DPU-accelerated servers are expected this year.

Nvidia Base Command and SuperPod Subscriptions

For customers, the idea behind Nvidia’s Base Command Platform and its related DGX SuperPod subscription option is that they can help companies move their AI projects more quickly from prototypes to production.

The Base Command software platform, which is designed for large-scale, multi-user and multi-team AI development workflows hosted on-premises or in the cloud, enables researchers and data scientists to simultaneously work on accelerated computing resources, according to Nvidia.

The cloud-hosted Base Command Platform will be offered in conjunction with NetApp, including an option to try out a DGX SuperPod on a subscription basis, said Das. Also included is NetApp all-flash storage. More information about these options will be released later this week, according to Nvidia.

Nvidia Base Command Platform management screens. Image courtesy: Nvidia

The Base Command Platform works with DGX systems and other Nvidia accelerated computing platforms, such as those offered by its cloud service provider partners. Many of the features of Base Command were unveiled by the company at GTC21. Base Command Manager is used to manage resources on an on-premises DGX SuperPod. Base Command Platform provides a wide range of controls to manage workflows from anywhere and makes it possible to offer the hosted subscription service with NetApp.
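Nvidia has not published the programmatic interface for Base Command job submission as part of this announcement, so the Python sketch below is purely hypothetical: the endpoint, field names and token are invented for illustration and are not Nvidia’s actual API. It only sketches the general shape of the workflow the platform is described as managing: point at a container and a command, ask for a number of GPUs, submit, then poll for completion.

```python
# Purely hypothetical sketch of a Base Command-style job submission workflow.
# The endpoint, fields and token below are invented for illustration; they are
# not Nvidia's actual API.
import time
import requests

API = "https://basecommand.example.com/api/v1"   # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <token>"}    # hypothetical auth token

job_spec = {
    "name": "resnet50-training",
    "container": "nvcr.io/nvidia/pytorch:21.05-py3",  # public NGC container
    "command": "python train.py --epochs 90",
    "gpus": 8,                                        # e.g. one DGX A100's worth
    "results": "/results",
}

resp = requests.post(f"{API}/jobs", json=job_spec, headers=HEADERS, timeout=30)
resp.raise_for_status()
job_id = resp.json()["id"]

# Poll until the job leaves the queue and finishes.
while True:
    status = requests.get(f"{API}/jobs/{job_id}", headers=HEADERS, timeout=30).json()["status"]
    print("job status:", status)
    if status in ("FINISHED", "FAILED"):
        break
    time.sleep(60)
```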

Das said this is the first time DGX SuperPod subscriptions are being offered, and that the move came in response to customer requests. “All of the gear is hosted by Nvidia in Equinix datacenters,” he said. “And customers can come into this environment and rent access to a SuperPod or to a smaller part of the SuperPod, and they can rent it for just months at a time.”

For customers, this new option can provide a simple, easy-to-use experience for AI, said Das.

“What we’re doing here is we’re really lowering the barrier to entry to experience this best-of-breed system and equipment, and democratizing in that way,” he said. The expectation is that once customers try out the SuperPods, they will buy their own and use them more widely, he added.

Also announced were plans for Google Cloud’s marketplace to add support for Base Command Platform later this year to give its customers access to the additional services.

“This hybrid AI offering will allow enterprises to write once and run anywhere with flexible access to multiple Nvidia A100 Tensor Core GPUs, speeding AI development for enterprises that leverage on-demand accelerated computing,” Manish Sainani, director of product management for machine learning infrastructure at Google Cloud, said in a statement.

Amazon Web Services (AWS) also has plans to integrate services with the Base Command Platform, giving Nvidia customers the ability to deploy their workloads from Base Command directly to Amazon SageMaker using GPU cloud instances.
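Neither company has detailed how that hand-off will surface, but a containerized training job landing on SageMaker GPU instances would look broadly like the sketch below, which uses the SageMaker Python SDK. The image URI, IAM role and S3 paths are placeholders; ml.p4d.24xlarge is AWS’s A100-backed instance type.

```python
# Rough sketch of launching a containerized training job on SageMaker GPU
# instances with the SageMaker Python SDK. The image URI, IAM role and S3
# paths are placeholders; ml.p4d.24xlarge is AWS's A100-backed instance type.
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()

estimator = Estimator(
    image_uri="<account>.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",
    role="arn:aws:iam::<account>:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.p4d.24xlarge",   # 8x A100 GPUs per instance
    output_path="s3://my-bucket/training-output/",
    sagemaker_session=session,
)

# Kick off the training job; inputs point at data already staged in S3.
estimator.fit({"training": "s3://my-bucket/training-data/"})
```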

So far, the Nvidia Base Command Platform with NetApp is available only to early-access customers. Monthly subscription pricing starts at $90,000.

Analysts On Nvidia’s Latest News

So, what do industry analysts think about Nvidia’s Computex announcements?

Karl Freund, analyst

“Nvidia is clearly climbing up the value chain, from chips to systems to software and eventually datacenters,” Karl Freund, founder and principal analyst of Cambrian AI Research, told EnterpriseAI. “The announcements will appeal to enterprises that are starting out on their AI journeys, with a pretty vast array of software to develop, manage, and collaborate on AI applications.”

And while starting out on a cloud instance of a DGX SuperPod at $90,000 a month may seem rich, it does provide an easy on-ramp for customers, with no hardware to buy and install and no additional software needed, Freund said.

“Taking out the hassles will help enterprises get started in AI,” said Freund. “When ready for production, these Base Command clients can buy DGX systems, systems from their server vendor, or deploy on public clouds, all with the same software.”

Another analyst, James Kobielus, the senior research director for data communications and management at research, training, and data analytics consultancy TDWI, said he is impressed by Nvidia’s focus on helping customers productionize the full range of its AI software.


James Kobielus, analyst

“Most noteworthy is the Base Command Platform, which offers cloud-based access for AI development teams to Nvidia’s most powerful DGX SuperPod AI supercomputer, along with NetApp’s data management suite,” said Kobielus. “Once this offering is available in Google Cloud marketplace later in the year, I expect that many enterprises will shortlist Nvidia Base Command Platform for their development of machine learning apps to be deployed into hybrid cloud environments and run various Nvidia-certified systems from Nvidia partners in support of high-performance enterprise apps.”

Bob Sorensen, an analyst with Hyperion Research, told EnterpriseAI that Nvidia’s DPU-equipped servers give HPC server suppliers opportunities to deliver intelligent, targeted compute capabilities right where customers need them.

Bob Sorensen, analyst

“The added benefit is that these devices can help offload data management responsibilities from the CPUs, freeing them up for more CPU-relevant tasks,” said Sorensen. “Indeed, one could argue that DPUs such as these could be the harbinger of a new form of HPC design based on composable computing, which seeks to break down and distribute discrete server functions across specific smart devices scattered throughout a traditional HPC architecture.”

Rob Enderle, analyst

Rob Enderle, principal analyst with Enderle Group, said that Nvidia appears to be setting up to make a significant push into enterprise servers, and that the importance of this technology is notable. “Their DPU technology is mind-bending,” said Enderle. “It frees up significant CPU resources, which can then be applied to other projects. That is particularly ideal for cloud solutions where you need a massive amount of flexibility.”

“This is just the beginning of what is expected to be the most significant effort to displace x86 server technology in over a decade,” said Enderle. “This initiative is only the start and coupled with their Arm HPC Developer Kit with Gigabyte, it anticipates an endgame where x86 becomes obsolete.”

This article originally appeared on sister site EnterpriseAI.news.
