New MLCommons Results Highlight Impressive Competitive AI Gains for Intel

June 27, 2023

June 27, 2023 — Today, MLCommons published results of its industry AI performance benchmark, MLPerf Training 3.0, in which both the Habana Gaudi2 deep learning accelerator and the 4th Gen Intel Xeon Scalable processor delivered impressive training results.

“The latest MLPerf results published by MLCommons validates the TCO value Intel Xeon processors and Intel Gaudi deep learning accelerators provide to customers in the area of AI,” commented Sandra Rivera, Intel’s executive vice president and general manager of the Data Center and AI Group. “Xeon’s built-in accelerators make it an ideal solution to run volume AI workloads on general-purpose processors, while Gaudi delivers competitive performance for large language models and generative AI. Intel’s scalable systems with optimized, easy-to-program open software lowers the barrier for customers and partners to deploy a broad array of AI-based solutions in the data center from the cloud to the intelligent edge.”

The current industry narrative is that generative AI and large language models (LLMs) can run only on Nvidia GPUs. New data shows that Intel’s portfolio of AI solutions provides competitive and compelling options for customers looking to break free from closed ecosystems that limit efficiency and scale.

The latest MLPerf Training 3.0 results underscore the performance of Intel’s products on an array of deep learning models. The maturity of Gaudi2-based software and systems for training was demonstrated at scale on the large language model, GPT-3. Gaudi2 is one of only two semiconductor solutions to submit performance results to the benchmark for LLM training of GPT-3.

Gaudi2 also provides substantially competitive cost advantages to customers, both in server and system costs. The accelerator’s MLPerf-validated performance on GPT-3, computer vision and natural language models, plus upcoming software advances make Gaudi2 an extremely compelling price/performance alternative to Nvidia’s H100.

On the CPU front, the deep learning training performance of 4th Gen Xeon processors with Intel AI engines demonstrated that customers can build with Xeon-based servers a single universal AI system for data pre-processing, model training and deployment to deliver the right combination of AI performance, efficiency, accuracy and scalability.

About the Habana Gaudi2 Results: Training generative AI and large language models requires clusters of servers to meet massive compute requirements at scale. These MLPerf results provide tangible validation of Habana Gaudi2’s outstanding performance and efficient scalability on the most demanding model tested, the 175 billion parameter GPT-3.

Results highlights:

  • Gaudi2 delivered impressive time-to-train on GPT-31: 311 minutes on 384 accelerators.
  • Near-linear 95% scaling from 256 to 384 accelerators on GPT-3 model.
  • Excellent training results on computer vision — ResNet-50 8 accelerators and Unet3D 8 accelerators — and natural language processing models — BERT 8 and 64 accelerators.
  • Performance increases of 10% and 4%, respectively, for BERT and ResNet models as compared to the November submission, evidence of growing Gaudi2 software maturity.
  • Gaudi2 results were submitted “out of the box,” meaning customers can achieve comparable performance results when implementing Gaudi2 on premise or in the cloud.

About Gaudi2 Software Maturity: Software support for the Gaudi platform continues to mature and keep pace with the growing number of generative AI and LLMs in popular demand.

  • Gaudi2’s GPT-3 submission was based on PyTorch and employed the popular DeepSpeed optimization library (part of Microsoft AI at scale), rather than custom software. DeepSpeed enables support of 3D parallelism (Data, Tensor, Pipeline) concurrently, further optimizing scaling performance efficiency on LLMs.
  • Gaudi2 results on the 3.0 benchmark were submitted in the BF16 data type. A significant leap in Gaudi2 performance is expected when software support for FP8 and new features are released in 2023’s third quarter.

About the 4th Gen Xeon Processors Results: As the lone CPU submission among numerous alternative solutions, MLPerf results prove that Intel Xeon processors provide enterprises with out-of-the-box capabilities to deploy AI on general-purpose systems and avoid the cost and complexity of introducing dedicated AI systems.

For a small number of customers who intermittently train large models from scratch, they can use general-purpose CPUs, and often on the Intel-based servers they are already deploying to run their businesses. However, most will use pre-trained models and fine-tune them with their own smaller curated data sets. Intel previously released results demonstrating that this fine-tuning can be accomplished in only minutes using Intel AI software and standard industry open source software.

MLPerf Results Highlights:

  • In the closed division, 4th Gen Xeons could train BERT and ResNet-50 models in less than 50 mins. (47.93 mins.) and less than 90 mins. (88.17 mins.), respectively.
  • With BERT in the open division, the results show that Xeon was able to train the model in about 30 minutes (31.06 mins.) when scaling out to 16 nodes.
  • For the larger RetinaNet model, Xeon was able to achieve a time of 232 mins. on 16 nodes, allowing customers the flexibility of using off-peak Xeon cycles to train their models over the course of a morning, over lunch or overnight.
  • 4th Gen Xeon with Intel Advanced Matrix Extensions (Intel AMX) delivers significant out-of-box performance improvements that span multiple frameworks, end-to-end data science tools and a broad ecosystem of smart solutions.

MLPerf, generally regarded as the most reputable benchmark for AI performance, enables fair and repeatable performance comparison across solutions. Additionally, Intel has surpassed the 100-submission milestone and remains the only vendor to submit public CPU results with industry-standard deep-learning ecosystem software.

These results also highlight the excellent scaling efficiency possible using cost-effective and readily available Intel Ethernet 800 Series network adapters that utilize the open source Intel Ethernet Fabric Suite Software that’s based on Intel oneAPI.

About Intel

Intel (Nasdaq: INTC) is an industry leader, creating world-changing technology that enables global progress and enriches lives. Inspired by Moore’s Law, we continuously work to advance the design and manufacturing of semiconductors to help address our customers’ greatest challenges. By embedding intelligence in the cloud, network, edge and every kind of computing device, we unleash the potential of data to transform business and society for the better.


Source: Intel

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industry updates delivered to you every week!

HPC User Forum: Sustainability at TACC Points to Software

October 3, 2023

Recently, Dan Stanzione, Executive Director, TACC and Associate Vice President for Research, UT-Austin, gave a presentation on HPC sustainability at the Fall 2023 HPC Users Forum. The complete set of slides is available Read more…

Google’s Controversial AI Chip Paper Under Scrutiny Again 

October 3, 2023

A controversial research paper by Google that claimed the superiority of AI techniques in creating chips is under the microscope for the authenticity of its claims. Science publication Nature is investigating Google's c Read more…

Rust Busting: IBM and Boeing Battle Corrosion with Simulations on Quantum Computer

October 3, 2023

The steady research into developing real-world applications for quantum computing is piling up interesting use cases. Today, IBM reported on work with Boeing to simulate corrosion processes to improve composites used in Read more…

Nvidia Delivering New Options for MLPerf and HPC Performance

September 28, 2023

As HPCwire reported recently, the latest MLperf benchmarks are out. Not unsurprisingly, Nvidia was the leader across many categories. The HGX H100 GPU systems, which contain eight H100 GPUs, delivered the highest throughput on every MLPerf inference test in this round. Read more…

Hakeem Oluseyi Explores His Unlikely Journey from the Street to the Stars in SC23 Keynote

September 28, 2023

Defying the odds In the heart of one of the toughest neighborhoods in the country, young Hakeem Oluseyi’s world was a confined space, but his imagination soared to the stars. While other kids roamed the streets, he Read more…

AWS Solution Channel

Shutterstock 2338659951

VorTech Derisks Innovative Technology to Aid Global Water Sustainability Challenges Using Cloud-Native Simulations on AWS

Overview

No more than 1 percent of the world’s water is readily available fresh water, according to the US Geological Survey. Read more…

QCT Solution Channel

QCT and Intel Codeveloped QCT DevCloud Program to Jumpstart HPC and AI Development

Organizations and developers face a variety of issues in developing and testing HPC and AI applications. Challenges they face can range from simply having access to a wide variety of hardware, frameworks, and toolkits to time spent on installation, development, testing, and troubleshooting which can lead to increases in cost. Read more…

Nvidia Takes Another Shot at Trying to Get AI to Mobile Devices

September 28, 2023

Nvidia takes another shot at trying to get to mobile devices Long before the current situation of Nvidia's GPUs holding AI hostage, the company tried to put its chips in mobile devices but failed. The Tegra mobile chi Read more…

Shutterstock 1927423355

Google’s Controversial AI Chip Paper Under Scrutiny Again 

October 3, 2023

A controversial research paper by Google that claimed the superiority of AI techniques in creating chips is under the microscope for the authenticity of its cla Read more…

Rust Busting: IBM and Boeing Battle Corrosion with Simulations on Quantum Computer

October 3, 2023

The steady research into developing real-world applications for quantum computing is piling up interesting use cases. Today, IBM reported on work with Boeing to Read more…

Nvidia Delivering New Options for MLPerf and HPC Performance

September 28, 2023

As HPCwire reported recently, the latest MLperf benchmarks are out. Not unsurprisingly, Nvidia was the leader across many categories. The HGX H100 GPU systems, which contain eight H100 GPUs, delivered the highest throughput on every MLPerf inference test in this round. Read more…

IonQ Announces 2 New Quantum Systems; Suggests Quantum Advantage is Nearing

September 27, 2023

It’s been a busy week for IonQ, the quantum computing start-up focused on developing trapped-ion-based systems. At the Quantum World Congress today, the compa Read more…

Rethinking ‘Open’ for AI

September 27, 2023

What does “open” mean in the context of AI? Must we accept hidden layers? Do copyrights and patents still hold sway? And do consumers have the right to opt Read more…

Aurora Image

Leveraging Machine Learning in Dark Matter Research for the Aurora Exascale System 

September 25, 2023

Scientists have unlocked many secrets about particle interactions at atomic and subatomic levels. However, one mystery that has eluded researchers is dark matte Read more…

Watsonx Brings AI Visibility to Banking Systems

September 21, 2023

A new set of AI-based code conversion tools is available with IBM watsonx. Before introducing the new "watsonx," let's talk about the previous generation Watson Read more…

Intel’s Gelsinger Lays Out Vision and Map at Innovation 2023 Conference

September 20, 2023

Intel’s sprawling, optimistic vision for the future was on full display yesterday in CEO Pat Gelsinger’s opening keynote at the Intel Innovation 2023 confer Read more…

CORNELL I-WAY DEMONSTRATION PITS PARASITE AGAINST VICTIM

October 6, 1995

Ithaca, NY --Visitors to this year's Supercomputing '95 (SC'95) conference will witness a life-and-death struggle between parasite and victim, using virtual Read more…

SGI POWERS VIRTUAL OPERATING ROOM USED IN SURGEON TRAINING

October 6, 1995

Surgery simulations to date have largely been created through the development of dedicated applications requiring considerable programming and computer graphi Read more…

U.S. Will Relax Export Restrictions on Supercomputers

October 6, 1995

New York, NY -- U.S. President Bill Clinton has announced that he will definitely relax restrictions on exports of high-performance computers, giving a boost Read more…

Dutch HPC Center Will Have 20 GFlop, 76-Node SP2 Online by 1996

October 6, 1995

Amsterdam, the Netherlands -- SARA, (Stichting Academisch Rekencentrum Amsterdam), Academic Computing Services of Amsterdam recently announced that it has pur Read more…

Cray Delivers J916 Compact Supercomputer to Solvay Chemical

October 6, 1995

Eagan, Minn. -- Cray Research Inc. has delivered a Cray J916 low-cost compact supercomputer and Cray's UniChem client/server computational chemistry software Read more…

NEC Laboratory Reviews First Year of Cooperative Projects

October 6, 1995

Sankt Augustin, Germany -- NEC C&C (Computers and Communication) Research Laboratory at the GMD Technopark has wrapped up its first year of operation. Read more…

Sun and Sybase Say SQL Server 11 Benchmarks at 4544.60 tpmC

October 6, 1995

Mountain View, Calif. -- Sun Microsystems, Inc. and Sybase, Inc. recently announced the first benchmark results for SQL Server 11. The result represents a n Read more…

New Study Says Parallel Processing Market Will Reach $14B in 1999

October 6, 1995

Mountain View, Calif. -- A study by the Palo Alto Management Group (PAMG) indicates the market for parallel processing systems will increase at more than 4 Read more…

Leading Solution Providers

Contributors

CORNELL I-WAY DEMONSTRATION PITS PARASITE AGAINST VICTIM

October 6, 1995

Ithaca, NY --Visitors to this year's Supercomputing '95 (SC'95) conference will witness a life-and-death struggle between parasite and victim, using virtual Read more…

SGI POWERS VIRTUAL OPERATING ROOM USED IN SURGEON TRAINING

October 6, 1995

Surgery simulations to date have largely been created through the development of dedicated applications requiring considerable programming and computer graphi Read more…

U.S. Will Relax Export Restrictions on Supercomputers

October 6, 1995

New York, NY -- U.S. President Bill Clinton has announced that he will definitely relax restrictions on exports of high-performance computers, giving a boost Read more…

Dutch HPC Center Will Have 20 GFlop, 76-Node SP2 Online by 1996

October 6, 1995

Amsterdam, the Netherlands -- SARA, (Stichting Academisch Rekencentrum Amsterdam), Academic Computing Services of Amsterdam recently announced that it has pur Read more…

Cray Delivers J916 Compact Supercomputer to Solvay Chemical

October 6, 1995

Eagan, Minn. -- Cray Research Inc. has delivered a Cray J916 low-cost compact supercomputer and Cray's UniChem client/server computational chemistry software Read more…

NEC Laboratory Reviews First Year of Cooperative Projects

October 6, 1995

Sankt Augustin, Germany -- NEC C&C (Computers and Communication) Research Laboratory at the GMD Technopark has wrapped up its first year of operation. Read more…

Sun and Sybase Say SQL Server 11 Benchmarks at 4544.60 tpmC

October 6, 1995

Mountain View, Calif. -- Sun Microsystems, Inc. and Sybase, Inc. recently announced the first benchmark results for SQL Server 11. The result represents a n Read more…

New Study Says Parallel Processing Market Will Reach $14B in 1999

October 6, 1995

Mountain View, Calif. -- A study by the Palo Alto Management Group (PAMG) indicates the market for parallel processing systems will increase at more than 4 Read more…

ISC 2023 Booth Videos

Cornelis Networks @ ISC23
Dell Technologies @ ISC23
Intel @ ISC23
Lenovo @ ISC23
Microsoft @ ISC23
ISC23 Playlist
  • arrow
  • Click Here for More Headlines
  • arrow
HPCwire