Inspur Information Announces MLPerf Inference v2.0 Results

April 8, 2022

SAN JOSE, Calif., April 8, 2022 — MLCommons, a well-known open engineering consortium, released the results of MLPerf Inference v2.0, the leading AI benchmark suite. Inspur AI servers set records in all 16 tasks in the data center Closed division, showcasing the best performance in real-world AI application scenarios.

Inference Performance Improvements from MLPerf v1.1 to v2.0

MLPerf was established by Turing Award winner David Patterson together with top academic institutions. It is the leading AI performance benchmark in the world, organizing AI inference and AI training tests twice a year to track and evaluate rapidly evolving AI development. MLPerf has two divisions: Closed and Open. The Closed division enables an apples-to-apples comparison between vendors because it requires all submitters to use the same model and optimizer, making it an excellent reference benchmark.

The first MLPerf AI inference benchmark of 2022 examined the inference speed and capabilities of computing systems from different manufacturers across a range of AI tasks. The Closed division for the data center category is the most competitive division. A total of 926 results were submitted, double the submissions of the previous round.

Inspur AI servers set new records in inference performance

The MLPerf AI inference benchmark covers six widely used AI tasks: image classification (ResNet50), natural language processing (BERT), speech recognition (RNN-T), object detection (SSD-ResNet34), medical image segmentation (3D-UNet) and recommendation (DLRM). MLPerf benchmarks require an accuracy of more than 99% of the original model. For natural language processing, medical image segmentation and recommendation, two accuracy targets of 99% and 99.9% are set to examine the impact on computing performance when the quality target of AI inference is raised.
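To make the accuracy-target mechanism concrete, the sketch below derives the minimum accuracy a submission must reach from an FP32 reference model. The reference values shown are approximate baselines recalled for illustration (the authoritative numbers are defined in the MLPerf Inference rules):

```python
# Illustrative only: how MLPerf-style accuracy targets are derived from an
# FP32 reference model. Reference values here are approximate examples;
# the authoritative numbers live in the MLPerf Inference rules.

REFERENCE_ACCURACY = {
    "resnet50": 76.46,   # top-1 %, approximate FP32 baseline
    "bert": 90.874,      # SQuAD F1, approximate FP32 baseline
}

def accuracy_threshold(model: str, target: float = 0.99) -> float:
    """Minimum accuracy a submission must reach: target * reference."""
    return target * REFERENCE_ACCURACY[model]

for model in REFERENCE_ACCURACY:
    print(f"{model}: 99% target = {accuracy_threshold(model):.3f}, "
          f"99.9% target = {accuracy_threshold(model, 0.999):.3f}")
```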

In order to more closely match real-world usage, the MLPerf inference tests include two required scenarios for the data center category: Offline and Server. In the Offline scenario, all data required for the task is available locally and can be processed in large batches. In the Server scenario, data arrives online in bursts as queries are issued, and each query must be answered within a latency bound.
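The two load patterns reward different things: Offline measures raw throughput, while Server stresses tail latency under bursty arrivals. Below is a minimal, self-contained simulation of the distinction; it does not use the actual MLPerf LoadGen API, and `run_inference` is a hypothetical stand-in for the system under test:

```python
import random
import time

def run_inference(batch):
    """Hypothetical stand-in for the system under test."""
    time.sleep(0.001 * len(batch))  # pretend each sample costs 1 ms
    return [f"result-{s}" for s in batch]

def offline_scenario(samples, batch_size=64):
    """Offline: all data is local; maximize throughput with large batches."""
    start = time.perf_counter()
    for i in range(0, len(samples), batch_size):
        run_inference(samples[i:i + batch_size])
    elapsed = time.perf_counter() - start
    return len(samples) / elapsed  # samples per second

def server_scenario(samples, mean_gap_s=0.002):
    """Server: queries arrive one at a time with random gaps (bursty)."""
    latencies = []
    for s in samples:
        time.sleep(random.expovariate(1.0 / mean_gap_s))  # Poisson arrivals
        t0 = time.perf_counter()
        run_inference([s])
        latencies.append(time.perf_counter() - t0)
    return max(latencies)  # the benchmark bounds this tail latency

samples = list(range(256))
print(f"offline throughput ~ {offline_scenario(samples):.0f} samples/s")
print(f"server worst-case latency ~ {server_scenario(samples) * 1e3:.1f} ms")
```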

Inspur AI servers set new records across the six tasks in the data center Closed division:

  • ResNet50 (image classification): 449,856 images per second, equivalent to classifying the 1.28 million images in the ImageNet dataset in only 2.8 seconds.
  • 3D-UNet (medical image segmentation): 36.25 images per second, equivalent to segmenting the 207 3D medical images in the KiTS19 dataset within 6 seconds.
  • SSD-ResNet34 (object detection): 11,081.9 images per second for target object recognition and identification.
  • BERT (natural language processing): 38,776.7 questions answered per second on average.
  • RNN-T (speech recognition): 155,811 speech recognition conversions per second on average.
  • DLRM (recommendation): 2,645,980 click predictions per second on average.
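As a sanity check, the "equivalent to" figures above follow directly from dividing each dataset's size by the reported throughput; a quick illustrative calculation:

```python
# Quick arithmetic check: dataset size / reported throughput = quoted time.
records = {
    # task: (dataset size in samples, reported samples per second)
    "ResNet50 / ImageNet": (1_280_000, 449_856),
    "3D-UNet / KiTS19": (207, 36.25),
}

for task, (n_samples, throughput) in records.items():
    print(f"{task}: {n_samples / throughput:.1f} s")
# ResNet50 / ImageNet: 2.8 s
# 3D-UNet / KiTS19: 5.7 s
```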

In the Edge inference category, Inspur’s AI servers designed for edge scenarios also performed well. The NE5260M5, NF5488A5, and NF5688M6 won 11 titles out of 17 tasks in the Closed division.

With the continuous development of AI applications, faster inference processing brings higher AI application efficiency and capability, accelerating the transformation of industries toward intelligence. Compared with MLPerf Inference v1.1, Inspur AI servers improved their results in the image classification, speech recognition and natural language processing tasks by 31.5%, 28.5% and 21.3% respectively. These results mean that Inspur AI servers can complete various AI tasks more efficiently and rapidly in scenarios such as autonomous driving, voice conferencing, intelligent question answering, and smart medical care.

Full-stack optimization boosts continuous improvement in AI performance

The outstanding performance of Inspur AI servers in the MLPerf benchmarks is due to Inspur Information’s excellent system design capabilities and full-stack optimization capabilities in AI computing systems.

The Inspur AI server NF5468M6J supports 12x NVIDIA A100 Tensor Core GPUs with a layered, scalable computing architecture, and set 12 MLPerf records. Inspur Information also offers servers supporting 8x 500W NVIDIA A100 GPUs with either liquid or air cooling. Among the high-end mainstream models adopting 8x NVIDIA GPUs with NVLink in this benchmark, Inspur AI servers achieved the best results in 14 of 16 tasks in the data center category. Among them, the NF5488A5 supports 8x third-generation NVLink A100 GPUs and 2x AMD Milan CPUs in a 4U chassis. The NF5688M6 is an AI server with extreme scalability optimized for hyperscalers; it supports 8x NVIDIA A100 GPUs and 2x Intel Ice Lake CPUs, plus up to 13x PCIe Gen4 I/O expansion cards.

In the Edge inference category, the NE5260M5 comes with optimized signaling and power systems, and offers broad compatibility with high-performance CPUs and a wide range of AI accelerator cards. It features a shock-absorbing, noise-reducing design and has undergone rigorous reliability testing. With a chassis depth of 430 mm, nearly half that of traditional servers, it can be deployed even in space-constrained edge computing scenarios.

Inspur AI servers optimize the data path between the CPU and GPU through fine calibration and comprehensive tuning of the CPU and GPU hardware. At the software level, by enhancing round-robin scheduling across multiple GPUs based on the GPU topology, performance scales nearly linearly from a single GPU to multiple GPUs. For deep learning workloads, Inspur exploits the computing characteristics of the NVIDIA GPU Tensor Core units, optimizing model performance through an Inspur-developed channel compression algorithm.
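As a rough illustration of the scheduling idea (and not Inspur's actual implementation), the sketch below deals inference batches round-robin across per-GPU worker queues; `gpu_order` stands in for a topology-derived ordering, and `run_on_gpu` is a hypothetical kernel launch:

```python
# Illustrative sketch of topology-aware round-robin GPU scheduling;
# not Inspur's actual implementation.
from itertools import cycle
from queue import Queue
from threading import Thread

def run_on_gpu(gpu_id: int, batch) -> None:
    """Hypothetical stand-in for launching inference on a given GPU."""
    pass

def worker(gpu_id: int, jobs: Queue) -> None:
    """Drain this GPU's job queue until a None sentinel arrives."""
    while (batch := jobs.get()) is not None:
        run_on_gpu(gpu_id, batch)

def schedule(batches, gpu_order):
    """Deal batches to GPUs round-robin, in topology order."""
    queues = {g: Queue() for g in gpu_order}
    threads = [Thread(target=worker, args=(g, queues[g])) for g in gpu_order]
    for t in threads:
        t.start()
    for gpu_id, batch in zip(cycle(gpu_order), batches):
        queues[gpu_id].put(batch)
    for q in queues.values():
        q.put(None)  # sentinel: no more work for this GPU
    for t in threads:
        t.join()

# Example: 32 batches dealt across 4 GPUs ordered by topology.
schedule([list(range(8)) for _ in range(32)], gpu_order=[0, 1, 2, 3])
```

Because each GPU owns its queue and batches are dealt evenly, aggregate throughput grows close to linearly with GPU count as long as no single GPU becomes a straggler.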

To view the complete results of MLPerf Inference v2.0, please visit https://mlcommons.org/en/inference-datacenter-20 and https://mlcommons.org/en/inference-edge-20.

About Inspur Information

Inspur Information is a leading provider of data center infrastructure, cloud computing, and AI solutions. It is the world’s 2nd largest server manufacturer. Through engineering and innovation, Inspur Information delivers cutting-edge computing hardware design and extensive product offerings to address important technology sectors such as open computing, cloud data center, AI, and deep learning. Performance-optimized and purpose-built, our world-class solutions empower customers to tackle specific workloads and real-world challenges. To learn more, visit https://www.inspursystems.com.


Source: Inspur Information
