AI Needs Intelligent HPC Infrastructure

By Stephen Meserve, IBM Enterprise AI

December 2, 2019

Artificial intelligence (AI) has revolutionized entire industries and enables humanity to solve some of its most daunting challenges. To accomplish this, AI requires massive amounts of data from heterogeneous sources, processed in new ways that differ significantly from traditional high-performance computing (HPC) applications. When AI is run on traditional HPC infrastructure, this divergence in data types and processing methods creates new challenges. The most commonly named application domains for AI were medical imaging and autonomous driving, according to a 2018 Intersect360 Research survey.1

Unlike traditional computer programs, which are either deterministic or probabilistic in how they arrive at results, AI applications use an experiential approach. Drawing on a repository of past data, AI (or, more technically, machine learning) makes predictions through pattern recognition. We refer to this type of application as “artificial intelligence” because it mimics how we as humans typically learn.

A recent Intersect360 Research white paper, which explores how AI is moving enterprise and HPC infrastructure in new directions, notes, “Part of the need for AI comes from the vast increase in data that is stored and accessible. Much of this data falls outside of traditional values, such as numbers and texts, and instead in modern, multimedia forms, such as images, audio, and video streams. This rich media falls into a category that humans are able to process but has defied computers’ ability to search.”




The source data for 36% of AI applications is image-based, and overall, 73% of AI applications work with image, video, audio, or sensor data in some form.




How HPC infrastructure is evolving for AI applications

HPC technology serves as the infrastructure foundation for enterprise AI. However, processing, moving, and storing multimedia data are not the norm for HPC infrastructure, and they bring new challenges. Fortunately, some existing technologies have adapted well to the growing need to process multimedia data types for AI use cases, most notably graphics processing units (GPUs).

An NVIDIA GPU performs the vector and matrix computations that underlie neural network layers. GPUs do so in parallel, providing vastly improved training speeds with better energy efficiency. GPUs emerged as HPC accelerators in the mid-2000s and have seen a steady increase in adoption since that time. An NVIDIA GPU-accelerated application uses the GPU as a co-processor: computationally intensive portions of algorithms are broken out to run on the GPU, where they execute faster than on the host microprocessor, which carries overhead such as inter-node communication, job management, and running the operating environment. The availability of NVIDIA GPUs as accelerated computational elements has helped speed the adoption of AI in HPC environments.
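To make the idea of "vector and matrix computations that underlie neural network layers" concrete, here is a minimal sketch in plain Python. It is not GPU code and is not taken from any IBM or NVIDIA software; the function name `dense_layer`, the choice of a ReLU activation, and the sample weights are all assumptions made for illustration. The point it shows is that each output of a fully connected layer is an independent dot product, which is the kind of work a GPU can spread across many cores at once.

```python
def dense_layer(weights, bias, x):
    """Forward pass of one fully connected layer: relu(W * x + b).

    Hypothetical helper for illustration only. `weights` is a list of rows,
    `bias` has one entry per row, and `x` is the input vector.
    """
    outputs = []
    for row, b in zip(weights, bias):
        # Each output element is a dot product of one weight row with the
        # input. No row depends on any other row, so an accelerator could
        # compute all of them at the same time.
        z = sum(w * xi for w, xi in zip(row, x)) + b
        outputs.append(max(0.0, z))  # ReLU activation (an assumed choice)
    return outputs


# Illustrative values only: 3 outputs, 2 inputs.
W = [[1.0, -2.0],
     [0.5, 0.5],
     [-1.0, 1.0]]
b = [0.0, 1.0, -0.5]
x = [2.0, 1.0]

y = dense_layer(W, b, x)
print(y)  # [0.0, 2.5, 0.0]
```

On the host processor this loop runs one row at a time. In a GPU-accelerated application, a computation of this shape is the portion typically handed to the accelerator, while the host continues to handle job management and communication.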

[Learn more: Intersect360 Research white paper IBM and NVIDIA Solutions Power Insights with the New AI.]


For technology solution providers, the key is to integrate processors, NVIDIA GPUs, and appropriate software stacks into a unified platform designed specifically for AI. IBM has been an industry leader in the AI domain since Watson captivated audiences by beating human champions on the popular trivia game show Jeopardy! NVIDIA is also a major player in this space: 90% of accelerator-based HPC systems incorporate NVIDIA GPUs for computation.2 Recently, these marketplace powerhouses have been working together to develop HPC infrastructure solutions that can power AI well into the future.

The IBM approach to machine learning involves leaving accelerator and related software development to its partner, NVIDIA, and focusing instead on its own strengths: system architecture, data bandwidth, and system software solutions. For example, the IBM Power System AC922 model integrates dual POWER9 processors and up to four NVIDIA V100 Tensor Core GPUs. It specifically targets AI workloads.

[Also read: Building a Solid IA for Your AI in Precision Medicine]


What sets the IBM approach apart is the ability of these AI-focused systems to move data quickly in and out of memory and between processors and GPU accelerators, keeping the computational engines working at peak efficiency and productivity. This results in real-world advantages for AI workloads. For example, the IBM Power and NVIDIA solutions may reduce AI model training times by a factor of nearly four compared with competing x86 architecture-based systems.3 In addition to the time savings, these solutions allow memory to be combined across both CPU and GPU to support larger models in memory.

The synergies created by the IBM and NVIDIA partnership have already been demonstrated at the largest scales. Currently, the two most powerful AI-enhanced supercomputers on the planet, Summit at Oak Ridge National Laboratory and Sierra at Lawrence Livermore National Laboratory, are built from IBM Power processors and NVIDIA GPUs. A key to these installations is that they were assembled using only commercially available components. Building on this, the companies have announced a number of commercial offerings based on the IBM Power Systems and NVIDIA GPU combination, from “Summit lite” versions of the processing stack to converged systems featuring IBM Spectrum Scale-based storage solutions.

Human experience suggests that the more experience we acquire, the more confidently we can apply it to new situations. Here we have the revolution in AI. Through the Internet, cloud-scale computing, and worldwide digital connectedness, we are now able to train machine learning algorithms with vastly more data than ever before, but only by leveraging the advanced AI platforms and accelerated computing solutions provided by market leaders such as IBM and NVIDIA.




1 Intersect360 Research White Paper: IBM and NVIDIA Solutions Power Insights with the New AI, September 2019

2 Intersect360 Research HPC User Site Census survey data, 2019

3 IBM research: 1,000 iterations of Enlarged GoogleNet model on Enlarged Imagenet Dataset (2240×2240), test run by IBM. Power AC922, 40 cores (2 x 20c chips), POWER9 with NVLink 2.0, 4x NVIDIA Tesla V100 GPU, versus 2x Intel Xeon E5-2640 v4, 20 cores (2 x 10c chips), 40 threads, 2.4 GHz, 4x NVIDIA Tesla V100 GPU.
