HPC and AI – Two Communities Same Future

By Rob Farber

January 25, 2018

In this contributed feature, Rob Farber lays out Intel Fellow Al Gara’s vision for the unification of the “3 Pillars” of HPC currently underway.

According to Al Gara (Intel Fellow, Data Center Group), high performance computing and artificial intelligence will increasingly intertwine as we transition to an exascale future using new computing, storage, and communications technologies as well as neuromorphic and quantum computing chips. Gara observes that, “The convergence of AI, data analytics and traditional simulation will result in systems with broader capabilities and configurability as well as cross pollination.”

Gara sees very aggressive hardware targets being set for this intertwined HPC and AI future, where the hardware will deliver usable performance exceeding one exaflops of double precision performance (and much more for lower and reduced precision arithmetic). He believes a user focus on computation per memory capacity will pay big dividends across architectures and provide systems software and user applications the opportunity to stay on the exponential performance growth curve through exascale and beyond as shown in the performance table below.

Figure 1: Architectural targets for future systems that will support both HPC and AI. Note: PiB is petabytes of memory capacity

Unification of the “3 Pillars”

The vision Gara presented is based on a unification of the “3 Pillars” of HPC: Artificial Intelligence (AI) and Machine Learning (ML); Data Analytics and Big Data; plus High Performance Computing (HPC). What this means is that users of the future will program using models that leverage each other and that interact through memory.

Figure 2: Unifying the “3 Pillars” (Source Intel)

More concretely, Intel is working towards exascale systems that are highly configurable that can support upgrades to fundamentally new technologies including scalable processors, accelerators, neural network processors, neuromorphic chips, FPGAs, Intel persistent memory, 3D NAND, and custom hardware.

Figure 3: Working towards a highly configurable future (Source Intel)

The common denominator in Gara’s vision is that the same architecture will cover HPC, AI, and Data Analytics through configuration, which means there needs to be a consistent software story across these different hardware backends to address HPC plus AI workloads.

A current, very real instantiation of Gara’s vision is happening now through the use of Intel  nGraphT library in popular machine learning packages such as TensorFlow. Essentially, Intel nGraph library is being used as an intermediate language (in a manner analogous to LLVM) that can deliver optimized performance across a variety of hardware platforms from CPUs to FPGAs, dedicated neural network processors, and more.

Jason Knight (CTO office, Intel Artificial Intelligence Products Group) writes, “We see the Intel nGraph library  as the beginning of an ecosystem of optimization passes, hardware backends and frontend connectors to popular deep learning frameworks.”

Figure 4: XLA support for TensorFlow

Overall, Gara noted that “HPC is truly the birthplace of many architectures … and the testing ground” as HPC programmers, researchers, and domain scientists explore the architectural space map the performance landscape:

  • Data level parallel (from fine grain to coarse grain)
  • Energy efficient accelerators (compute density and energy efficiency often are correlated)
  • Exploiting predictable execution at all levels (cache to coarse grain)
  • Integrated fixed function data flow accelerators
  • General purpose data flow accelerators

Technology Opportunities

HPC and AI scientists will have access and the ability to exploit the performance capabilities of a number of new network, storage, and computing architectures.

In particular, HPC is a big driver of optical technology as fabrics represent one of the most challenging and costly elements of a supercomputer. For this reason, Gara believes that silicon photonics is game changing as the ability to integrate silicon and optical devices will deliver significant economic and performance advantages including room to grow (in a technology sense) as we transition to linear and ring devices and optical devices that communicate using multiple wavelengths of light.

New non-volatile storage technologies such as Intel persistent memory are blurring the line between memory and storage. Gara describes a new storage stack for exascale supercomputers, but of course this stack can be implemented on general compute clusters as well.

The key, Gara observes, is that this stack is designed from the ground up to use NVM storage. The result will be high throughput IO operations at arbitrary alignment and transaction sizes because applications can perform ultra-fine grained IO through a new userspace NVMe/pmem software stack. At a systems level, this means that users will be able to manage massively distributed NVM storage using scalable communications and IO operations across homogenous, shared-nothing servers in a software managed redundant, self-healing environment. In other words, high-performance, big-capacity scalable storage to support big-data and in-core algorithms such as log-runtime algorithms and data analytics on sparse and unstructured data sets.

Researchers are exploiting the advances in memory performance and capacity to change the way that we approach AI and HPC problems. Examples of such work range from the University of Utah to King Abdullah University of Science and Technology (KAUST) in Saudi Arabia.

For example, Dr. Aaron Knoll (research scientist, Scientific Computing and Imaging Institute at the University of Utah) stresses the importance the logarithmic runtime algorithms in the Ospray visualization package. Logarithmic runtime algorithms are important for big visualizations and exascale computing. Basically the runtime increases slowly as data sizes increase. The logarithmic growth is important as the runtime increases slowly even when the data size increases by orders of magnitude. Otherwise, the runtime growth can prevent computations from finishing in a reasonable time, thus obviating the benefits of a large memory capacity computer.

As a result, large memory capacity (e.g., “fat”) compute nodes that provide low latency access to data are the enabling technology that can compete and beat massively parallel accelerators at their own game. Research at the University of Utah [PDF] shows a single large memory (three terabyte) workstation can deliver competitive and even superior interactive rendering performance compared to a 128-node GPU cluster. The University of Utah group is also exploring in-situ visualization using P-k-d trees and other fast, in-core approaches [PDF] to show that large “direct” in-core techniques are viable alternatives to traditional HPC visualization approaches.

In a second example, KAUST has been enhancing the ecosystem of numerical tools for multi-core and many-core processors in collaboration with Intel and the Tokyo Institute of Technology. Think of processing really big billion by billion sized matrices using CPU technology in a mathematically and computationally efficient manner.

The importance of these contributions in linear algebra and Fast Multi-pole Methods (FMM) can be appreciated by non-HPC scientists as numerical linear algebra is at the root of nearly all applications in engineering, physics, data science, and machine learning. The FMM method has been listed as one of the top ten algorithms of the 20th century.

Results show that HPC scientists now have the ability to solve faster and larger dense linear algebra problems and FMM related numerical problems than is possible using current highly optimized libraries such as the Intel Math Kernel Library (Intel MKL) running on the same hardware. These methods have been made available in highly optimized libraries bearing the names of ExaFMM, and HiCMA.

Looking to the future: Neuromorphic and Quantum Computing

The new neuromorphic test chips codenamed Loihi may represent a phase change in AI because they “self-learn”. Currently, data scientists spend a significant amount of time working with data to create training sets that are used to train a neural network to solve a complex problem. Neuromorphic chips eliminate the need for a human to create a training set (e.g., no human in the loop). Instead, humans need to validate the accuracy once the neuromorphic hardware has found a solution.

Succinctly, neuromorphic computing utilizes an entirely different computational model than traditional neural networks used in machine and deep learning. This model more accurately mimics how biological brains operate so neuromorphic chips can “learn” in an event driven fashion simply by observing their environment.  Further, they operate in a remarkably energy efficient manner. Time will tell if and when this provides an advantage. The good news is that neuromorphic hardware is now becoming available.

Gara states that the goal is to create a programmable architecture that delivers >100x energy efficiency over current architectures to solve hard AI problems efficiently. He provided examples such as sparse coding, dictionary learning, constraint satisfaction, pattern matching, and dynamic learning and adaptation.

Finally, Gara described advances in quantum computing that are being made possible through a collaboration with Delft University to make better Qubits (a Quantum Bit), improve connectivity between Qubits, and develop scalable IO. Quantum computing is non-intuitive because most people don’t intuitively grasp the idea of entanglement or something being in multiple states at the same time. Still the web contains excellent resources such as Quantum computing 101 at the University of Waterloo to help people make sense of this technology that is rapidly improving and, if realized, will change our computing universe forever.

Quantum computing holds the possibility of solving currently intractable problems using general purpose computers. Gara highlighted applications of the current Intel quantum computing efforts in quantum chemistry, microarchitecture and algorithm co-design, and post-quantum secure cryptography.


We are now seeing the introduction of new computing, storage, and manufacturing technologies that are forcing the AI and HPC communities to rethink their traditional approaches so they can use these ever more performant, scalable, and configurable architectures.  Al Gara pointed out, technologies are causing a unification of the “3 pillars” which, in turn, makes the future of AI and HPC in the data center indistinguishable from each other.

Rob Farber is a global technology consultant and author with an extensive background in HPC and in developing machine learning technology that he applies at national labs and commercial organizations. Rob can be reached at [email protected]

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

GTC 2019: Chief Scientist Bill Dally Provides Glimpse into Nvidia Research Engine

March 22, 2019

Amid the frenzy of GTC this week – Nvidia’s annual conference showcasing all things GPU (and now AI) – William Dally, chief scientist and SVP of research, provided a brief but insightful portrait of Nvidia’s rese Read more…

By John Russell

ORNL Helps Identify Challenges of Extremely Heterogeneous Architectures

March 21, 2019

Exponential growth in classical computing over the last two decades has produced hardware and software that support lightning-fast processing speeds, but advancements are topping out as computing architectures reach thei Read more…

By Laurie Varma

Interview with 2019 Person to Watch Jim Keller

March 21, 2019

On the heels of Intel's reaffirmation that it will deliver the first U.S. exascale computer in 2021, which will feature the company's new Intel Xe architecture, we bring you our interview with our 2019 Person to Watch Jim Keller, head of the Silicon Engineering Group at Intel. Read more…

By HPCwire Editorial Team

HPE Extreme Performance Solutions

HPE and Intel® Omni-Path Architecture: How to Power a Cloud

Learn how HPE and Intel® Omni-Path Architecture provide critical infrastructure for leading Nordic HPC provider’s HPCFLOW cloud service.

powercloud_blog.jpgFor decades, HPE has been at the forefront of high-performance computing, and we’ve powered some of the fastest and most robust supercomputers in the world. Read more…

IBM Accelerated Insights

Insurance: Where’s the Risk?

Insurers are facing extreme competitive challenges in their core businesses. Property and Casualty (P&C) and Life and Health (L&H) firms alike are highly impacted by the ongoing globalization, increasing regulation, and digital transformation of their client bases. Read more…

What’s New in HPC Research: TensorFlow, Buddy Compression, Intel Optane & More

March 20, 2019

In this bimonthly feature, HPCwire highlights newly published research in the high-performance computing community and related domains. From parallel programming to exascale to quantum computing, the details are here. Read more…

By Oliver Peckham

GTC 2019: Chief Scientist Bill Dally Provides Glimpse into Nvidia Research Engine

March 22, 2019

Amid the frenzy of GTC this week – Nvidia’s annual conference showcasing all things GPU (and now AI) – William Dally, chief scientist and SVP of research, Read more…

By John Russell

At GTC: Nvidia Expands Scope of Its AI and Datacenter Ecosystem

March 19, 2019

In the high-stakes race to provide the AI life-cycle solution of choice, three of the biggest horses in the field are IBM, Intel and Nvidia. While the latter is only a fraction of the size of its two bigger rivals, and has been in business for only a fraction of the time, Nvidia continues to impress with an expanding array of new GPU-based hardware, software, robotics, partnerships and... Read more…

By Doug Black

Nvidia Debuts Clara AI Toolkit with Pre-Trained Models for Radiology Use

March 19, 2019

AI’s push into healthcare got a boost yesterday with Nvidia’s release of the Clara Deploy AI toolkit which includes 13 pre-trained models for use in radiolo Read more…

By John Russell

It’s Official: Aurora on Track to Be First US Exascale Computer in 2021

March 18, 2019

The U.S. Department of Energy along with Intel and Cray confirmed today that an Intel/Cray supercomputer, "Aurora," capable of sustained performance of one exaf Read more…

By Tiffany Trader

Why Nvidia Bought Mellanox: ‘Future Datacenters Will Be…Like High Performance Computers’

March 14, 2019

“Future datacenters of all kinds will be built like high performance computers,” said Nvidia CEO Jensen Huang during a phone briefing on Monday after Nvidia revealed scooping up the high performance networking company Mellanox for $6.9 billion. Read more…

By Tiffany Trader

Oil and Gas Supercloud Clears Out Remaining Knights Landing Inventory: All 38,000 Wafers

March 13, 2019

The McCloud HPC service being built by Australia’s DownUnder GeoSolutions (DUG) outside Houston is set to become the largest oil and gas cloud in the world th Read more…

By Tiffany Trader

Quick Take: Trump’s 2020 Budget Spares DoE-funded HPC but Slams NSF and NIH

March 12, 2019

U.S. President Donald Trump’s 2020 budget request, released yesterday, proposes deep cuts in many science programs but seems to spare HPC funding by the Depar Read more…

By John Russell

Nvidia Wins Mellanox Stakes for $6.9 Billion

March 11, 2019

The long-rumored acquisition of Mellanox came to fruition this morning with GPU chipmaker Nvidia’s announcement that it has purchased the high-performance net Read more…

By Doug Black

Quantum Computing Will Never Work

November 27, 2018

Amid the gush of money and enthusiastic predictions being thrown at quantum computing comes a proposed cold shower in the form of an essay by physicist Mikhail Read more…

By John Russell

The Case Against ‘The Case Against Quantum Computing’

January 9, 2019

It’s not easy to be a physicist. Richard Feynman (basically the Jimi Hendrix of physicists) once said: “The first principle is that you must not fool yourse Read more…

By Ben Criger

ClusterVision in Bankruptcy, Fate Uncertain

February 13, 2019

ClusterVision, European HPC specialists that have built and installed over 20 Top500-ranked systems in their nearly 17-year history, appear to be in the midst o Read more…

By Tiffany Trader

Why Nvidia Bought Mellanox: ‘Future Datacenters Will Be…Like High Performance Computers’

March 14, 2019

“Future datacenters of all kinds will be built like high performance computers,” said Nvidia CEO Jensen Huang during a phone briefing on Monday after Nvidia revealed scooping up the high performance networking company Mellanox for $6.9 billion. Read more…

By Tiffany Trader

Intel Reportedly in $6B Bid for Mellanox

January 30, 2019

The latest rumors and reports around an acquisition of Mellanox focus on Intel, which has reportedly offered a $6 billion bid for the high performance interconn Read more…

By Doug Black

Looking for Light Reading? NSF-backed ‘Comic Books’ Tackle Quantum Computing

January 28, 2019

Still baffled by quantum computing? How about turning to comic books (graphic novels for the well-read among you) for some clarity and a little humor on QC. The Read more…

By John Russell

Contract Signed for New Finnish Supercomputer

December 13, 2018

After the official contract signing yesterday, configuration details were made public for the new BullSequana system that the Finnish IT Center for Science (CSC Read more…

By Tiffany Trader

It’s Official: Aurora on Track to Be First US Exascale Computer in 2021

March 18, 2019

The U.S. Department of Energy along with Intel and Cray confirmed today that an Intel/Cray supercomputer, "Aurora," capable of sustained performance of one exaf Read more…

By Tiffany Trader

Leading Solution Providers

SC 18 Virtual Booth Video Tour

Advania @ SC18 AMD @ SC18
ASRock Rack @ SC18
DDN Storage @ SC18
HPE @ SC18
IBM @ SC18
Lenovo @ SC18 Mellanox Technologies @ SC18
One Stop Systems @ SC18
Oracle @ SC18 Panasas @ SC18
Supermicro @ SC18 SUSE @ SC18 TYAN @ SC18
Verne Global @ SC18

Deep500: ETH Researchers Introduce New Deep Learning Benchmark for HPC

February 5, 2019

ETH researchers have developed a new deep learning benchmarking environment – Deep500 – they say is “the first distributed and reproducible benchmarking s Read more…

By John Russell

IBM Quantum Update: Q System One Launch, New Collaborators, and QC Center Plans

January 10, 2019

IBM made three significant quantum computing announcements at CES this week. One was introduction of IBM Q System One; it’s really the integration of IBM’s Read more…

By John Russell

IBM Bets $2B Seeking 1000X AI Hardware Performance Boost

February 7, 2019

For now, AI systems are mostly machine learning-based and “narrow” – powerful as they are by today's standards, they're limited to performing a few, narro Read more…

By Doug Black

The Deep500 – Researchers Tackle an HPC Benchmark for Deep Learning

January 7, 2019

How do you know if an HPC system, particularly a larger-scale system, is well-suited for deep learning workloads? Today, that’s not an easy question to answer Read more…

By John Russell

HPC Reflections and (Mostly Hopeful) Predictions

December 19, 2018

So much ‘spaghetti’ gets tossed on walls by the technology community (vendors and researchers) to see what sticks that it is often difficult to peer through Read more…

By John Russell

Arm Unveils Neoverse N1 Platform with up to 128-Cores

February 20, 2019

Following on its Neoverse roadmap announcement last October, Arm today revealed its next-gen Neoverse microarchitecture with compute and throughput-optimized si Read more…

By Tiffany Trader

Move Over Lustre & Spectrum Scale – Here Comes BeeGFS?

November 26, 2018

Is BeeGFS – the parallel file system with European roots – on a path to compete with Lustre and Spectrum Scale worldwide in HPC environments? Frank Herold Read more…

By John Russell

France to Deploy AI-Focused Supercomputer: Jean Zay

January 22, 2019

HPE announced today that it won the contract to build a supercomputer that will drive France’s AI and HPC efforts. The computer will be part of GENCI, the Fre Read more…

By Tiffany Trader

  • arrow
  • Click Here for More Headlines
  • arrow
Do NOT follow this link or you will be banned from the site!
Share This