Nvidia Sees Bright Future for AI Supercomputing

By Tiffany Trader

November 23, 2016

Graphics chipmaker Nvidia made a strong showing at SC16 in Salt Lake City last week. Its most prominent wins were claiming the number one spot on the Green500 list with its new in-house DGX-1 supercomputer, SaturnV, and partnering with the National Cancer Institute, the U.S. Department of Energy (DOE) and several national laboratories to accelerate cancer research as part of the Cancer Moonshot initiative.

The company kicked off its SC activities with a press briefing on Monday (Nov. 14), during which CEO Jen-Hsun Huang characterized 2016 as a tipping point for the GPU computing approach popularized by Nvidia for over a decade.

Not surprisingly, Huang’s main message was that the GPU computing era has arrived. Throughout the hour-long talk, Huang would revisit the theme of deep learning as both a supercomputing problem and a supercomputing opportunity.

“We believe that supercomputers ought to be designed as AI supercomputers – meaning it has to be good at both computational science as well as data science – that building a machine that’s only good at data science doesn’t make sense and building a supercomputer that’s only good at computational science doesn’t make sense,” he said.

“On the one hand, deep learning requires an enormous amount of data throughput processing – this way of developing software where the computers write software themselves inspired by a lot of data processing behind it is a very important approach to computing but it also has the wonderful opportunity to benefit supercomputing as well, solving problems for science that hasn’t been possible before today,” said Huang.

Huang’s view is that traditional numerical HPC is not going anywhere, but will exist side by side with machine learning methods.

“I’m a big fan of using math when you can; we should use AI when you can’t,” he said. “For example what’s the equation of a cat? It’s probably very similar to the equation for a dog – two ears, four legs, a tail. And so there are a lot of areas where equations don’t work and that’s where I see AI – search problems, recommendation problems, likelihood problems, where there’s either too much data, incomplete data, or no laws of physics that support it. So where do I feel like eating tonight – there’s no laws of physics for that. There’s a lot of these type of problems that we simply can’t solve – I think that they’re going to coexist.”

While Nvidia is enabling parallel computing via thousands of CUDA cores combined with the CUDA programming framework, the CEO emphasized the necessity of a performant central processing unit. “Almost everything we do we start with a strong CPU,” said Huang. “We still believe in Amdahl’s law; we believe that code has a lot of single threaded parts to it and this is an area that we want to continue to be good at.”
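For readers who want the arithmetic behind Huang’s reference, Amdahl’s law caps the overall speedup of a program by the fraction of it that must run serially. The sketch below is illustrative only: the function name and the 95 percent parallel fraction are assumptions chosen for the example, not figures from the briefing.

```python
def amdahl_speedup(parallel_fraction: float, n_units: float) -> float:
    """Overall speedup when `parallel_fraction` of the runtime is spread
    across `n_units` processing units and the rest stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # Hypothetical workload: 95 percent of the runtime parallelizes.
    for units in (8, 64, 1024, 1_000_000):
        print(f"{units:>9} units -> {amdahl_speedup(0.95, units):.2f}x")
```

With a 5 percent serial portion, the speedup can never exceed 20X no matter how many parallel units are added, which is why the speed of the single-threaded parts still matters on a GPU-heavy system.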

The two servers currently shipping with the NVLink P100 GPU – Nvidia’s DGX-1 server and IBM’s Minsky platform – speak to this goal. The DGX-1 connects eight NVLink’d Pascal P100s to two 20-core Intel Xeon E5-2698 v4 chips. The IBM Minsky server pairs two Power8 CPUs with four P100 GPUs, with NVLink extending up to the CPUs.

Nvidia’s 124-node supercomputer, SaturnV, plays a crucial role in Nvidia’s plans to usher in AI supercomputing. The machine debuted on the 48th TOP500 list at number 28 with 3.3 petaflops Linpack (4.9 petaflops peak). Even more impressively, it nabbed the number one spot on the Green500 list, achieving more than 9.46 gigaflops/watt. That’s a 42 percent improvement over the 6.67 gigaflops/watt delivered by the most efficient machine on the previous list. Extrapolating that efficiency to exascale gives a power draw of 105.7 MW. Against a semi-“relaxed” exascale power allowance of 30 MW (the original DARPA target was 20 MW), that is about 3.5 times the planned power consumption of US exascale systems, a delta of less than 4X. Three years ago, the extrapolated delta was over 7X.
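The extrapolation is simple division: an exaflop is one billion gigaflops, so dividing by an efficiency in gigaflops/watt yields the watts needed to sustain it. The sketch below reproduces that arithmetic under the assumption that efficiency stays constant as a system scales, which real machines will not match exactly; the function names are made up for this example.

```python
EXAFLOP_IN_GIGAFLOPS = 1e9  # 1 exaflop/s = 10**9 gigaflop/s


def exascale_power_mw(gflops_per_watt: float) -> float:
    """Megawatts needed to sustain 1 exaflop/s at a fixed efficiency."""
    if gflops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    watts = EXAFLOP_IN_GIGAFLOPS / gflops_per_watt
    return watts / 1e6


def required_efficiency(budget_mw: float) -> float:
    """Gigaflops/watt needed to reach 1 exaflop/s within a power budget."""
    if budget_mw <= 0:
        raise ValueError("budget must be positive")
    return EXAFLOP_IN_GIGAFLOPS / (budget_mw * 1e6)


def delta(power_mw: float, budget_mw: float) -> float:
    """How many times over budget an extrapolated power draw is."""
    return power_mw / budget_mw


if __name__ == "__main__":
    # Previous list leader at 6.67 gigaflops/watt.
    print(f"6.67 GF/W -> {exascale_power_mw(6.67):.1f} MW at exascale")
    # The article's 105.7 MW extrapolation against both power targets.
    for budget in (30, 20):
        print(f"105.7 MW vs {budget} MW budget: {delta(105.7, budget):.1f}X, "
              f"needs {required_efficiency(budget):.1f} GF/W")
```

Read the other way around, a 30 MW exascale system needs roughly 33 gigaflops/watt and a 20 MW system needs 50, which shows how far even the greenest machines on the list still have to go.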

SaturnV – its name inspired by the original Moonshot – will be a critical part of the CANDLE (CANcer Distributed Learning Environment) project. Announced last month, CANDLE’s mission is to exploit high performance computing (HPC), machine learning and data analytics technologies to advance precision oncology. Huang said the partners will be working together to develop “the world’s first deep learning framework designed for exascale.”

“It’s going to be really hard,” he added. “That’s why we’re working with the four DOE labs and have all standardized on the same architecture – SaturnV is the biggest one of them but we’re all using exactly the same architecture and it’s all GPU accelerated and we’re going to develop a framework that allows us to scale to get to exascale.”

Huang noted that when you apply deep learning FLOPS math (16-bit floating point operations, as opposed to the HPC norm of 64-bit), exascale is not far away at all.

The [IBM/Nvidia] CORAL machines are on track for 2018 with 300 petaflops peak FP64, which comes out to 1,200 petaflops peak FP16, Huang pointed out. “For AI, FP16 is fine, now in some areas we need FP32, we need variable precision, but that’s the point,” he said. “I think CORAL is going to be the world’s fastest AI supercomputer [and] I think that we didn’t know it then but I believe that we are building an exascale machine already.”
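Huang’s figures imply a fourfold jump from FP64 to FP16, that is, peak throughput doubling each time the operand width is halved. The sketch below applies that ratio as a stated assumption; actual ratios depend on the GPU architecture, and the function name is hypothetical.

```python
FP64_BITS = 64


def scaled_peak_petaflops(fp64_peak_pf: float, bits: int) -> float:
    """Peak petaflops at a lower precision, ASSUMING throughput scales
    inversely with operand width (the 4X FP64-to-FP16 ratio implied by
    the figures Huang cited). Real hardware may differ."""
    if bits not in (16, 32, 64):
        raise ValueError("bits must be 16, 32 or 64")
    return fp64_peak_pf * (FP64_BITS / bits)


if __name__ == "__main__":
    coral_fp64_pf = 300.0  # peak FP64 figure quoted for the CORAL machines
    for bits in (64, 32, 16):
        pf = scaled_peak_petaflops(coral_fp64_pf, bits)
        print(f"FP{bits}: {pf:.0f} petaflops ({pf / 1000:.1f} exaflops)")
```

Under this assumption, 300 petaflops of FP64 becomes 1.2 exaflops of FP16, which is the basis for the claim that an exascale machine is already being built.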

It’s a fair point that dialing down the bits increases data throughput (boosting FLOPS), but as one analyst at the event said, “calling it exascale is changing the rules.”

Lending more insight into Nvidia’s plans was Solutions Architect Louis Capps, who presented at the Green500 BoF on November 16.

“This is completely a research platform,” he said of SaturnV. “We’re going to have academics using it. We’re going to have partnerships, collaborations, and internally, we’re working on our deep learning research and our HPC research.”

Embedded, robotics, automotive, and hyperscale computing are all major focus areas, but Capps and Huang both were most effusive about the opportunities at the convergence of data science and HPC. “We’re just now starting to bridge where real HPC work is converging with deep learning,” said Capps.

SaturnV is organized into five 3U boxes per rack, with 15 kilowatts of power on each rack and some 25 racks total. While the press photo of SaturnV indicates 10 servers per rack, this is not reflective of what’s inside. “We could not put that many in ours,” said Capps. “We put this in a datacenter which is not HPC. It was an IT datacenter originally.”

SaturnV was one of two systems on the newly published TOP500 list to employ the Pascal-based P100 GPUs. The second-greenest system, Piz Daint, uses the PCIe variants. Installed at the Swiss National Supercomputing Centre, Piz Daint delivers an energy-efficiency rating of 7.45 gigaflops/watt. Refreshed with the new P100 hardware, Piz Daint achieved 9.8 petaflops on the Linpack benchmark, securing it the eighth spot on the latest list.

Notably, every single one of the top ten systems on the Green500 list is using some flavor of acceleration or manycore. There is no pure-play traditional x86 in the bunch.

[Figure: Green500 top 10, November 2016. Source: Top500/Green500]

A compelling testament to this approach came from Thomas Schulthess, director of the Swiss National Supercomputing Centre, where Nvidia K80 GPUs have been used for operational weather forecasting for over a year now. “I know the HPC community has a problem with the heterogeneous approach,” he said. “We’ve done a lot of analysis on this issue. We asked, what would the goals we have at exascale look like if we build a homogeneous Xeon-based system, and there’s no way that you will run significant problems that are significantly bigger and faster than we do today in 5-6 years at exascale if you build it based on a Xeon system.

“The message to the application folks is, you’ve had time to think about it now, but now there is no more choice. If you want to run at exascale, it is going to be on Xeon Phi or GPU-accelerated or the lightweight core, almost Cell-like architectures that we see on TaihuLight.”
