Google Cloud to Offer Nvidia P4 Graphics Card for Inferencing Tasks

By George Leopold

July 25, 2018

Google continues to add GPU horsepower to its cloud platform alongside its internally developed deep learning processors, announcing this week that it will soon offer Nvidia’s Tesla P4 accelerators for AI inferencing workloads.

Among many partnership announcements during Google’s annual cloud event, the search giant said it would be the first cloud provider to offer the small form-factor GPUs in Europe and the U.S.

Google announced in May it would join cloud rivals Amazon Web Services and Microsoft Azure in offering Tesla V100 GPUs. The datacenter GPU is aimed at machine learning and high-performance computing workloads.

The partners said Wednesday (July 25) the addition of Tesla P4 responds to customer demand for GPU-accelerated inferencing platforms focused on real-time AI-based services.

Besides the smaller package, Nvidia claimed the P4 graphics card delivers as much as a 40-fold processing boost for running emerging AI models in applications such as speech understanding and live translation. Extremely low latency is a key requirement for these apps, added Paresh Kharya, Nvidia’s product marketing chief for accelerated computing.

Nvidia said the P4 handles inference tasks with 1.8 milliseconds of latency. “This unlocks a new wave of AI services previously impossible due to latency limitations,” the GPU specialist claimed.

Along with machine learning inference, the partners said the cloud-based graphics processor also handles tasks such as visualization and video transcoding, the process of converting files from one format to another so that, among other uses, they can be viewed on different devices.

The Tesla P4 inferencing engine is based on Nvidia’s Pascal architecture and is geared specifically to boost the performance of servers running deep learning workloads.

Nvidia first announced the Tesla P4 in September 2016 as the GPU vendor’s neural network inferencing card. The Pascal-based card includes 2,560 CUDA cores and delivers peak performance of 5.5 teraflops at single precision or 22 INT8 tera-operations per second.
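As a rough sanity check on those peak figures, the short sketch below works backward from the quoted numbers. It assumes each CUDA core retires one fused multiply-add (counted as two floating-point operations) per clock cycle, which is the usual convention for quoting peak GPU throughput but is not stated in the announcement. The function names are illustrative only.

```python
# Back-of-the-envelope check of the Tesla P4 peak figures quoted in the article.
CUDA_CORES = 2560            # from the article
FP32_TFLOPS_QUOTED = 5.5     # from the article
INT8_TOPS_QUOTED = 22.0      # from the article

# Assumption (not from the article): one fused multiply-add per core per
# cycle, counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CYCLE = 2


def implied_clock_ghz(tflops: float, cores: int) -> float:
    """Clock rate in GHz needed to reach `tflops` under the FMA assumption."""
    ops_per_second = tflops * 1e12
    return ops_per_second / (cores * FLOPS_PER_CORE_PER_CYCLE) / 1e9


def int8_to_fp32_ratio(int8_tops: float, fp32_tflops: float) -> float:
    """How many INT8 operations the card performs per FP32 operation at peak."""
    return int8_tops / fp32_tflops


if __name__ == "__main__":
    clock = implied_clock_ghz(FP32_TFLOPS_QUOTED, CUDA_CORES)
    ratio = int8_to_fp32_ratio(INT8_TOPS_QUOTED, FP32_TFLOPS_QUOTED)
    print(f"Implied clock: {clock:.2f} GHz")      # Implied clock: 1.07 GHz
    print(f"INT8 vs. FP32 peak: {ratio:.1f}x")    # INT8 vs. FP32 peak: 4.0x
```

Under that assumption, the 5.5-teraflops figure implies a clock of roughly 1.07 GHz, and the INT8 rating works out to four times the single-precision rate. These are peak numbers and say nothing about the throughput of a real inference workload.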

Google was vague about availability, saying only that Nvidia’s P4 GPUs would be “coming soon” to its public cloud.

The Nvidia partnership is the latest in a steady stream of cloud deals announced in recent weeks by Google. Earlier this week, SAP announced it was collaborating with the public cloud vendor to offer cloud virtual machines supporting in-memory workloads running on the SAP HANA database manager.

“Google Cloud’s mission is to organize [customers’] information, and supercharge it,” CEO Diane Greene said during this week’s company event in San Francisco.

Separately, Google released its third-generation cloud Tensor Processing Units, or TPUs, a move aimed squarely at AI developers using its AutoML service, which is designed to improve model training. “We know that many [developers] need more flexibility than our APIs were designed for,” said Fei-Fei Li, Google’s chief cloud scientist. “That’s why we developed AutoML.”
