Google Claims Its TPU v4 Outperforms Nvidia A100

By Jaime Hampton

April 6, 2023

A new scientific paper from Google details the performance of its Cloud TPU v4 supercomputing platform, claiming it provides exascale performance for machine learning with boosted efficiency.

The paper’s authors claim the TPU v4 is 1.2x–1.7x faster and uses 1.3x–1.9x less power than the Nvidia A100 in similarly sized systems. The paper notes that Google has not compared TPU v4 to the newer Nvidia H100 GPUs because of their limited availability and their 4nm process technology (vs. TPU v4’s 7nm process).

As machine learning models have grown larger and more complex, so have their compute resource needs. Google’s Tensor Processing Units (TPUs) are specialized hardware accelerators used for building machine learning models, specifically deep neural networks. They are optimized for tensor operations and can significantly boost efficiency in the training and inference of large-scale ML models. Google says their performance, scalability, and availability make TPU supercomputers the workhorses of its large language models like LaMDA, MUM, and PaLM.

The TPU v4 supercomputer contains 4,096 chips interconnected via proprietary optical circuit switches (OCS), which Google claims are faster, cheaper, and less power-hungry than InfiniBand, another popular interconnect technology. Google says OCS accounts for less than 5% of the TPU v4 system’s cost and power, and that it dynamically reconfigures the supercomputer’s interconnect topology to improve scale, availability, utilization, modularity, deployment, security, power, and performance.

Google engineers and paper authors Norm Jouppi and David Patterson explained in a blog post that, thanks to key innovations in interconnect technologies and domain-specific accelerators (DSAs), Google Cloud TPU v4 enabled a nearly 10x leap in scaling ML system performance over TPU v3. It also improved energy efficiency by approximately 2x–3x compared to contemporary ML DSAs and reduced CO2e emissions by approximately 20x compared to DSAs in what the company calls typical on-prem datacenters.

The TPU v4 system has been operational at Google since 2020. The TPU v4 chip was unveiled at the company’s 2021 I/O developer conference. Google says the supercomputers are actively used by leading AI teams for ML research and production across language models, recommender systems, and other generative AI.

Regarding recommender systems, Google says its TPU supercomputers are also the first with hardware support for embeddings, a key component of Deep Learning Recommendation Models (DLRMs) used in advertising, search ranking, YouTube, and Google Play. This is because each TPU v4 is equipped with SparseCores, which are dataflow processors that accelerate models that rely on embeddings by 5x–7x but use only 5% of die area and power.
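To make the role of embeddings concrete, here is a minimal sketch of the lookup-and-pool step found in recommendation models: sparse category IDs index rows of a learned table, and those rows are combined into one dense vector. The table values, the names `EMBEDDING_TABLE` and `pooled_embedding`, and the choice of sum pooling are illustrative assumptions. They are not taken from Google's paper and do not describe how SparseCores work internally.

```python
# Hypothetical toy embedding table: 5 category IDs, each mapped to a
# 3-dimensional vector. Production tables can hold many millions of rows.
EMBEDDING_TABLE = [
    [0.1, 0.2, 0.3],
    [0.0, 0.5, 0.5],
    [0.9, 0.1, 0.0],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.3],
]


def pooled_embedding(ids, table):
    """Gather the rows selected by `ids` and sum them element-wise."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for i in ids:
        row = table[i]  # data-dependent lookup into the table
        for d in range(dim):
            pooled[d] += row[d]
    return pooled


# A hypothetical user who interacted with items 0, 2, and 3.
user_vector = pooled_embedding([0, 2, 3], EMBEDDING_TABLE)
print(user_vector)
```

The work here is dominated by scattered table lookups rather than dense matrix math, which is one plausible reason such models can benefit from dedicated hardware.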

One-eighth of a TPU v4 pod from Google’s ML cluster located in Oklahoma, which the company claims runs on ~90% carbon-free energy. (Source: Google)

Midjourney, a text-to-image AI startup, recently selected TPU v4 to train the fourth version of its image-generating model. “We’re proud to work with Google Cloud to deliver a seamless experience for our creative community powered by Google’s globally scalable infrastructure,” said David Holz, founder and CEO of Midjourney, in a Google blog post. “From training the fourth version of our algorithm on the latest v4 TPUs with JAX, to running inference on GPUs, we have been impressed by the speed at which TPU v4 allows our users to bring their vibrant ideas to life.”

TPU v4 supercomputers are available to AI researchers and developers at Google Cloud’s ML cluster in Oklahoma, which opened last year. At nine exaflops of peak aggregate performance, Google believes the cluster is the largest publicly available ML hub that operates with 90% carbon-free energy.
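As a rough sanity check on the nine-exaflop figure, the sketch below relates it to the 4,096-chip pod size mentioned earlier. The per-chip peak of about 275 teraflops is an assumption drawn from Google's published TPU v4 specifications, not from this article, and the variable names are illustrative.

```python
# Figures stated in the article.
CHIPS_PER_POD = 4096
CLUSTER_PEAK_FLOPS = 9e18  # "nine exaflops" of peak aggregate performance

# Assumption (not stated in the article): ~275 teraflops peak per TPU v4 chip.
ASSUMED_PEAK_FLOPS_PER_CHIP = 275e12

pod_peak_flops = CHIPS_PER_POD * ASSUMED_PEAK_FLOPS_PER_CHIP
implied_pods = CLUSTER_PEAK_FLOPS / pod_peak_flops

print(f"Peak per pod: {pod_peak_flops / 1e18:.2f} exaflops")
print(f"Implied pods in cluster: {implied_pods:.1f}")
```

Under that assumption, a single pod peaks at a little over one exaflop, and the cluster total corresponds to roughly eight pods. These are peak numbers, so sustained performance on real workloads would be lower.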

HPCwire