Intel’s 7nm Slip Raises Questions About Ponte Vecchio GPU, Aurora Supercomputer

By Tiffany Trader

July 30, 2020

During its second-quarter earnings call on July 23, 2020, Intel announced a one-year delay of its 7nm process technology, which it says will shift its CPU product timing by approximately six months relative to prior expectations. The primary issue is a defect mode in the 7nm process that resulted in yield degradation, said Intel CEO Bob Swan during the call.

“We’ve root-caused the issue, and believe there are no fundamental roadblocks,” said Swan. “But we’ve also invested in contingency plans to hedge against further schedule uncertainty. We’ve mitigated the impact of the process delay on our product schedule by leveraging improvements in design methodology, such as die disaggregation and advanced packaging.”

The delay in 7nm potentially puts a kink in Intel’s plans to stand up the Aurora supercomputer at Argonne National Laboratory on schedule by the end of 2021. Intel, the prime contractor, is building the machine with HPE/Cray, using Intel’s 10nm “Sapphire Rapids” Xeon CPU and its 7nm “Xe” datacenter GPU, codenamed “Ponte Vecchio.”

As the centerpiece of Aurora, the frontrunner of the U.S. exascale program, Ponte Vecchio would herald Intel’s entry into the datacenter GPU market. The GPU was also intended to be Intel’s first 7nm product.

Ponte Vecchio’s release is now slated for “late 2021 or early 2022” and Intel is working with outside fabs on at least some elements of the GPU.

Bob Swan on the Q2 earnings call:

“We will continue to invest in our future process technology roadmap, but we will be pragmatic and objective in deploying the process technology that delivers the most predictability and performance for our customers, whether that be in our process, external foundry process or a combination of both. Our advanced packaging technologies combined with our disaggregated architecture give us tremendous flexibility to use the process technology that best serves our customers. As an example, our datacenter GPU design, Ponte Vecchio, will now be released in late 2021 or early 2022, utilizing external and internal process technologies combined with our world-leading packaging technologies.”

Swan stated the “first Intel-based 7nm product” would be a client CPU in late 2022 or early 2023.

I suppose “first Intel-based 7nm product” can be parsed creatively, if not naturally (with “Intel-based” modifying the process, not the product), but it looks as though Ponte Vecchio at launch, or at least its GPU die, will not be built on 7nm, and certainly not on Intel’s node, which won’t be ready until a year later. There is speculation that Intel could shift production to TSMC (or Samsung) as part of its “contingency plans.”

Aurora node design, as presented at SC19 by Intel’s Raja Koduri

Given the Department of Energy’s fixed performance and power targets for Aurora and the timeframe of late 2021 or early 2022, TSMC’s 5nm node is a likely candidate.

In the quote above, Swan references Ponte Vecchio “utilizing external and internal process technologies.” The I/O die and GPU die (and the memory stack) can be implemented on different nodes. AMD does this with its Epyc CPUs; Rome, for example, employs 7nm CPU cores and a 14nm I/O die.

From Swan again:

“Originally the architecture of Ponte Vecchio includes an I/O based die, connectivity, a GPU and some memory tiles, all kind of packaged together. That’s kind of the design of Ponte Vecchio. From the beginning, we would do some of those tiles inside and some of those tiles outside, and again leverage the packaging technology as a proof point of how do we mix and match different designs into one package. So, that was the design from the beginning… that design disaggregation gives us lots of flexibility.

“As we go forward now, we can think about whether we introduce Ponte Vecchio with… I think, I said some of those tiles are inside and outside from the beginning. Now, as we go forward, we can assess whether we swap out one of our tiles for a third-party foundry or not. Again, that’s the beauty and value of this change and design methodology that gives us much more optionality and flexibility. So, in the event there’s a process slip, we can buy something rather than make it all ourselves.”

Swan is putting a positive spin on the GPU’s disaggregated design, but swapping out the GPU compute die, as Intel seems likely to need to do, is not a minor change. The process node directly affects the performance and energy targets of the Ponte Vecchio GPU and, by extension, the Aurora system, in which the GPU will deliver most of the performance and drive a good portion of the power demand.

Intel reported that 10nm Sapphire Rapids is on track to begin shipping in the second half of 2021, and that its “Intel-based 7nm datacenter CPU” is on the roadmap for the first half of 2023.

Aurora is central to the United States’ exascale plans. Its current implementation, known as Aurora21, has been positioned as the first U.S. exascale machine, although Intel has not committed publicly to a one-exaflops Linpack target.

Aurora is not the only exascale machine in development in the U.S. Oak Ridge National Lab (with HPE and AMD) is aiming to stand up the 1.5 exaflops (minimum peak) Frontier system on a similar timeframe (late 2021), and with uncertainties around Intel’s datacenter GPU execution, the odds just increased that Oak Ridge will take the lead in the United States’ exascale rollout. Lawrence Livermore National Lab (also with HPE and AMD) is looking to deploy El Capitan, slated to deliver 2 exaflops peak, one year later in late 2022. All three systems feature HPE’s Cray Shasta architecture.

Aurora was originally conceived in 2015 as a pre-exascale supercomputer. The DOE CORAL contract called for a 180 petaflops (peak) machine composed of Intel Xeon Phi Knights Hill processors and second-generation Omni-Path fabric technology to be stood up at Argonne in 2018. Those plans were scrapped as Intel pulled back on, and eventually canceled, Phi and Omni-Path development, and the contract was redefined and expanded.

The announcement of the 7nm delay contrasted with Intel’s strong second-quarter financials. The company’s second-quarter revenue of $19.7 billion was up 20 percent year-over-year. Data-centric revenue grew 34 percent, accounting for 52 percent of total revenue. Profit rose 22 percent to $5.11 billion, as reported in the Wall Street Journal, but Intel’s shares plunged 18 percent on news of the delay and lowered Q3 guidance, and at the time of this writing have not recovered.
