Intel Customizing Granite Rapids Server Chips for Nvidia GPUs

By Agam Shah

September 25, 2024

Intel is now customizing its latest Xeon 6 server chips for use with Nvidia’s GPUs that dominate the AI landscape. The chipmaker’s new Xeon 6 chips, also called Granite Rapids, have been customized and validated specifically for server boxes with Nvidia’s latest and upcoming GPUs.

“Nvidia is the leader on the GPU side…so we’re partnering closely with them to make sure that people deploying MGX or HGX-based systems, we have a full suite of CPUs that have been qualified together with Nvidia for those systems,” said Ronak Singhal, senior fellow at Intel.

Amid financial struggles, Intel is repositioning its business around x86 CPUs. One way to sell more Granite Rapids chips is to follow the coattails of Nvidia’s red-hot GPUs.

“This is really just the beginning of some of the collaboration we’re doing with Nvidia over the course of the next year. You’ll see more from us as we talk about ways that we’ve optimized some of these SKUs very specifically for this use case,” Singhal said.

It’s a striking reversal of roles. There was a time when Intel’s server chips ruled the roost and Nvidia’s GPUs rode on the coattails of CPU sales. Now Intel is playing second fiddle to Nvidia.

Intel Xeon 6 is the new name for Granite Rapids (P-cores) and Sierra Forest (E-cores) processors

Role of Xeon 6 with GPUs

Intel’s beefier Xeon 6 6900P chips, announced this week, have up to 128 cores, double that of its previous generation Emerald Rapids and Sapphire Rapids chips.

The Granite Rapids chips are based on chiplets, which allows Intel to mix and match computing capabilities based on customer requirements. The CPU is important for preprocessing and “making sure that you’re not bottlenecking the GPU,” Singhal said.

With its survival at stake, Intel has shown a lot more flexibility in customizing server chips. It recently announced it would customize Xeon 6 chips for Amazon Web Services and hinted at customizing chips for Google Cloud.

A majority of CPUs in data centers are Xeons, giving Intel an incumbency advantage. Most enterprise workloads are also built for the x86 architecture.

Intel offers an alternative to Nvidia’s proprietary server system, which includes its Grace CPUs, GPUs, and networking infrastructure.

Granite Rapids can be a host CPU for customers who want to build their own Nvidia infrastructure, selecting their own multi-way boxes with Nvidia GPUs, memory, and I/O.

Granite Rapids – Not the Server Savior

Until earlier this year, Intel projected that Granite Rapids would give it momentum in servers, but that’s not the case.

The stronger server cycle didn’t materialize, and the rush to fund AI build-outs is depressing the conventional server market.

“Where we still haven’t completely gotten the business to a good place is on the data center side of CPU,” said Intel chief financial officer Dave Zinsner this month during an interview at Citi Global’s analyst day.

Intel’s position on CPUs is stronger than on GPUs – and the successor to Granite Rapids, called Diamond Rapids, could put Intel on top, Zinsner said.

“I think Granite [Rapids] is a meaningful step forward for us in terms of making us competitive. Diamond Rapids will definitely put us in a good place competitively. It’s just important we work our way through the roadmap to get us to where we want to be,” Zinsner said.

The Guts of Granite Rapids

The Granite Rapids chip is built from two main types of chiplets: compute tiles made using the Intel 3 process and I/O tiles made using the Intel 7 process.

The top-line 6900P with 128 cores has 12 memory channels. It supports DDR5 memory as well as a new module type called MR-DIMM (multiplexed-rank DIMM), which provides up to 2.3 times more memory bandwidth than the 5th Gen Xeon chips.

AI typically requires high memory capacity and bandwidth, which MR-DIMM addresses.
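The 2.3x figure is consistent with simple peak-bandwidth arithmetic. A sketch, assuming 8 channels of DDR5-5600 on a 5th Gen Xeon versus 12 channels of MRDIMM-8800 here (the module speeds and channel counts are assumptions drawn from published platform specs, not stated in this article):

```python
def peak_bw_gbs(channels, mts, bus_bytes=8):
    """Peak memory bandwidth in GB/s: channels x transfers/s x 64-bit bus width."""
    return channels * mts * bus_bytes / 1000

emr = peak_bw_gbs(8, 5600)    # 5th Gen Xeon: 8 channels of DDR5-5600
gnr = peak_bw_gbs(12, 8800)   # Xeon 6 6900P: 12 channels of MRDIMM-8800
print(f"{emr:.0f} GB/s -> {gnr:.0f} GB/s ({gnr / emr:.2f}x)")
# -> 358 GB/s -> 845 GB/s (2.36x)
```

The ratio lands right around the 2.3x Intel quotes, which suggests the claim is mostly the product of faster modules and the jump from 8 to 12 channels.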

Intel Granite Rapids (Source: Intel)

Intel claims Granite Rapids offers two times more cores per socket, a 1.2 times performance improvement per core, and 1.6 times better performance per watt. The L3 cache is as large as 504MB.

Granite Rapids can run some AI workloads on its own before offloading work to Nvidia GPUs. For example, its AMX matrix engines now support the FP16 data type.
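FP16 support matters on the CPU side mainly because halving the bytes per weight halves the memory footprint and bandwidth needed for inference versus FP32. A back-of-the-envelope sketch, using a hypothetical 7-billion-parameter model:

```python
# Rough illustration: bytes per parameter drive a model's memory footprint.
params = 7_000_000_000          # hypothetical 7B-parameter model
fp32_gb = params * 4 / 2**30    # 4 bytes per FP32 weight
fp16_gb = params * 2 / 2**30    # 2 bytes per FP16 weight
print(f"FP32: {fp32_gb:.1f} GB, FP16: {fp16_gb:.1f} GB")
# -> FP32: 26.1 GB, FP16: 13.0 GB
```

Since memory bandwidth is usually the bottleneck for CPU inference, the same halving roughly doubles achievable throughput for bandwidth-bound layers.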

The chip has six UPI 2.0 links, each running at 24 GT/s. It also supports AVX-512 vector instructions.

The Granite Rapids chips are built around the P-core, Intel’s performance core. That distinguishes them from Sierra Forest, another Xeon 6 chip launched earlier this year, which is built around the more efficient E-core, whose cores consume less power but aren’t as fast.

CXL 2.0 — More Memory, But Slow

Granite Rapids also supports CXL 2.0, which provides access to larger memory pools, be it DDR4, DDR5, or HBM.

But there are caveats: access to the memory pool is slower, which is a major disadvantage. Latency-sensitive AI work may therefore stick to on-package HBM or local DRAM. Still, despite the higher latency, CXL provides access to a larger memory pool across a wider cluster, which isn’t always a drawback.

For example, workloads that are not urgent could be offloaded to DDR4. Intel calls the technology flat-memory mode.

“CXL gives me flexibility as to what memory type I might be using depending on the device that I’m using,” Singhal said.
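In practice, flat-memory mode tiering is handled by hardware and firmware rather than application code, but the placement idea Singhal describes can be sketched as a toy policy: hot allocations go to fast local DDR5 first, cold ones to the larger, slower CXL-attached pool. Everything here (tier names, capacities) is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Tier:
    """A memory tier with a fixed capacity, tracked in GB."""
    name: str
    capacity_gb: int
    used_gb: int = 0

    def fits(self, size_gb):
        return self.used_gb + size_gb <= self.capacity_gb

def place(size_gb, hot, local, cxl):
    """Prefer fast local memory for hot data, the CXL pool for cold data."""
    tiers = [local, cxl] if hot else [cxl, local]
    for tier in tiers:
        if tier.fits(size_gb):
            tier.used_gb += size_gb
            return tier.name
    raise MemoryError("no tier has room")

local = Tier("DDR5", 64)        # fast, small local tier
cxl = Tier("CXL DDR4", 256)     # slower, larger CXL-attached tier
print(place(16, hot=True, local=local, cxl=cxl))    # hot data -> DDR5
print(place(100, hot=False, local=local, cxl=cxl))  # cold data -> CXL DDR4
```

The fallback ordering is the point: either class of allocation can spill into the other tier when its preferred tier is full, which is what lets the pooled capacity lower overall memory spend.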

The future of CXL 2.0 has been up in the air, lacking Nvidia’s support. But Singhal said he is now seeing large-scale deployments, which is a good sign for the technology’s adoption.

“Customers are able to get a lower spend on their memory side by taking advantage of our flat memory mode. We’re pretty excited about this,” Singhal said.

Granite Rapids Basics

The five new Xeon 6 6900P chips for two-socket systems have between 96 and 128 cores and draw up to 500 watts of power.

Frequencies range from a 2.0GHz base up to 3.9GHz in turbo mode. Intel didn’t share pricing. Versions of the chip with fewer cores will come out in the first quarter of next year. A quick look at Granite Rapids HPC benchmarks is available in a recent HPCwire article: Granite Rapids HPC Benchmarks: I’m Thinking Intel is Back

Intel’s New AI Chip for GPU Poors

Intel is also pitching its own Gaudi 3 AI accelerator, announced this week. Intel is targeting Gaudi 3 at AI inferencing and isn’t going after the AI training market, which it has conceded to Nvidia, AMD, and Google, which has its TPUs.

Typically, ASICs can take on only workloads tuned to the chip, unlike GPUs. But Gaudi 3 has default features and instructions that allow it to run many AI workloads, though performance may not be optimal.

The AI chip has 64 tensor processor cores and eight matrix multiplication engines, and was made using a 5-nanometer process. Gaudi 3 trails GPUs on memory: it has 128GB of HBM2e, while Nvidia and AMD GPUs have graduated to HBM3E.

Gaudi 3 is available through IBM Cloud and will be used in servers from companies that include Dell and Supermicro.
