Intel’s Xeon General Manager Talks about Server Chips 

By Agam Shah

January 2, 2024

Intel is talking data-center growth and is done digging graves for its dead enterprise products, including the GPU, storage, and networking lines that fell to downsizing and cost cuts.

The chip maker’s 5th Gen Xeon, called Emerald Rapids, is the most advanced server chip released by the company, and it came on time.  

Xeon chips are money spinners, and Intel will bank on Emerald Rapids. Demand for the chip is expected to be healthy and will dovetail with the success of the 4th Gen server chip, Sapphire Rapids, which has already shipped in the millions.

For CEO Pat Gelsinger, the higher the core count in a Xeon, the healthier the margins. Emerald Rapids has a maximum of 64 cores, up from 60 in the 4th Gen chips.

But the 5th Gen Xeon is in a precarious position. It is the last of the legacy Xeon designs and a stepping stone to Granite Rapids, Intel's next-generation server chip based on a new architecture and manufacturing process.

The path of least resistance would be to buy Emerald Rapids. But customers will not have to wait long for Granite Rapids, which is due shortly after the second quarter of next year.

So, what do customers do? HPCwire sat down with Lisa Spelman, corporate vice president and general manager for Xeon products at Intel, to get some of those questions answered. The chat extended to include the future of the HPC-focused Xeon Max CPU, the controversial On Demand chip features, and developer support. 

HPCwire: Granite Rapids is a new ground-up server design and isn’t far off from the current fifth-gen Emerald Rapids. How should customers think of each? 

Spelman: Our goal is to continue providing just ever-increasing levels of performance and performance-per-watt for various workloads. We’re telling our customers, “Take your first-gen [Xeon], second-gen [Xeon] stuff, move it to 5th Gen, and you’re going to fundamentally lower your operating expenses because you have taken down the number of racks to fulfill your standard workloads.” 

Then, you either absorb the Opex savings or take that power and space you just saved and reinvest it in your AI growth workloads, so there is no bad outcome here. 

When we think about Granite, the next generation that we’re delivering is going to be focused on high-performance computing. I don’t mean just HPC, but AI, the most demanding workloads with advancements in memory bandwidth and capacity. There’s room for both – we don’t see them as actually competing against each other but complementing each other. That’s a lot of the conversation that we’re having with customers right now. 
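The consolidation argument Spelman makes, fewer racks for the same work, then reinvest the savings, can be sketched with back-of-the-envelope math. The numbers below are hypothetical assumptions for illustration, not Intel figures:

```python
# Illustrative rack-consolidation math (all inputs are hypothetical
# assumptions, not Intel-published figures).

def consolidation_savings(old_servers, perf_gain, old_watts, new_watts,
                          usd_per_kwh=0.10, hours_per_year=8760):
    """Estimate servers needed and annual power savings when newer chips
    deliver `perf_gain`x the per-server throughput."""
    new_servers = -(-old_servers // perf_gain)  # ceiling division
    old_energy_kwh = old_servers * old_watts * hours_per_year / 1000
    new_energy_kwh = new_servers * new_watts * hours_per_year / 1000
    return new_servers, (old_energy_kwh - new_energy_kwh) * usd_per_kwh

servers, savings = consolidation_savings(
    old_servers=100, perf_gain=2,   # assume 2x per-server throughput
    old_watts=500, new_watts=600)   # assume higher per-node power draw
print(servers, round(savings))
```

Under these assumed inputs, 100 older servers consolidate to 50, and the freed power and space is what Spelman suggests redirecting to AI workloads.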

Intel 2024 Xeon Road Map (Courtesy Intel)

HPCwire: Is the message – if you need more performance, go to Granite Rapids, and if you need more power efficiency and cost-savings, go to Emerald Rapids? 

Spelman: For 5th Gen Xeon, in certain workloads, the improvement in the cache made a difference, and the increase in the AMX frequencies also made a difference.

There’s a 21% generalized performance improvement across a broad variety of workloads, higher on the per-watt improvements, and then in certain workloads, even more. It’s the combination of all those things. 

When you look at Granite, there will be a much more significant leap in core counts to make sure that memory not only keeps up but also gives more headroom. That’s why I say Granite will be focused on higher-performance computing requirements for applications ready for that. Everything you see for Emerald, you will see in a bigger way on Granite. 

HPCwire: Pat Gelsinger is aggressive in pushing up core counts, which boosts company revenue. And then suddenly, we see Sierra Forest going up from 144 to 288 cores. Is there a lower-cost version of Sierra Forest coming with fewer cores? 

Spelman: We always have the top core count, and then we SKU it out. We will have options that hit lower core counts, but 144 cores is still a lot of cores.

When I say lower core counts, I don’t mean eight cores; the existing 5th-Gen Xeon will fill some of that space. Eventually, we’ll have some Granite Rapids that hit those lower amounts. 

Sierra Forest is that same socket – it will just have an abundance of [many more] cores – smaller E-cores, more power efficiency. We’ve got customers running it, digging into how it will help them improve the TCO of their service delivery. We’re pretty excited about that one.

HPCwire: The AI PC projects are an opportunity to move AI processing off your server chips to PCs. Is there any relation between the client and server CPUs on offload or software reuse? 

Spelman: In the last couple of years, we have improved how we’ve set up our organizational structures. Some of it is our move to “prodco” and “fabco” models. Our manufacturing team is growing into its own business, and then you have the product team. We’ve gotten much closer together. My team and the connection with the client team at the architecture level has never been stronger. 

Out of this are born use cases that show continuity across our product lines. We intend to demo work being done on AI PC, hitting breakpoints and kicking to Xeon, and needing Granite at the next breakpoint. It’s nice for Intel to say we’re differentiated, but what we’re pursuing more is the customer value of that seamless flow without needing user intervention. 
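The breakpoint flow Spelman describes, where work starts on the AI PC, hits a limit, and kicks up to Xeon and then Granite, can be sketched as a capability-tiered dispatcher. This is a hypothetical illustration; the tier names, thresholds, and API are invented, not an Intel interface:

```python
# Hypothetical sketch of breakpoint-based tiering: route a request to the
# smallest tier whose capacity covers it. Tier names and thresholds are
# invustrative inventions, not published Intel breakpoints.

TIERS = [
    ("ai_pc", 7e9),                    # assume: client NPU up to ~7B params
    ("xeon", 70e9),                    # assume: server CPU tier up to ~70B
    ("granite_rapids", float("inf")),  # assume: next-gen tier beyond that
]

def route(model_params: float) -> str:
    """Pick the first tier whose limit covers the model size."""
    for name, limit in TIERS:
        if model_params <= limit:
            return name
    raise ValueError("no tier can serve this model")

print(route(3e9))    # small model stays on the AI PC
print(route(30e9))   # past the client breakpoint, kicks to Xeon
```

The "seamless flow without user intervention" Spelman mentions would amount to running such a policy inside the runtime rather than asking the user to choose hardware.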

HPCwire: Emerald Rapids chips are only in two-socket servers. Is there a demand for, say, eight-socket systems? 

Spelman: A bunch of the cloud service providers — not just on-prem OEMs — have eight-socket instances available. SAP HANA is a very high-value workload. Those mission-critical type applications have longer test and dev-type times before they move. Working with our customers, we collectively didn’t feel like we needed it. 

Also, it’s a fair amount of validation work. So, with the ecosystem, we made the decision that we would deliver it on 4th Gen Xeon (Sapphire Rapids) and skip it on 5th Gen in order to free up resources to make sure everyone is ready for Granite Rapids and Sierra Forest. That felt like not only the right call but also a better match with customer deployment timelines.

HPCwire: There is no version of Xeon Max with Emerald Rapids. What is the future of Max? 

Spelman: I think about it in terms of workloads and use cases versus that exact implementation of the HBM. 

We are still using Max, which actually has lots of good performance and value, especially for high-performance computing use cases. 

But when I look to our next-generation products, we are absolutely doing work to deliver continuous improvement in that same space, but not necessarily through the same integration of the HBM. 

When you look at Granite Rapids and MR-DIMM (Multi-Ranked buffered DIMMs) and things like that, that’s a way we are offering a tremendous amount of memory bandwidth and memory capacity capability in a more consumable form factor. 

You look at the ability to put that in a standard DIMM slot in your same TDP envelope, and a customer can literally choose standard DDR or MR-DIMM depending on the type of workload. I would say, in summary, [there are] very important workload and performance targets that the team and I continue to pursue and deliver on, but they won’t always be through the same hardware mechanism of integrated HBM.

HPCwire: Emerald Rapids supports the On Demand capability, which has been controversial as a feature-rental service with remote on/off control of features on Xeon chips. What is it?

Spelman: I’m super excited about our On Demand roadmap, and we’ll have more to talk about through 2024. In 4th and 5th Gen, it is a little bit of training wheels for us. We’re trying out different things like feature activation, additional instances of QAT (QuickAssist Technology), or increasing the size of your SGX (Software Guard Extensions) enclave. It’s been good working with the ecosystem and getting feedback on how we build this capability into our customers’ systems, infrastructure, and all of that.

As we look a little further into the roadmap, I can see over the next couple of years people putting in capability at the edge, almost a little bit like the four-to-one network consolidation of years ago. That’s not something they want to touch for seven to 10 years, but they’re going to land more AI, more workloads, and they’re going to want some of the new things coming in the next generation that we’re going to offer through On Demand. We’ve made some adjustments and tweaks; there is a significant market opportunity there.
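Features like SGX surface to software as CPU capability flags, which on Linux appear in `/proc/cpuinfo` (QAT, by contrast, is enumerated as a PCIe device, so it would not show up there). A minimal sketch of checking such flags, with a sample `/proc/cpuinfo`-style string embedded so the example is self-contained:

```python
# Sketch: checking CPU feature flags as Linux exposes them in
# /proc/cpuinfo. The sample text is embedded so the example runs
# anywhere; on a real system you would read the file instead.
# The model name and flag set shown are illustrative, not a real SKU.

SAMPLE_CPUINFO = """\
processor : 0
model name : Intel(R) Xeon(R) (example)
flags : fpu sse2 avx512f amx_bf16 amx_tile amx_int8 sgx
"""

def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the flag tokens from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags(SAMPLE_CPUINFO)
print("sgx" in flags, "amx_tile" in flags)
```

An On Demand-style activation would, in effect, change which of these capabilities a deployed chip reports and licenses.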

HPCwire: What do customers have to pay for in On Demand, and how will it work? Will it be available to cloud providers? 

Spelman: Our thought here is that we offer this capability to our customers, our systems providers, and then allow them the opportunity to build a business on top of it. That’s why it’s taken a while to get set up. If it was Intel going directly to an end customer, we could do that faster, but we don’t get the same ecosystem support.

The OEMs, whether they’re serving an on-prem data center or an edge data center, will set up and are setting up their services around this capability. [For example] they’ll have a security practice that will offer this. We’ve had a few proofs of concept on different cores, frequency levels, things like that.

HPCwire: You mentioned software and solutions work – Nvidia has AI software packages for verticals that help adoption. Will you package things for certain verticals similarly? 

Spelman: That is definitely an intention – first, you have to get the horizontal foundation set, then build verticals on top. This fall, we announced the Private AI solution with VMware – they have an Nvidia solution, and now they have a Xeon solution. VMware is very motivated to be part of this, and it’s a great partnership because they have tons of scale.

We have tons of installed base footprints, and we both want to solve this problem for our customers. That’s all about packaging up software so customers don’t spend days pulling from GitHub – they can just grab through existing VMware solutions. We’re working on that with them, and then we’ll work to build verticals on top. We’ll do the same with open-source Red Hat OpenShift. 

HPCwire: Are you incentivizing your developer ecosystem with revenue sharing to build your vertical presence? 

Spelman: For me, even beyond a revenue-sharing model, one of the lessons Intel has learned and is working on is to make the developer work fun. It is not fun to hunt around GitHub trying to find every bit. We find that if we make it easy to get you started, developers thrive on that type of “give me an hour, and I crushed it.” 

A lot of the generative AI potential is … keeping the brains focused on the cool work. We’re trying to just get that foundation easy, and then we’ll figure out if there are some more advanced things, like you said, like the revenue sharing or whatever that we need to do. 

We have very big companies with tons of software developers — they run a tremendous amount of inference on Xeon. I would like it to get to a point where the developer doesn’t have to spend that much time thinking about hardware. Ideally, they can use PyTorch and have some level of orchestration that just recognizes the infrastructure you have, so they don’t have to stress about it.

 
