Ahead of SC22 (next week!), Intel today announced a major rebrand of its forthcoming datacenter-focused products. In short: the fourth-generation Xeon CPU with high-bandwidth memory (codenamed “Sapphire Rapids with HBM”) is now known as the “Intel Xeon CPU Max Series.” Meanwhile, Intel’s first-generation datacenter GPU (codenamed “Ponte Vecchio”) is now the “Intel Data Center GPU Max Series.” Intel is presenting these “Max Series” products as a unified, horizontal product family designed to serve the needs of the HPC community.
As part of the rebrand – and as the release dates ostensibly draw near – Intel highlighted specs and early performance metrics for its Max Series. The Xeon CPU Max Series, the company said, delivered significant power savings relative to AMD’s current-generation Milan-X CPU, consuming 68% less power than a Milan-X-based cluster at the same level of HPCG performance; it also performed up to 4.8× faster than the Milan-X cluster on real-world workloads (with a floor of 1.2× on the shared benchmarks). The Max Series CPU comes in three memory configurations: HBM-only, HBM flat (two memory regions, HBM and DDR) and HBM caching (which uses HBM as cache for DDR). All of the configurations come with 64GB of HBM2e memory, offering around 1TB/s of memory bandwidth.
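For readers who want to translate the power claim into a performance-per-watt figure, the arithmetic can be sketched as follows. This is only an illustration of what "68% less power at the same performance" implies; the function name is hypothetical, and the result restates Intel's own claim rather than any independent measurement.

```python
def perf_per_watt_gain(power_reduction: float) -> float:
    """Implied performance-per-watt ratio when performance is held equal.

    power_reduction is the fraction of power saved (0.68 for "68% less power").
    Equal performance at (1 - reduction) of the power gives 1 / (1 - reduction).
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    return 1.0 / (1.0 - power_reduction)


if __name__ == "__main__":
    # Intel's stated figure: 68% less power than the Milan-X cluster on HPCG.
    gain = perf_per_watt_gain(0.68)
    print(f"Implied HPCG performance per watt: about {gain:.1f}x")
```

In other words, Intel's HPCG claim works out to roughly a 3.1× advantage in performance per watt on that benchmark, under Intel's own test conditions.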
The Data Center GPU Max Series, meanwhile, showed a ~1.5-2.4× lead over Nvidia’s current-generation A100 GPU in Intel’s testing across different workloads.
Intel also took the opportunity to outline the form factors for the Max Series GPUs. The GPUs will come in three flavors:
- The PCIe-based Max Series 1100 GPU, weighing in at a 300W TDP and equipped with 56 Xe cores and 48GB of HBM2e memory.
- The OAM-based Max Series 1350 GPU – 450W TDP, 112 Xe cores, 96GB HBM memory.
- Finally, the OAM-based Max Series 1550 GPU, with a whopping 600W TDP, 128 Xe cores and 128GB of HBM memory.
The OAM form factors are two-stack GPUs, in Intel parlance, while the PCIe form factor is a one-stack GPU. According to Intel, the OAM form factors deliver 52 peak teraflops of FP64 performance, while the single-stack PCIe form factor delivers half of that.
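To put the three form factors side by side, the stated specs can be tabulated and divided through, as in the rough sketch below. It assumes the single 52-teraflop figure applies to both OAM parts, as described here, and takes half of it (26 teraflops) for the PCIe part. The class and function names are hypothetical, and peak teraflops over TDP is a paper ratio, not a measured efficiency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaxGpu:
    name: str
    form_factor: str
    tdp_watts: int
    xe_cores: int
    hbm_gb: int
    peak_fp64_tflops: float  # assumption: 52 for both OAM parts, half for PCIe


# Figures as stated in the article; nothing here is independently measured.
GPUS = [
    MaxGpu("Max Series 1100", "PCIe", 300, 56, 48, 26.0),
    MaxGpu("Max Series 1350", "OAM", 450, 112, 96, 52.0),
    MaxGpu("Max Series 1550", "OAM", 600, 128, 128, 52.0),
]


def peak_gflops_per_watt(gpu: MaxGpu) -> float:
    """Peak FP64 gigaflops divided by TDP (a paper figure only)."""
    return gpu.peak_fp64_tflops * 1000.0 / gpu.tdp_watts


if __name__ == "__main__":
    for gpu in GPUS:
        print(
            f"{gpu.name} ({gpu.form_factor}): {gpu.tdp_watts}W, "
            f"{gpu.xe_cores} Xe cores, {gpu.hbm_gb}GB HBM, "
            f"{peak_gflops_per_watt(gpu):.1f} peak FP64 GFLOPS/W"
        )
```

Because the same peak figure is quoted for both OAM boards, the 1350 comes out ahead of the 1550 on this ratio; per-model peak numbers from Intel would likely change that picture.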
Intel also announced the “Intel Data Center GPU Max Series subsystem,” which carries four of the OAM boards – either the 1350 or the 1550 – and leverages Intel Xe Link. Intel highlighted partners for its Max Series CPU including Atos, Dell, Fujitsu, Gigabyte, HPE, Hyve, Inspur, Lenovo, QCT and Supermicro; for its Max Series GPU, the company said it was looking forward to more than 15 system designs from companies including Atos, Dell, HPE, Inspur, Lenovo and Supermicro.
Availability for the Max Series products has been a bit of a moving target, with both the Max Series CPU and the Max Series GPU encountering delays. In a press pre-briefing, Intel said that it will hold a launch event for the Max Series CPU in “very early January” 2023 (which should be January 10th, though OEMs might operate on a slightly different timeline), while the Max Series GPU is expected to become available to more customers in the “early Q2 timeframe.”
Ahead of that, of course: the flagship installation of the Max Series products, Argonne National Laboratory’s exascale Aurora supercomputer. “On the Max Series GPU line, our main focus right now is still to deliver out the full Aurora system,” said Jeff McVeigh, corporate vice president and general manager for Intel’s Super Compute Group, in the prebriefing. “And that’s where we’re putting all of our supply and energy right now.” (“Just as a little spoiler alert, we will not have a Top500 run prior to SC22,” he added. “We’re eager to do that in 2023.”) McVeigh also took a moment to highlight the bandwidth across Aurora’s 200 DAOS storage servers, saying that those machines (only a part of Aurora’s final storage server configuration) deliver six tebibytes a second of bandwidth, about double the prior best IO500 run.
Intel also highlighted early users for its Max Series products across other national labs – Lawrence Livermore (where they will power CTS-2 systems), Los Alamos (the Crossroads system) and Sandia (more CTS-2 systems) – as well as Kyoto University (the Camphor3 system).
Looking further into the future, Intel suggested that future products will use the same nomenclature, including its forthcoming Rialto Bridge GPU (which, as of now, must still be distinguished by its codename). Intel did not confirm whether it expects to have “Max Series” variants for its fifth- and sixth-generation (Emerald Rapids and Granite Rapids) Xeon CPUs. McVeigh also highlighted the importance of the Falcon Shores XPU, a combined CPU-GPU chip akin to Nvidia’s Grace Hopper Superchip. Portraying its chip journey as a mountain, McVeigh said: “We feel like we’ve addressed many of the key obstacles to get to the top, but not maybe that key headwall at the very pinnacle.” The XPU, he said, was that peak.
“Organizing both CPU and GPU under the same Max branding is consistent with the XPU strategy Intel has laid out over the past few years, integrating all forms of computing into a single, cohesive strategy,” commented Addison Snell, CEO of Intersect360 Research. “It’s also consistent with the trend we’ve been tracking toward vendor specificity. Soon HPC users will be choosing between all-Intel, all-AMD, or all-Nvidia environments.”