Until the launch of Pawsey’s Setonix system last year, NCI’s Gadi system – launched in 2020 – was Australia’s most powerful publicly ranked supercomputer. Now, the system has received a major boost powered by Intel Xeon “Sapphire Rapids” CPUs. The upgrade marks the first phase of a major investment in expanding NCI’s computing capacity.
Gadi’s various segments already contain a smörgåsbord of Intel chips. Previously, the system had: 3,074 nodes with dual (now prior-generation) Intel “Cascade Lake” CPUs (including 50 nodes with 1.5TB of Intel Optane persistent memory); 804 nodes with dual Intel “Broadwell” CPUs; 192 nodes with dual Intel “Skylake” CPUs; 160 nodes with four Nvidia V100 GPUs each (and dual Cascade Lake CPUs); 10 large-memory nodes with dual Broadwell CPUs and 512GB of memory; and two Nvidia DGX A100 nodes, each with eight Nvidia A100 GPUs. In total: over 4,200 nodes, around 8,500 CPUs and around 650 GPUs (640 V100s plus 16 A100s).
Now, the Fujitsu-supplied addition: 720 nodes with dual 52-core Sapphire Rapids CPUs and 512GB of memory. The nodes (pictured in the header) are networked with Nvidia InfiniBand HDR 200Gb/s. The system now has around 5,000 nodes, 10,000 CPUs and – thanks to the large memory in these new nodes – around a petabyte of memory.
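For readers who want to check the totals, the node inventory above can be tallied in a few lines of Python. This is only a sketch built from the figures quoted in this article. The segment labels are ours, and the CPU count for the DGX A100 nodes (two host CPUs each) is an assumption, since the article does not state it.

```python
# Node inventory as quoted in the article:
# (label, node count, CPUs per node, GPUs per node).
# Labels are illustrative. The two host CPUs per DGX A100 node are an
# assumption; the article does not give that figure.
PRE_UPGRADE = [
    ("Cascade Lake", 3074, 2, 0),
    ("Broadwell", 804, 2, 0),
    ("Skylake", 192, 2, 0),
    ("V100 GPU nodes", 160, 2, 4),
    ("Large-memory Broadwell", 10, 2, 0),
    ("DGX A100", 2, 2, 8),
]
UPGRADE = [("Sapphire Rapids", 720, 2, 0)]


def tally(segments):
    """Return (nodes, cpus, gpus) summed over a list of segments."""
    nodes = sum(count for _, count, _, _ in segments)
    cpus = sum(count * per_node for _, count, per_node, _ in segments)
    gpus = sum(count * per_node for _, count, _, per_node in segments)
    return nodes, cpus, gpus


before = tally(PRE_UPGRADE)
after = tally(PRE_UPGRADE + UPGRADE)

# Figures for the new partition alone: 720 nodes, dual 52-core CPUs, 512GB each.
new_cores = 720 * 2 * 52
new_memory_tb = 720 * 512 / 1000  # decimal terabytes

print("before upgrade (nodes, CPUs, GPUs):", before)
print("after upgrade  (nodes, CPUs, GPUs):", after)
print("new Sapphire Rapids cores:", new_cores)
print("new memory (TB):", new_memory_tb)
```

Under these assumptions the pre-upgrade tally is 4,242 nodes, 8,484 CPUs and 656 GPUs (640 V100s plus 16 A100s), rising to 4,962 nodes and 9,924 CPUs with the new partition. That matches the rounded figures of around 5,000 nodes and 10,000 CPUs. The new nodes alone contribute 74,880 cores and roughly 369TB of memory.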
Prior to the upgrade, Gadi delivered 15.14 peak petaflops (9.26 Linpack), and the system ranked 62nd on the most recent Top500. We have yet to see formal performance numbers or benchmarks from major Sapphire Rapids-based systems, but it might be reasonable to expect that Gadi’s aggregate peak performance will jump by at least a few petaflops.
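As a rough, back-of-the-envelope illustration of that expectation, theoretical peak can be estimated as cores × clock × floating-point operations per cycle. The sketch below is not an official figure. The 2.0GHz sustained clock and the 32 double-precision operations per cycle per core (two AVX-512 FMA units) are our assumptions, not numbers supplied by NCI, Fujitsu or Intel.

```python
# Pre-upgrade Top500 figures quoted in the article.
RPEAK_PFLOPS = 15.14  # theoretical peak
RMAX_PFLOPS = 9.26    # measured Linpack

# Fraction of theoretical peak achieved on Linpack before the upgrade.
linpack_efficiency = RMAX_PFLOPS / RPEAK_PFLOPS


def peak_pflops(nodes, sockets_per_node, cores_per_socket, ghz, flops_per_cycle):
    """Theoretical peak in petaflops: cores x clock x flops per cycle."""
    cores = nodes * sockets_per_node * cores_per_socket
    return cores * ghz * 1e9 * flops_per_cycle / 1e15


# ASSUMPTIONS (not from the article): a 2.0GHz clock under vector load and
# 32 double-precision flops per cycle per core
# (2 AVX-512 FMA units x 8 doubles x 2 operations per FMA).
ASSUMED_GHZ = 2.0
ASSUMED_FLOPS_PER_CYCLE = 32

added_peak = peak_pflops(720, 2, 52, ASSUMED_GHZ, ASSUMED_FLOPS_PER_CYCLE)

print(f"Pre-upgrade Linpack efficiency: {linpack_efficiency:.1%}")
print(f"Estimated added peak: {added_peak:.2f} petaflops")
```

With those assumed values, the new partition would add a little under five peak petaflops, and the pre-upgrade system ran Linpack at roughly 61% of its theoretical peak. A different clock assumption scales the estimate linearly.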
On the subject of scarce comparisons: as far as we’re aware, Gadi is one of only a small number of publicly known, large-scale, real-world installations of Sapphire Rapids CPUs in a supercomputer. Sapphire Rapids also features in Tycho, the first phase of the Crossroads supercomputer at Los Alamos National Laboratory (LANL), and in Aurora, the long-delayed exascale system at Argonne National Laboratory. Last we heard, Sapphire Rapids chips had been shipped to Argonne, which was slated to begin installing them last fall; those initial chips are due to be replaced at some point by their (also delayed) HBM-equipped “Max Series” siblings. Two Aurora test systems also use Sapphire Rapids: Sunspot at Argonne and Borealis at Intel’s facilities in Oregon.
In Europe, Sapphire Rapids powers the data-centric module of Cineca’s Leonardo supercomputer, which should make its first full appearance on the next Top500 list. The chips are also slated to provide most of the CPU power of the forthcoming MareNostrum 5 system at the Barcelona Supercomputing Center (BSC), which is currently in the early stages of installation.
“This upgrade to the Gadi supercomputer brings the latest technologies to researchers, speeding up their work and enabling bigger and better scientific advancements,” said Sean Smith, NCI’s director. “NCI is proud to be one of the first facilities in the world to make these processors available to scientists.”
As mentioned above, the Sapphire Rapids nodes are just the first phase of a bigger investment – AUD $40 million (about USD $26 million) – toward expanding NCI’s computing capacity. NCI says that additional GPUs and enhancements to its electrical supply (no easy proposition!) are also part of that investment.
“We are committed to advancing Australian high-performance computing, artificial intelligence and data science, enabling research in priority areas including cancer diagnosis, climate simulation and next-generation materials,” Smith said.