With its Radeon “Vega” Instinct datacenter GPUs and EPYC “Naples” server chips entering the market this summer, AMD has positioned itself for a two-headed battle against rivals Intel and Nvidia. AMD took to the SIGGRAPH stage on Sunday to showcase both technologies with the unveiling of the Project 47 supercomputer, developed in partnership with Inventec, Mellanox and Samsung.
Based on Inventec’s P-series computing platform, the P47 rack houses 20 2U servers, each equipped with a single EPYC 7601 processor hooked to four “Vega”-based Radeon Instinct MI25 accelerators. AMD is aiming the one-petaflops of peak single-precision performance at a mix of graphics, machine intelligence and HPC workloads.
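The one-petaflops figure can be sanity-checked with simple arithmetic. The sketch below multiplies out the rack configuration described above. The per-GPU peak of 12.3 single-precision teraflops is an assumption taken from the vendor-quoted MI25 specification, not from this article, and the function name is purely illustrative.

```python
# Rough check of the rack's peak single-precision throughput.
SERVERS_PER_RACK = 20          # from the article
GPUS_PER_SERVER = 4            # from the article
MI25_PEAK_FP32_TFLOPS = 12.3   # ASSUMPTION: vendor-quoted peak, not stated in the article


def rack_peak_fp32_tflops(servers: int, gpus_per_server: int, tflops_per_gpu: float) -> float:
    """Return the aggregate GPU peak in teraflops (CPU flops are ignored)."""
    return servers * gpus_per_server * tflops_per_gpu


total_gpus = SERVERS_PER_RACK * GPUS_PER_SERVER
peak_tflops = rack_peak_fp32_tflops(SERVERS_PER_RACK, GPUS_PER_SERVER, MI25_PEAK_FP32_TFLOPS)

print(f"GPUs per rack: {total_gpus}")
print(f"Peak FP32: {peak_tflops:.0f} teraflops (~{peak_tflops / 1000:.2f} petaflops)")
```

Under that assumption the GPUs alone land just short of a petaflops, so the headline number is a rounded peak rather than a measured result.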
With 128 lanes of PCIe per EPYC socket, the four MI25 GPUs can operate at full bandwidth without the need to, as AMD’s Mark Hirsch observes, resort to “costly dual-CPU and PLX switch setups typically needed on competing platforms in order to run four GPUs.” A fully populated rack boasts “more compute power and more cores, threads, compute units, IO lanes and memory channels in use at one time than in any other similarly configured system ever released,” adds Hirsch, corporate vice president, systems & solutions for AMD’s Radeon Technologies Group, in a blog post.
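To see why a single socket suffices, it helps to tally the lane budget. The sketch below assumes each MI25 sits on a full x16 link, which is the usual meaning of "full bandwidth" for a GPU but is not spelled out in the article. The constant names are illustrative.

```python
# PCIe lane budget for one P47 server node.
LANES_PER_SOCKET = 128   # from the article
GPUS_PER_SERVER = 4      # from the article
LANES_PER_GPU = 16       # ASSUMPTION: each GPU gets a full x16 link

lanes_for_gpus = GPUS_PER_SERVER * LANES_PER_GPU
lanes_remaining = LANES_PER_SOCKET - lanes_for_gpus

print(f"Lanes used by GPUs: {lanes_for_gpus}")
print(f"Lanes left for NVMe, InfiniBand and other IO: {lanes_remaining}")
```

On these assumptions the GPUs consume half the socket's lanes, leaving the rest for storage and networking without a PCIe switch.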
Samsung contributed 10TB of DDR4 memory, HBM2 for the GPU cards, and high-performance NVMe SSD storage, and Mellanox supplied EDR (100G) InfiniBand connectivity.
A full 20-server rack of P47 systems achieves 30.05 gigaflops per watt in single-precision performance, a number that AMD’s press outreach cited as being “25 percent better compute efficiency than select competing supercomputing platforms.” Given that the P47 system doesn’t offer much in the way of double-precision arithmetic, it’s a potentially misleading claim. We’ll point out that machines from HPE-SGI, NEC, Fujitsu, Exascaler, Dell, Cray and the P100-powered Saturn V from Nvidia achieved between 9.5 and 14.1 Linpack gigaflops per watt on the latest Green500 listing, and when we do the math for single-precision peak, they offer between 28 and 50 gigaflops per watt.
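The efficiency figure also implies a power envelope for the rack. The sketch below divides the quoted one-petaflops peak by the quoted 30.05 gigaflops per watt. It assumes both numbers refer to the same full-rack configuration, and the result is an inferred estimate, not a figure AMD published.

```python
# Back out the rack power implied by the quoted peak and efficiency figures.
PEAK_FP32_GFLOPS = 1_000_000      # one petaflops, from the article
EFFICIENCY_GFLOPS_PER_WATT = 30.05  # from the article

# ASSUMPTION: both figures describe the same full 20-server rack.
implied_power_watts = PEAK_FP32_GFLOPS / EFFICIENCY_GFLOPS_PER_WATT
implied_power_kw = implied_power_watts / 1000

print(f"Implied rack power: {implied_power_kw:.1f} kW")
```

That works out to roughly 33 kW for the rack, a useful reference point when weighing the efficiency claim against other systems.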
AMD presented two live demonstrations of the new server rack. The first involved remote testing in Autodesk Maya, Blender and Adobe Premiere Pro. The second used all 80 GPUs to produce a full photorealistic rendering of a motorcycle in about a second. These demos targeted the content producers in attendance at SIGGRAPH, but there’s obvious potential for all manner of FP32-loving AI and HPC applications.
Unveiling the P47 rack, AMD CEO Lisa Su recalled the breaking of the original petaflops barrier by the IBM Roadrunner supercomputer in 2008, a feat that required 6,480 dual-core Opteron CPUs and 12,960 Sony CELL BE co-processing units. Sure, once you normalize the flops math, Roadrunner still has the performance edge by about a factor of three, but it also filled 296 racks and used 2.35 MW of electrical power.
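The Roadrunner comparison can be put in efficiency terms using only the numbers quoted here. The sketch below takes the "about 3x" normalized performance edge at face value, which makes the result a rough estimate rather than a measured value.

```python
# Compare single-precision efficiency of Roadrunner and the P47 rack.
P47_PEAK_GFLOPS = 1_000_000      # one petaflops, from the article
P47_GFLOPS_PER_WATT = 30.05      # from the article
ROADRUNNER_EDGE = 3.0            # ASSUMPTION: the article's "about 3x", treated as exact
ROADRUNNER_POWER_WATTS = 2.35e6  # 2.35 MW, from the article

roadrunner_gflops = ROADRUNNER_EDGE * P47_PEAK_GFLOPS
roadrunner_gflops_per_watt = roadrunner_gflops / ROADRUNNER_POWER_WATTS
efficiency_ratio = P47_GFLOPS_PER_WATT / roadrunner_gflops_per_watt

print(f"Roadrunner: {roadrunner_gflops_per_watt:.2f} gigaflops per watt")
print(f"P47 advantage: about {efficiency_ratio:.0f}x")
```

On these figures the P47 rack delivers on the order of 20 times more normalized flops per watt than the 2008 machine.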
Inventec and its primary distributor AMAX are targeting Project 47 system availability for Q4 of this year. Pricing has not yet been announced.