At SC20, Intel announced that it is making its Xe-HP high-performance discrete GPUs available to early-access developers. Notably, the new chips have been deployed at Argonne National Laboratory, where they will serve as a transitional development vehicle for the future (2022) Aurora supercomputer, subbing in for the delayed Intel Xe-HPC (“Ponte Vecchio”) GPUs that are the computational backbone of the system.
The Xe-HP-based development platforms are being used by the Argonne Leadership Computing Facility (ALCF) as part of the Aurora Early Science Program and the Exascale Computing Project, which are tasked with ensuring applications, libraries and infrastructure are exascale-ready.
“Our collaboration with Intel in the area of cross-architecture code development is benefiting many of our developer teams,” said Susan Coghlan, project director for Aurora at the ALCF. “This co-design approach has led to the software stack’s quick maturation to production quality for execution on Aurora.”
Aurora is being built via a partnership between Intel, Hewlett Packard Enterprise and Argonne National Laboratory. The exascale-class system implements HPE’s Cray EX supercomputer architecture with Slingshot networking, a future generation of Intel Optane persistent memory and Intel oneAPI software.
Aurora’s node design features two future-generation 10nm++ “Sapphire Rapids” CPUs (with enhanced SuperFin technology) and six Xe-HPC “Ponte Vecchio” datacenter GPUs.
The schedule and exact design of the Xe-HPC GPUs are still unclear, but after announcing a one-year delay due to a defect mode in the 7nm process, Intel has now committed to deploying Aurora in 2022. A previous incarnation of Aurora was originally to be deployed in 2018, but that project was recast in 2017 as the nation’s first exascale machine with a 2021 target.
At Intel Architecture Day in August, the company said the Xe-HPC GPU would be manufactured using its 10nm SuperFin process for the base tile and 10nm Enhanced SuperFin for the Rambo Cache tile. The company has not disclosed the process node for the compute tile, which will either use an Intel Next Gen process or be manufactured at an external fab.
The Xe-HP GPU has been implemented in 1-, 2- and 4-tile designs manufactured using 10nm Enhanced SuperFin, with the 4-tile variant providing over 40 teraflops of FP32 performance (according to Intel). The 2-tile variant that Intel deployed to Argonne provides about half that performance (~21 teraflops of FP32).
In an Intel keynote presentation delivered at SC20, Trish Damkroger, general manager of Intel’s HPC group, said the collaboration between Intel and Argonne “will focus on creating next-generation semiconductor technologies, manufacturing processes, advanced system design and software enablement, including future silicon development, future architectures for high-performance computing and AI, and software ecosystem enablement for exascale computing.
“Aurora will give scientists and researchers an unprecedented set of tools and applications that will enable breakthrough advancements in a wide variety of areas that will benefit all of us, including medicine, weather modeling, climate change and material science,” she said.
The Xe-HP development platforms are supporting co-design, testing and validation of several exascale applications, including:
The EXAALT project, which enables molecular dynamics at exascale for fusion and fission energy problems.
The QMCPACK project, which is developing Quantum Monte Carlo algorithms at exascale to improve predictions concerning complex materials.
The GAMESS project, which is developing ab-initio fragmentation methods to more efficiently tackle challenges in computational chemistry, such as heterogeneous catalysis problems.
The ExaSMR project, which is developing high-fidelity modeling capabilities at exascale for complex physical phenomena occurring within operating nuclear reactors to ultimately improve their design.
The HACC project, which is developing extreme-scale cosmological simulations at exascale that will allow scientists to simultaneously analyze observational data from state-of-the-art telescopes to test different theories.