Nov. 11, 2020 — The Oak Ridge Leadership Computing Facility (OLCF) has upgraded its Arm-based test system, Wombat, equipping it with the HPE Apollo 80 system that uses the same processor architecture found in the world’s fastest supercomputer, Fugaku, located at the RIKEN Center for Computational Science.
Wombat’s new A64FX processors, developed by Fujitsu, are the first to use the Arm Scalable Vector Extension (SVE), an instruction set extension that offers flexibility for processing vectors (strings of numbers treated as a coherent unit in memory). SVE lets hardware implement vector lengths anywhere from 128 bits to 2,048 bits, and the same compiled code runs on any of them, so users do not have to recompile their scientific codes for different vector lengths.
“The Fujitsu A64FX is the first processor available to us that actually implements these new SVE instructions,” said Ross Miller, systems integration programmer in the Technology Integration Group at the National Center for Computational Sciences. “With SVE, you can take a code and run it on a processor that has 128-bit vectors or 512-bit vectors, and it will still run and use all the available hardware.”
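The idea Miller describes, a loop that never hard-codes the vector width, can be sketched in a few lines. The following is a pure-Python emulation for illustration only, not real SVE code: the names `whilelt` and `vla_axpy` are hypothetical, the predicate is loosely modelled on the behavior of SVE's WHILELT instruction, and the `vector_bits` argument stands in for a width that real hardware would report at run time.

```python
def whilelt(start, n, lanes):
    """Return a per-lane mask: lane i is active while start + i < n."""
    return [start + i < n for i in range(lanes)]


def vla_axpy(a, x, y, vector_bits, element_bits=64):
    """Compute a*x + y element-wise without hard-coding the vector width.

    vector_bits is an assumption standing in for the hardware vector
    length; SVE allows multiples of 128 bits from 128 up to 2,048.
    Returns the result list and the number of loop iterations taken.
    """
    if vector_bits % 128 != 0 or not 128 <= vector_bits <= 2048:
        raise ValueError("vector_bits must be a multiple of 128 in [128, 2048]")
    lanes = vector_bits // element_bits  # elements handled per iteration
    n = len(x)
    out = list(y)
    iterations = 0
    for start in range(0, n, lanes):
        # The mask switches off lanes that would run past the end of the
        # data, so no separate scalar "tail" loop is needed.
        mask = whilelt(start, n, lanes)
        for lane, active in enumerate(mask):
            if active:
                out[start + lane] = a * x[start + lane] + y[start + lane]
        iterations += 1
    return out, iterations


if __name__ == "__main__":
    x = [float(i) for i in range(10)]
    y = [1.0] * 10
    for bits in (128, 512, 2048):
        result, steps = vla_axpy(2.0, x, y, bits)
        print(bits, steps, result)
```

In this sketch the same function produces identical results for 128-bit, 512-bit, and 2,048-bit widths; only the number of loop iterations changes, which is the property that lets one binary use whatever vector hardware is available.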
The OLCF’s Wombat test bed now features a total of 16 Fujitsu A64FX processors in two HPE Apollo 80 chassis, four compute nodes that have Marvell ThunderX2 processors plus NVIDIA Quadro GV100 GPUs, and four compute nodes that have ThunderX2 processors but no GPUs. Miller believes many users are eager to compare the performance of the Fujitsu processors with the GV100 GPUs. Comparing architectures can give programmers an understanding of which types of applications and programming models are best suited for certain processors.
Miller said users might find it easier to get full performance from the Fujitsu processors than from the GPUs, which are so highly specialized that extracting their maximum performance is difficult.
“In some ways, these two are very similar. They both have 32 gigabytes of high-bandwidth memory, for example,” Miller said. “One of the advantages of Wombat, in general, is that it’s got both of these architectures right here in the same place, and it will be very easy to do comparisons.”
In a paper to be presented at the Supercomputing 2020 Conference, a team described how, earlier this year, it ported the DCA++ application, which is used to study the physics of strongly correlated electron systems, to multiple platforms, including Wombat. Now, that same team will compare the code’s performance on Wombat’s ThunderX2 processors and Volta GV100 GPUs with its performance on the Fujitsu processors and also on the OLCF’s Summit system, the nation’s fastest supercomputer. The OLCF is a US Department of Energy (DOE) Office of Science User Facility located at DOE’s Oak Ridge National Laboratory (ORNL).
“My main goal is to understand the performance portability of different applications,” said Oscar Hernandez, senior staff member at the National Center for Computational Sciences who is working on the team. “We really want to understand how the compilers are efficiently generating code for SVE, and we are in the very early stages of doing performance analyses.”
According to Hernandez, other codes are in the process of being ported to the new processors as well.
“We are interested in whether we can write a single code to get performance from the Fujitsu processors as well as an architecture that includes the NVIDIA Quadro GV100 GPUs using portable parallel programming models like OpenMP, OpenACC, Kokkos, and using scientific libraries,” Hernandez said. “We still need to try different applications, compilers, and libraries that specifically target these new processors.”
UT-Battelle LLC manages Oak Ridge National Laboratory for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit https://energy.gov/science.
Source: RACHEL HARKEN, OLCF