July 21, 2020 — Barcelona Supercomputing Center (BSC), under the Intel-BSC Exascale Lab, and in collaboration with the EPEEC project, led by BSC, is heading the development of new software tools and expanding the software ecosystem for 2nd Generation Intel Xeon Scalable processors and Intel Optane persistent memory (Intel Optane PMem). This work is helping accelerate High Performance Computing (HPC) applications using heterogeneous memory architectures.
BSC researcher Antonio Peña leads this research, which explores how to accelerate large HPC workloads by leveraging heterogeneous memory systems. With Intel Optane PMem and 2nd Generation Intel Xeon Scalable processors, he is developing architectures that let HPC clusters run high-performance workloads with large datasets while using less power than DRAM-only configurations.
“Right now, many HPC applications are constrained by the amount of DRAM in the nodes and cluster,” Peña explained. “They need more and more memory but adding larger and more DIMMs with the current technology is not feasible due to the power constraints on the overall system.”
“We’re trying to reduce server power while accelerating applications by using Intel Optane PMem and intelligently managing where the data is located and its movements,” Peña said. “We can take advantage of the large memory sizes that the new technology offers and put more data close to the processor using considerably less energy. There is a slightly longer latency than DRAM, but we don’t have to pay for the penalty of even more latency going to other storage technologies.”
Innovative Data Profiling and Memory Allocation Tools for Intelligent Data Management
To enable this approach with heterogeneous memories, Peña and his team have created several software tools using Extrae (a general-purpose profiler developed by BSC), the Intel VTune™ profiler, and Extended Valgrind for Object Differentiated Profiling (EVOP), among others. EVOP was first developed by Peña at ANL and is now maintained at BSC. The team's tools first perform what Peña calls data-oriented profiling: the profilers run while the application executes normally. The tools analyze the demand and latencies for different data objects and create a large file listing all data accesses.
“Knowing how each data object is accessed during execution helps us decide in the optimization step where those have to be allocated in the different memories,” Peña described. “In a simplified view, we associate metrics with the different data objects. Then we count the number of accesses or the number of last level cache misses for each object. From this, we can apply different algorithms for memory allocations to maximize the performance.”
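The simplified view Peña describes can be illustrated with a short sketch. It is not BSC's tooling or algorithm: the `DataObject` record, the `place_objects` function, the sample profile, and the greedy misses-per-byte heuristic are all assumptions chosen for illustration. The sketch attaches a last-level cache miss count to each data object, ranks objects by misses per byte, and fills a limited DRAM budget with the highest-ranked objects, leaving the rest in persistent memory.

```python
from dataclasses import dataclass

GIB = 2**30


@dataclass(frozen=True)
class DataObject:
    """Hypothetical per-object profile record (not a real tool's format)."""
    name: str
    size_bytes: int
    llc_misses: int  # last-level cache misses attributed to this object


def place_objects(objects, dram_capacity_bytes):
    """Greedy placement: objects with the most misses per byte go to DRAM first.

    This is one simple heuristic among many; the source does not say which
    allocation algorithms the BSC team applies.
    """
    placement = {}
    free = dram_capacity_bytes
    # Rank by miss density so small, hot objects win over large, cold ones.
    ranked = sorted(
        objects,
        key=lambda o: o.llc_misses / max(o.size_bytes, 1),
        reverse=True,
    )
    for obj in ranked:
        if obj.size_bytes <= free:
            placement[obj.name] = "DRAM"
            free -= obj.size_bytes
        else:
            placement[obj.name] = "PMEM"
    return placement


# Made-up profile data, for illustration only.
profile = [
    DataObject("matrix_a", 8 * GIB, 9_000_000),
    DataObject("index", 1 * GIB, 5_000_000),
    DataObject("checkpoint_buf", 16 * GIB, 200_000),
    DataObject("lookup", 2 * GIB, 4_000_000),
]

placement = place_objects(profile, dram_capacity_bytes=4 * GIB)
for name, tier in placement.items():
    print(f"{name}: {tier}")
```

With a 4 GiB DRAM budget, the two small, frequently missed objects (`index` and `lookup`) land in DRAM, while the larger `matrix_a` and the rarely accessed `checkpoint_buf` stay in persistent memory. A real allocator would also have to weigh access latencies, object lifetimes, and the cost of data movement, which this sketch ignores.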
Barcelona Supercomputing Center-Centro Nacional de Supercomputación (BSC-CNS) is the national supercomputing centre in Spain. The centre specialises in high performance computing (HPC) and manages MareNostrum, one of the most powerful supercomputers in Europe, located in the Torre Girona chapel. BSC is involved in a number of projects to design and develop energy-efficient, high-performance chips based on open architectures such as RISC-V, for use in future exascale supercomputers and other high performance domains. The centre leads the pillar of the European Processor Initiative (EPI) that is creating a high performance accelerator based on RISC-V. More information: www.bsc.es
Source: Barcelona Supercomputing Center