Computer scientists from the Center for Computational Research, State University of New York (SUNY), University at Buffalo have examined the effect of Meltdown and Spectre security updates on the performance of popular HPC applications and benchmarks and are sharing their results in a paper, available on arXiv.org.
Their method was to use the application kernel module of the XD Metrics on Demand (XDMoD) tool to run tests before and after the installation of the vulnerability patches. They recorded the performance difference for the following applications and benchmarks: NWChem, NAMD, the HPC Challenge Benchmark suite (HPCC) [which includes the memory bandwidth micro-benchmark STREAM and the NASA parallel benchmarks (NPB)], IOR, MDTest and the Intel MPI Benchmarks (IMB), which measure interconnect/MPI performance.
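To make the before-and-after comparison concrete, here is a minimal sketch of how a percent slowdown could be computed from repeated runs of one application kernel. The function name `percent_slowdown` and the wall times are hypothetical illustrations, not taken from the paper or from XDMoD, and the authors' exact metric definitions may differ.

```python
from statistics import mean


def percent_slowdown(before_times, after_times):
    """Return the percent increase in mean wall time after patching.

    A positive value means the patched system is slower.
    """
    base = mean(before_times)
    patched = mean(after_times)
    return (patched - base) / base * 100.0


# Hypothetical wall times in seconds (illustrative only, not from the paper).
before = [100.0, 102.0, 98.0]
after = [103.0, 105.0, 101.0]

slowdown = percent_slowdown(before, after)
print(f"{slowdown:.1f}% slower")
```

Averaging several runs on each side of the update helps separate the effect of the patches from ordinary run-to-run noise.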
Most of the application kernels were executed on one or two nodes (8 and 16 cores respectively) of a development cluster at the Center for Computational Research. Each node has two Intel L5520 CPUs (Nehalem EP); the nodes are connected by QDR Mellanox InfiniBand and can access a 3 PB IBM GPFS shared storage system. The operating system is CentOS Linux release 7.4.1708.
The worst-case performance hit went as high as 54 percent for select functions (e.g., MPI random access, memory copying and file metadata operations), while real-world applications showed a 2-3 percent decrease in performance for single-node jobs and a 5-11 percent decrease for parallel two-node jobs. The authors indicate there may be a way to recoup some of this loss via the compiler and MPI libraries.
Also notable: on two nodes, Fourier transformation (FFT), matrix multiplication and matrix transposition slowed by 6.4 percent, 2 percent and 10 percent respectively.
The findings of the SUNY team align with those of Red Hat, which earlier this month released the results from benchmark tests it conducted specifically to measure the impact of the kernel patches. Red Hat found that CPU-intensive HPC workloads suffered only a 2-5 percent hit “because jobs run mostly in user space and are scheduled using CPU-pinning or NUMA control.” In comparison, database analytics were found to take a modest 3-7 percent hit and OLTP database workloads suffered the most (8-19 percent degradation).
The SUNY researchers plan to conduct additional testing “with a larger number of nodes and for more application kernels” once the updates are applied to their production system.
The XD Metrics on Demand (XDMoD) tool employed for the testing was originally developed to provide independent audit capability for the XSEDE program. It was later open-sourced and is now used widely across research and commercial HPC sites. The tool includes an application kernel performance monitoring module that “allows automatic performance monitoring of HPC resources through the periodic execution of application kernels, which are based on benchmarks or real-world applications implemented with sensible input parameters.”
The paper was authored by Nikolay A. Simakov, Martins D. Innus, Matthew D. Jones, Joseph P. White, Steven M. Gallo, Robert L. DeLeon and Thomas R. Furlani. It is available on arXiv.org.