Roofline, a software performance model first developed (~2005) by Sam Williams while he was a Ph.D. student at the University of California, Berkeley, is now being pressed into service at the National Energy Research Scientific Computing Center (NERSC) to optimize code for use on manycore systems, notably Intel Knights Landing (KNL).
“The idea of Roofline is two-fold,” said Lenny Oliker, a senior computer scientist in Lawrence Berkeley National Lab’s Computational Research Division (CRD). Oliker worked closely with Williams, now a staff scientist in CRD’s Performance Algorithms Research Group, to refine and expand Roofline’s capabilities. “First, we need to understand the underlying hardware architecture (of the supercomputer), its capabilities and the performance of real codes running on it. Then we want to characterize actual applications and graph them onto the Roofline chart.”
In practice, the Roofline model is a graph that plots attainable performance on the y axis against arithmetic intensity on the x axis. Arithmetic intensity is a measure of floating-point operations per byte, explained Jack Deslippe, acting group lead for NERSC’s Application Performance Group, who is working with Williams and other Berkeley Lab colleagues to extend the model and its applications. “In computer code what this means is how many floating point operations (FLOPs) do you do for every byte of data that you have to bring in from memory,” Deslippe said. “What the Roofline curve tells you is what performance you can expect from the system given the characteristics of your application or a subroutine of the application.”
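The relationship Deslippe describes can be written down in a few lines. What follows is a minimal sketch of the basic Roofline bound, in which attainable performance is the smaller of the machine's peak compute rate and its peak memory bandwidth multiplied by the kernel's arithmetic intensity. The function names and the peak figures are illustrative assumptions, not measured values for Cori or KNL, and the sketch is not how the Empirical Roofline Toolkit or Intel Advisor is implemented.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Basic Roofline bound in GFLOP/s.

    arithmetic_intensity: FLOPs performed per byte moved from memory.
    peak_gflops:          compute ceiling of the machine (GFLOP/s).
    peak_bandwidth_gbs:   memory bandwidth ceiling (GB/s).
    """
    if arithmetic_intensity <= 0:
        raise ValueError("arithmetic intensity must be positive")
    # Memory-bound kernels sit on the sloped part of the roof,
    # compute-bound kernels on the flat part.
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity at which the two ceilings meet."""
    return peak_gflops / peak_bandwidth_gbs


# Placeholder machine parameters, chosen only to make the arithmetic easy.
PEAK_GFLOPS = 2000.0
PEAK_BANDWIDTH_GBS = 400.0

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
    for ai in (0.25, 1.0, 5.0, 20.0):
        bound = attainable_gflops(ai, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
        regime = "memory-bound" if ai < ridge else "compute-bound"
        print(f"AI = {ai:5.2f} FLOPs/byte: at most {bound:7.1f} GFLOP/s ({regime})")
```

A kernel whose measured performance falls well below this bound at its arithmetic intensity has room to improve. One that already sits on the sloped part of the roof can only go faster by raising its arithmetic intensity, for example by reusing data already in cache.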
An account of NERSC’s expanded use of Roofline – formally known as the Empirical Roofline Toolkit (ERT) – is on the NERSC web site (Roofline Model Boosts Manycore Code Optimization Efforts).
While the Roofline model has been used for a number of years to characterize supercomputing systems and architectures, over the past year it has been expanded to both visualize and guide application optimization, and new tools have been developed to support this, according to Deslippe. As part of this effort, NERSC’s Doug Doerfler has extended and applied the technology to the Knights Landing processors in the Cori KNL system.
“We are using this model to frame the conversation with users about where their application stands,” Deslippe said. “It’s a good way to communicate with users about what they need to work on with a given application or subroutine. It takes a little of the mystery out of code optimization.”
Over the past year, the Roofline team has been introducing NERSC users, including those involved in the NERSC Exascale Science Applications Program (NESAP), to the Roofline model to help them gauge performance improvements. For example, Tuomas Koskela, a NERSC postdoc who joined the center in 2016 to work on XGC1 (a fusion particle-in-cell code) as part of a NESAP project, has been using Roofline to improve the code’s performance on Cori.
“We started talking about using Roofline last spring,” Koskela said. “It was interesting to me because I had a problem with my code that we didn’t understand why it wasn’t getting very good performance.” After using Roofline via Intel Advisor to optimize the performance of kernels of the XGC1 code for the KNL architecture, Koskela was able to dramatically improve the code’s performance on Cori.