With no revolutionary shifts in large-scale computing on the hardware horizon, an increasing amount of attention is turning to various modes of optimization. These efforts extend beyond tweaking codes for HPC systems into the realm of using detailed, holistic performance data to maximize performance and efficiency across existing architectures.
According to a group of researchers, including Dr. Michael Kluge, from the Center for Information Services and High Performance Computing at the Technische Universität Dresden, holistic performance analysis for data-intensive and extreme-scale systems requires gathering data from across the hardware, software and storage layers and meshing it into a comprehensive whole. Accomplishing this will require "big data" technologies in its own right, but the end result would ultimately provide users (specifically application end users and developers as well as system administrators) with detailed information about optimizations that can be learned and built into a self-tuning feedback loop.
With so many systems tackling “big data” problems at the application level, adding another layer of data-intensive discovery might sound burdensome. However, the new capabilities to fine-tune performance according to the big picture view of a system’s requirements and capabilities might pay dividends in the long run, the authors contend.
This is a significant challenge, but one that must be met as systems grow in terms of core counts and increasing volumes of data. As the researchers explain, “the enormous number of computing and data units as well as the complexity of their integration into a single machine provides an intellectual challenge…Hardware hierarchies with different performance specifications, sophisticated network topologies, complex operational software, various levels of (parallel) execution threads, optimization strategies for components that are often unknown and many other implications hinder an efficient use [of the system].”
While there are already tools in place to analyze performance, these are based on models that target specific errors or flaws and attempt to help users find a resolution. The limitation here is that “existing systems are restricted either in the detail level of the data collected or in the number of components they get data from. The performance of such a performance analysis is one of the limiting factors by itself,” the authors argue.
Being able to arrive at a truly holistic view of performance requires the capability to continuously collect performance data from the many system components including:
- The application level ("classical" performance data)
- The entire computing chain (single core, multicore, cache levels, etc.)
- The I/O hierarchy (disks, servers, data and metadata management, etc.)
- Network components
- The software layer (versions, parameters, compilation parameters, etc.)
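To make the idea of cross-layer collection concrete, the sketch below shows one way such data might be unified: a single timestamped record per sampling interval that merges metrics from each layer listed above. All names and collector functions here are illustrative assumptions, not part of any real monitoring API described by the researchers.

```python
import time
from dataclasses import dataclass

# Hypothetical sketch: one unified record per sampling interval,
# combining metrics from the application, compute, I/O, network
# and software layers. The helper values are stubbed placeholders.

@dataclass
class PerfSample:
    timestamp: float
    application: dict   # e.g. MPI wait times, per-routine FLOP rates
    compute: dict       # e.g. per-core utilization, cache miss rates
    io: dict            # e.g. disk throughput, metadata ops/s
    network: dict       # e.g. link utilization, message latency
    software: dict      # e.g. library versions, compilation flags

def collect_sample() -> PerfSample:
    # In a real system each field would come from its own subsystem
    # (hardware counters, filesystem stats, switch counters, ...).
    return PerfSample(
        timestamp=time.time(),
        application={"mpi_wait_s": 0.12},
        compute={"l2_miss_rate": 0.04},
        io={"read_MBps": 850.0},
        network={"link_util": 0.31},
        software={"compiler": "gcc 12.2 -O3"},
    )

samples = [collect_sample() for _ in range(3)]
print(len(samples), samples[0].io["read_MBps"])
```

The key design point is that every layer's metrics share a common timestamp, so events in the I/O hierarchy can later be correlated with, say, idle cores at the application level.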
Despite the overhead in development, system use and sheer manpower, a total view of full-system performance will be critical as machines continue to grow in core counts and storage capacity. As the authors describe, the ultimate outcome would be a collection of performance patterns that could eventually allow auto-tuning on the fly to match system characteristics.
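The self-tuning feedback loop the authors envision could be sketched as follows: collected samples are matched against a library of known performance patterns, and when one fires, a corresponding tuning action adjusts the system configuration. The pattern rule and tuning knob below are invented purely for illustration; the paper does not specify any concrete patterns or parameters.

```python
# Illustrative feedback loop: pattern recognition on collected
# metrics drives a configuration change. All thresholds and knob
# names here are hypothetical.

def match_pattern(sample: dict):
    # Toy rule: high I/O wait combined with idle cores suggests
    # the application is I/O bound.
    if sample["io_wait_frac"] > 0.5 and sample["cpu_util"] < 0.4:
        return "io_bound"
    return None

TUNING_ACTIONS = {
    # pattern name -> (configuration knob, new value)
    "io_bound": ("stripe_count", 8),
}

def autotune_step(sample: dict, config: dict) -> dict:
    pattern = match_pattern(sample)
    if pattern in TUNING_ACTIONS:
        knob, value = TUNING_ACTIONS[pattern]
        config = {**config, knob: value}
    return config

config = {"stripe_count": 1}
config = autotune_step({"io_wait_frac": 0.7, "cpu_util": 0.2}, config)
print(config["stripe_count"])  # pattern fired, stripe count raised to 8
```

In practice the pattern library would be learned from the accumulated holistic data rather than hand-written, which is precisely where the "big data" machinery the researchers describe comes in.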