Sept. 16, 2022 — Visualization and analysis on any high-performance computing system face a last-mile problem: the potential of the resource can be realized only when people have the tools to examine and interpret the data created on it. The data and visualization portfolio in the US Department of Energy’s Exascale Computing Project (ECP) recognizes that storage alone cannot fix the last-mile problem because scientists can now compute their models faster than the hardware can store the resulting data. This gap between compute and storage will widen even further on the new exascale systems because concurrency will increase by roughly 5–6 orders of magnitude, yet system memory and I/O bandwidth will grow by only 1–2 orders of magnitude.[i] To address this challenge, scientists can leverage a variety of new software tools that support an ecosystem of data-centric programming models, compression, and innovative big-data approaches such as in-situ visualization. In-situ analysis and visualization is an important new capability for big-data simulations: it gives scientists access to simulation data on the supercomputer while the simulation is still running.
Such a visualization and analysis ecosystem must be expressive enough to specify what data will be kept, flexible enough to enable future analysis, and convenient enough that scientists can and will use it.[ii] The output data format must be relatively general because the scientist must specify what images and data will be kept before submitting the run on the supercomputer. This precludes simply storing in-situ rendered images because the scientist may need to, for example, visualize the data from unanticipated points of view, examine occluded features, or change the color map to highlight interesting phenomena. Along with powerful analytic capabilities, the software ecosystem must also provide general mechanisms to identify, track, and trigger I/O operations only when events of interest occur.
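The trigger idea described above can be illustrated with a minimal sketch. This is hypothetical code, not the actual ALPINE or Cinema API: a toy time loop evaluates a user-defined trigger each step and performs I/O only when the condition fires, which is the pattern the article describes for avoiding the compute-versus-storage gap. The `exceeds_threshold` trigger, the pulse-shaped stand-in for a solver, and the function names are all invented for illustration.

```python
def exceeds_threshold(field, threshold):
    """Hypothetical trigger: fire when any value in the field exceeds the threshold."""
    return max(field) > threshold


def run_simulation(num_steps, threshold):
    """Toy in-situ loop: advance a stand-in 'solver' and record I/O only on trigger events."""
    saved_steps = []
    half = num_steps // 2
    for step in range(num_steps):
        # Stand-in for a real solver update: a pulse that peaks mid-run.
        peak = 1.0 - abs(step - half) / half
        field = [0.5 * peak, peak]
        if exceeds_threshold(field, threshold):
            # A real workflow would render images or write a data extract here
            # instead of storing every time step to disk.
            saved_steps.append(step)
    return saved_steps


# Only the steps near the pulse's peak trigger output:
print(run_simulation(11, 0.7))  # → [4, 5, 6]
```

The design point is that the trigger runs in situ, on data already resident in the simulation's memory, so the decision about what to keep happens before anything touches the I/O system.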
Cinema is part of a software ecosystem that meets these ECP mission requirements. David Honegger Rogers, PI and team lead for the Data Science at Scale team at Los Alamos National Laboratory (LANL), notes the strong collaboration with the ECP ALPINE project, which provides the in-situ infrastructure leveraged by Cinema: “Fundamentally, the Cinema project provides images that people understand combined with the ALPINE project infrastructure and analytic capabilities they need. Our tools focus on providing scalable analytics and visualization software that effectively supports scientific discovery and the understanding of massive data.”
The December 3, 2020, episode of ECP’s Let’s Talk Exascale podcast, “Supporting Scientific Discovery and Data Analysis in the Exascale Era,” featured the Data and Visualization portfolio lead Jim Ahrens discussing ECP visualization tools, including Cinema and ALPINE, with host Scott Gibson.
To read the rest of Rob Farber’s article for ECP, visit this link.
Source: Rob Farber, contributing writer, Exascale Computing Project