By now, most HPCers and the surrounding community are aware that data movement poses one of the most fundamental challenges to post-petascale computing. Around the world, exascale-directed projects are attempting to maximize system speeds while minimizing energy costs. In the US, for example, exascale targets have peak performance increasing by three orders of magnitude while system power merely doubles. Accomplishing this balancing feat means addressing the most expensive operation: data movement. So says James Ahrens of Los Alamos National Laboratory, who has published a paper on the subject.
The two-pager, “Increasing Scientific Data Insights About Exascale Class Simulations Under Power and Storage Constraints,” points out that storage constraints are also affected by power costs. “Future storage technology projections suggest that the gap between both capacity/bandwidth and FLOPS will widen as we move toward exascale,” observes Ahrens. If this is the case, then for a similar investment the storage system of an exascale supercomputer would be smaller and slower than those of today’s systems.
The power and storage constraints are leading the community to reevaluate the scientific workflow, shifting the focus from post-processing to in situ analysis. The traditional sequential approach, in which visualization and analysis are carried out in post-processing and full checkpoints are saved for later restarts, will not be viable going forward. There is an emerging consensus, writes Ahrens, that significantly more visualization and analysis should take place in situ, during the simulation run while the data is resident in memory.
Ahrens suggests three guidelines to support the shift to greater in situ analysis:
- Sampling and Uncertainty Quantification of Simulation Data are Needed
- Deliberate Analysis Choices Are Necessary
- Data Reduction and Prioritization Is Required
Regarding the first bullet point, Ahrens points out that in situ analysis is really a form of sampling, where the simulation scientist no longer has the luxury of sampling fully on “the spatial, multivariate and variable type domains at the expense of sampling fully in the temporal domain.” So the question becomes how to sample from each domain so that the overall analysis quality is maintained or increased.
Writes Ahrens: “The quality of their results can be measured through combined in situ sampling/uncertainty quantification techniques. For example, in our work, we statistically sample using a stratified random sampling approach on the MC^3 cosmological particle simulation. We store these samples in a level-of-detail organization for later interactive progressive visualization and feature analysis. By sampling during the simulation, we are able to analyze the entire particle population to record full population statistics and quantify sample error.”
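To make the idea of stratified random sampling with a quantified sample error concrete, here is a minimal Python sketch. It is not the MC^3 workflow: the particle layout, the grid-cell strata, the `speed` attribute, and the function names are all assumptions chosen for illustration, and the level-of-detail storage step is left out. The point it shows is that, because the full population is still in memory during the run, the true statistic can be computed alongside the sample estimate and the difference recorded.

```python
import random
from collections import defaultdict

def stratified_sample(particles, cell_size, fraction, rng):
    """Bin particles into spatial grid cells (the strata), then draw
    roughly the same fraction at random from every cell."""
    strata = defaultdict(list)
    for p in particles:
        x, y, z, _speed = p
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        strata[key].append(p)

    samples = {}
    for key, members in strata.items():
        k = max(1, round(fraction * len(members)))  # at least one per stratum
        samples[key] = rng.sample(members, k)
    return strata, samples

def mean_speed(particles):
    return sum(p[3] for p in particles) / len(particles)

def stratified_mean(strata, samples):
    """Standard stratified estimator: weight each stratum's sample mean
    by that stratum's share of the full population."""
    total = sum(len(members) for members in strata.values())
    return sum(
        len(strata[key]) / total * mean_speed(samples[key])
        for key in strata
    )

# Hypothetical in-memory particle data: (x, y, z, speed).
rng = random.Random(0)
particles = [
    (rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(0, 10), rng.gauss(100, 15))
    for _ in range(5000)
]

strata, samples = stratified_sample(particles, cell_size=5.0, fraction=0.1, rng=rng)
sample_size = sum(len(s) for s in samples.values())

# The full population is available in situ, so the error of the
# sample-based estimate can be measured directly and stored with it.
full_mean = mean_speed(particles)
estimate = stratified_mean(strata, samples)
sample_error = abs(estimate - full_mean)

print(f"kept {sample_size} of {len(particles)} particles")
print(f"full mean = {full_mean:.3f}, estimate = {estimate:.3f}, error = {sample_error:.3f}")
```

Only the sampled particles and the recorded error would be written to storage; a post-processing tool would then know how far the saved sample is from the full-population value.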
Ahrens also addresses how to move away from the “save everything” mindset and appreciate that this is just one choice among many. When power and storage are constrained, it’s crucial to make deliberate analysis decisions before the simulation starts.
In the third bullet point, Ahrens points out that there are ways to reduce data besides statistical sampling. “Visualization operations and feature extraction algorithms can also be considered a type of sampling strategy,” he notes. A greedy algorithm, for instance, will save the highest-priority information as the simulation progresses, overwriting lower-priority output.
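One way to picture such a greedy, priority-driven scheme is a fixed-size store that always evicts its lowest-priority entry when something more important arrives. The Python sketch below is an illustration under stated assumptions, not the method from the paper: the `PriorityStore` class, the numeric priority scores, and the capacity of three outputs are all hypothetical.

```python
import heapq

class PriorityStore:
    """Fixed-capacity store that greedily keeps the highest-priority outputs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []  # min-heap of (priority, step, payload); root is the lowest priority kept

    def offer(self, priority, step, payload):
        """Try to save an output. Returns True if it was kept."""
        entry = (priority, step, payload)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        if priority > self._heap[0][0]:
            # Overwrite the lowest-priority output currently held.
            heapq.heapreplace(self._heap, entry)
            return True
        return False  # not important enough to displace anything

    def contents(self):
        """Kept outputs, highest priority first."""
        return sorted(self._heap, reverse=True)

# Hypothetical per-timestep priorities, e.g. scores from a feature detector.
priorities = [0.2, 0.9, 0.1, 0.5, 0.7, 0.3]

store = PriorityStore(capacity=3)
for step, priority in enumerate(priorities):
    store.offer(priority, step, f"output for step {step}")

kept_steps = [step for _priority, step, _payload in store.contents()]
print(kept_steps)  # [1, 4, 3]
```

The storage budget is fixed up front, so the amount written never grows with the length of the run; what changes over time is only which outputs occupy the budget.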