One of the most pressing issues faced by the HPC community is how to attract and train the next generation of HPC users. The staff at Argonne National Laboratory is tackling this challenge head-on by holding an intensive summer school in extreme-scale computing. One of the highlights of the 2013 summer program was a class taught by Pete Beckman: An Introduction to Parallel Supercomputers.
Argonne has a history of supporting these Summer Institutes that goes back to the late 1980s. The attendees, a select group of mainly PhD students and postdocs, not only receive training in the use of supercomputing systems for large-scale science and engineering research but also get to rub elbows with some of the brightest minds in HPC.
In this 30-minute presentation, Professor Beckman provides a short overview of the course and shares with the students what he thinks is really important in the world of HPC. Starting with an overview of Argonne and Fermi, and the DOE institutions’ hallowed histories, Beckman explains how Argonne emphasized parallel computing and the teaching of parallel architectures long before they were in wide use. Back in 1983, Paul Messina helped found the first math and computer science division at the lab.
Messina is now the Director of Science at the Argonne Leadership Computing Facility (ALCF), which was established in 2006 in recognition of the role that parallel computers would play in the future of scientific computing.
“You are going to see the architectural changes that are happening,” Beckman told the students, “and these are not small. There was a period for almost a decade where things were very stable in an area of computing, and right now we are in a big change again. What happened in 1984 is about to happen again. Everything in software and hardware is changing and you have to adapt to it.”
Beckman puts up a slide from the mid-90s showing various parallel programming architectures and machines, many of which have since gone under or changed hands: names like BBN Butterfly, CM-2, Kendall Square Research, MasPar, and others.
Beckman says another of these high-churn periods is likely, and that technologies like GPGPU, ARM and others are early evidence of it.
When beginning a code project, Beckman recommends starting with the view that the code will last for five, ten, even fifteen years. An investment that long-lived needs to be protected. To that end, Beckman provides a list of investment recommendations for budding HPC programmers that will enable them to spend more time doing the cool science. His number one piece of advice is to be aware of other people’s libraries. From there he explores the benefits of encapsulation (parallelism, messaging and I/O), embedded capabilities (debugging, performance monitoring, correctness detection and resilience), the two workflow views (the science side and the programming side), automation, and community (web, tutorial, email, bug tracking, etc.).
As he wraps up the class, Beckman explores some of the major trends in HPC programming, those that are ramping up and those that are ramping down. See slide below.
The Argonne Training Program for Extreme-Scale Computing is funded through the DOE’s Office of Science from 2013 to 2016. More information is available at http://extremecomputingtraining.anl.gov/. The next session will be offered August 3 – August 15, 2014.