August 18, 2011
Early in August the U.S. Department of Energy’s Office of Science and Office of Advanced Scientific Computing Research (ASCR) held a workshop called “Exascale and Beyond: Gaps in Research, Gaps in our Thinking” that brought together luminaries from the world of high performance computing to discuss research and practical challenges at exascale.
Given the breadth of discussion packed into the short event, we wanted to highlight a few noteworthy presentations that illustrate how researchers perceive the coming challenges of exascale computing. While all of the speakers addressed known exascale challenges, most brought their own research and practical experience from large HPC centers to bear.
For instance, Anant Agarwal, MIT professor of Electrical Engineering and Computer Science and director of the university's Computer Science and Artificial Intelligence Laboratory (CSAIL), asked attendees whether the current approach to exascale computing is radical enough.
Agarwal focused on the targets set forth by DARPA's Ubiquitous High Performance Computing (UHPC) program, claiming that the debate has centered on increasing performance while reducing energy, but that the challenges go far beyond energy alone. Agarwal argues that the other great hurdles lie in programmability and resiliency, and that arriving at solutions for these problems requires "disruptive research." Such research must confront the fact that getting two of the three big problems (performance, efficiency and programmability) right is relatively "easy"; getting all three right presents significant challenges.
NVIDIA's Bill Dally echoed some of Agarwal's assertions in his presentation, "Power and Programmability: The Challenges of Exascale Computing," in which he proclaimed the end of historic levels of scaling, citing challenges related to power and code.
In his presentation, Dally claimed that it's no longer about the FLOPs; it's about data movement. Further, it's not simply a matter of power efficiency as we traditionally think of it; it's about locality.
Dally argues that "algorithms should be designed to perform more work per unit data movement" and that "programming systems should further optimize this data movement." He went on to cite the fact that architectures need to facilitate data movement by providing an exposed hierarchy and efficient communication.
In some ways, Dally’s presentation offered some of the “disruptive” ideas Agarwal cited that can radicalize ways of thinking about exascale limitations. Dally’s focus on locality (optimizing data movement versus focusing on the FLOPs; optimizing subdivision and fetching paradigms; offering an exposed storage hierarchy with more efficient communication and bulk transfer) is a break from the norm in terms of offering solutions for exascale challenges—and one that generated rich fodder for the presentation, which you can find in detail here.
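The talks themselves did not include code, but the "more work per unit of data movement" idea Dally describes is the principle behind classic cache blocking. The sketch below is an illustrative example only, assuming a square matrix multiply; the matrix size `N` and tile size `B` are arbitrary choices, not figures from the presentation.

```c
#include <string.h>

#define N 64   /* matrix dimension (arbitrary, for illustration) */
#define B 16   /* tile size chosen so a BxB block fits in cache */

/* Naive triple loop: for each output element, a whole column of b is
 * streamed from memory, so the same data is re-fetched N times. */
static void matmul_naive(const double a[N][N], const double b[N][N],
                         double c[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Tiled version: operate on BxB blocks so each block of a and b is
 * reused many times while still resident in cache -- more arithmetic
 * performed per unit of data moved from main memory. */
static void matmul_tiled(const double a[N][N], const double b[N][N],
                         double c[N][N]) {
    memset(c, 0, sizeof(double) * N * N);
    for (int ii = 0; ii < N; ii += B)
        for (int kk = 0; kk < N; kk += B)
            for (int jj = 0; jj < N; jj += B)
                for (int i = ii; i < ii + B; i++)
                    for (int k = kk; k < kk + B; k++) {
                        double aik = a[i][k];
                        for (int j = jj; j < jj + B; j++)
                            c[i][j] += aik * b[k][j];
                    }
}
```

Both routines compute the same product; only the order of memory accesses changes, which is exactly the kind of locality-first restructuring Dally argues programming systems should do on the programmer's behalf.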
Locality was a hot-button issue at this workshop, drawing a detailed, solution-rich presentation from Allan Snavely, associate director of the San Diego Supercomputer Center and adjunct professor in UCSD’s Department of Computer Science and Engineering.
In his presentation, “Whose Job is it to Find Locality?” Snavely dug deeper into some of the initial concepts Dally put forth. Snavely recognized that people seem to be waiting on “magic” compilers and programming languages to come along, for application programmers to suddenly be rendered flawless, or for machines to simply let users choose how to burn up resources.
He claims that the attitude of "LINPACK has lots of locality, so what's the problem?" is at the root of the problem, as everyone waits for answers to locality challenges to fall out of the sky. In his presentation, Snavely proposes a few solutions, including a new approach to the software stack, found here.
In addition to moving the conversation out of the theoretical and into the realm of actual solutions, Snavely discussed how his UCSD team is currently developing tools and methodologies that can identify locality in applications so that processor frequency can be reduced for effective power savings and, further, is working on tools that can automate the process of inserting "frequency throttling calls" into large-scale applications.
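To make the idea concrete, the pattern such tools automate looks roughly like the sketch below. This is an illustrative assumption, not the UCSD tools' actual interface: `request_frequency` is a hypothetical hook that here only records the requested clock, whereas a real tool would call into the operating system's DVFS mechanism (e.g., Linux cpufreq).

```c
/* Hypothetical throttling hook (NOT the UCSD API): records the request
 * so the pattern can be exercised anywhere. A real implementation would
 * drive the OS frequency-scaling interface. */
static long current_khz = 2400000;          /* assumed nominal clock */
static void request_frequency(long khz) { current_khz = khz; }

/* The pattern being automated: drop the clock before a memory-bound
 * phase, where the core mostly waits on DRAM and extra cycles are
 * wasted, then restore full speed for compute-bound work. */
void stream_like_phase(double *dst, const double *src, int n) {
    request_frequency(1600000);  /* memory-bound: lower clock, small slowdown */
    for (int i = 0; i < n; i++)
        dst[i] = 2.0 * src[i];   /* bandwidth-limited copy/scale loop */
    request_frequency(2400000);  /* compute-bound phases get full speed back */
}
```

The hard part, and the point of Snavely's tooling, is identifying which regions are safely memory-bound so these calls can be inserted automatically rather than by hand.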
The Hunt for Perfect Solutions
The unstated theme of the workshop was undoubtedly out-of-the-box thinking about exascale challenges, the kind of "radical" approaches Agarwal and others touched upon. Still, it was useful to pick up on the practical and theoretical issues raised by other notables, including Thomas Sterling, who discussed exascale execution models; John Shalf, who highlighted the past, present and future of exascale computing; and Dave Resnick, who looked at the missing links that stand between today's systems and exascale computing.
Others provided real-world perspectives, including IBM Research Senior Manager Mootaz Elnozahy, in his presentation on lessons learned from the HPCS/PERCS project.
Aside from the presentations addressing research and practical challenges of exascale computing, Keren Bergman and Norman Jouppi addressed the future of photonics in the era of exascale, while Dave Resnick focused on memory (in this case Micron's new memory component, the Hybrid Memory Cube).
Taken collectively, the presentations offer evidence that we are moving beyond simple questions of power or performance and into the realm of disruptive approaches to programming and optimizing for exascale systems. Detailed slides and other materials to further the conversation can be found here.