Compilers and More: OpenACC to OpenMP (and back again)

Jun 29, 2016

In the last year or so, I’ve had several academic researchers ask me whether I thought it was a good idea for them to develop a tool to automatically convert OpenACC programs to OpenMP 4 and vice versa. In each case, the motivation was that some systems had OpenMP 4 compilers (x86 plus Intel Xeon Phi Knights Corner) and others had OpenACC (x86 plus NVIDIA GPU or AMD GPU), and someone wanting to run a program across both would need two slightly different programs. In each case, the proposed research sounded like a more-or-less mechanical translation process, something more like a sophisticated awk script, and that’s doomed from the start. I will explain below in more detail how I came to this conclusion.

A Comparison of Heterogeneous and Manycore Programming Models

Mar 2, 2015

The high performance computing (HPC) community is heading toward the era of exascale machines, expected to exhibit an unprecedented level of complexity and size. The community agrees that the biggest challenges to future application performance lie with efficient node-level execution that can use all the resources in the node. These nodes might be comprised of …

New Degrees of Parallelism, Old Programming Planes

Aug 28, 2014

Exploiting the capabilities of HPC hardware is now more a matter of pushing into deeper levels of parallelism than of adding more cores or overclocking. What this means is that the time is right for a revolution in programming. The question is whether that revolution should be one that torches the landscape or that handles things …

Parallel Programming with OpenMP

Jul 31, 2014

One of the most important tools in the HPC programmer’s toolbox is OpenMP, a standard for expressing shared memory parallelism that was published in 1997. The current release, version 4.0, came out last November. In a recent video, Oracle’s OpenMP committee representative Nawal Copty explores some of the tool’s features and common pitfalls. Copty explains …

A Data Locality Cure for Irregular Applications

Feb 18, 2014

Data locality plays a critical role in energy-efficiency and performance in parallel programs. For data-parallel algorithms where locality is abundant, it is a relatively straightforward task to map and optimize for architectures with user-programmable local caches. However, for irregular algorithms such as Breadth First Search (BFS), exploiting locality is a non-trivial task. Guang Gao, a …

Unleashing The Potential of OpenMP via Bottleneck Analysis

Feb 13, 2014

To capitalize on the computational potential of parallel processors, programmers must identify bottlenecks that limit their application. These bottlenecks typically chain performance, preventing an application from reaching its full potential. Performance analysis typically provides the data and insight necessary to identify opportunities for program optimization. Researchers at the Inderprastha Engineering College identify general bottlenecks for …

The Week in HPC Research

Mar 21, 2013

The top research stories of the week include an evaluation of sparse matrix multiplication performance on Xeon Phi versus four other architectures; a survey of HPC energy efficiency; performance modeling of OpenMP, MPI and hybrid scientific applications using weak scaling; an exploration of anywhere, anytime cluster monitoring; and a framework for data-intensive cloud storage.

OpenMP Takes To Accelerated Computing

Nov 27, 2012

OpenMP, the popular parallel programming standard for high performance computing, is about to come out with a new version incorporating a number of enhancements, the most significant one being support for HPC accelerators. Version 4.0 will include the functionality that was implemented in OpenACC, the accelerator API that splintered off from the OpenMP work, as well as offer additional support beyond that. The new standard is expected to become the law of the land sometime in early 2013.

OpenACC Starts to Gather Developer Mindshare

May 17, 2012

PGI, Cray, and CAPS enterprise are moving quickly to get their new OpenACC-supported compilers into the hands of GPGPU developers. At NVIDIA's GPU Technology Conference this week, there was plenty of discussion around the new HPC accelerator framework, and all three OpenACC compiler makers, as well as NVIDIA, were talking up the technology.

NVIDIA Pokes Holes in Intel’s Manycore Story

Apr 3, 2012

As NVIDIA's upcoming Kepler-grade Tesla GPU prepares to do battle with Intel's Knights Corner, the companies are busy formulating their respective HPC accelerator stories. While NVIDIA has enjoyed the advantage of actually having products in the field to talk about, Intel has managed to capture the attention of some fence-sitters with assurances of high programmability, simple recompiles, and transparent scalability for its Many Integrated Core (MIC) coprocessors. But according to NVIDIA's Steve Scott, such promises ignore certain hard truths about how accelerator-based computing really works.