Reprising the 13 Dwarfs of OpenCL
If you’re considering using GPUs to speed up compute-intensive applications, it’s important to understand which algorithms work best on GPUs and other vector processors. As HPC expert and StreamComputing founder Vincent Hindriksen puts it, you want to know “what kind of algorithms are faster when using accelerators and OpenCL.”
Professor Wu Feng and a team of researchers at Virginia Tech elucidate this topic with their 2011 manuscript, titled “The 13 (computational) dwarfs of OpenCL.” The authors of that seminal paper explain that “each dwarf captures a pattern of computation and communication that is common to a class of important applications.”
This paper became an important resource for StreamComputing, and it remains a good starting point when considering the benefits of GPUs and OpenCL.
Hindriksen explains that the framework traces back to Phil Colella, who identified seven numerical methods important for science and engineering, the so-called “seven dwarfs.” Researchers later added six more application areas well-suited to GPUs and other vector-accelerated processors, and Feng’s team examined all 13 from the OpenCL perspective. That’s how the 13 “dwarfs” came to be. To place this in literary context, there are seven dwarfs in “Snow White” and 13 in Tolkien’s “The Hobbit.”
Hindriksen continues with his overview of the computational dwarfs of OpenCL. “Each part has a description of the ‘computational dwarf,’ examples of application areas and some words from the OpenCL perspective,” he writes. “It is not intended to be complete, but to be a starting point. You will notice overlap, as some algorithms have aspects of two or more — this also implies some problems have more solutions.”
The 13 computational dwarfs are as follows:
Dense Linear Algebra
Sparse Linear Algebra
Spectral Methods
N-Body Methods
Structured Grids
Unstructured Grids
Map-Reduce & Monte Carlo
Combinational Logic
Graph Traversal
Dynamic Programming
Backtrack & Branch-and-Bound
Probabilistic Graphical Models
Finite State Machines
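To give a flavor of one dwarf, here is a minimal sketch of the Map-Reduce & Monte Carlo pattern: estimating π by randomly sampling points and counting how many land inside the unit quarter-circle. Plain Python is used purely for illustration; in an OpenCL setting the map step would typically run as a kernel across work-items, with the reduction performed on-device or on the host.

```python
import random

def estimate_pi(samples: int, seed: int = 42) -> float:
    """Monte Carlo estimate of pi in map-reduce style:
    map each sample to a hit/miss flag, then reduce by summing."""
    rng = random.Random(seed)  # seeded for reproducibility
    # Map: draw a random point in the unit square,
    # emit True if it falls inside the quarter-circle.
    hits = map(lambda _: rng.random() ** 2 + rng.random() ** 2 <= 1.0,
               range(samples))
    # Reduce: count the hits and scale by the area ratio.
    return 4.0 * sum(hits) / samples

print(estimate_pi(100_000))  # converges toward pi as samples grows
```

Because each sample is independent, the map step is embarrassingly parallel — exactly the property that makes this dwarf a natural fit for GPUs.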
Hindriksen leaves the reader with a note on the importance of categorization. “Understanding which type of applications perform well, makes it easier to decide when to use GPUs and other accelerators or when to use CPUs,” he writes. If a candidate algorithm does not map to one of the 13 dwarfs, there’s a good chance it is not well-suited to OpenCL.
“[Furthermore] the right hardware can be better selected when you know the right job category,” notes Hindriksen. “So don’t just buy a Tesla when you want to start with GPGPU, as others have done. For example combinational logic is much faster on AMD GPUs. Not all of above algorithms work best on a GPU by definition – OpenCL on a CPU is a good choice when memory-bandwidth is not the bottleneck.”
StreamComputing is an international software development company based in the Netherlands that specializes in speeding up software using the power of GPGPU computing.