Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

Language Flags
October 14, 2013

Reprising the 13 Dwarfs of OpenCL

Tiffany Trader
13_computational_dwarfs_300x

If you’re considering using GPUs to speedup compute-intensive applications, it’s important to understand which algorithms work best with GPUs and other vector-processors. As HPC expert and founder of StreamComputing Vincent Hindriksen puts it, you want to know “what kind of algorithms are faster when using accelerators and OpenCL.”

Professor Wu Feng and a team of researchers at Virginia Tech elucidate this topic with their 2011 manuscript, titled “The 13 (computational) dwarfs of OpenCL.” The authors of that seminal paper explain that “each dwarf captures a pattern of computation and communication that is common to a class of important applications.”

This paper became an important resource for StreamComputing, and it remains a good starting point when considering the benefits of GPUs and OpenCL.

Hindriksen explains that the 13 dwarfs framework was inspired by Phil Colella, who identified seven numerical methods important for science and engineering, aka seven dwarfs. To this list, Feng and his team added six more application areas well-suited to GPUs and other vector-accelerated processors. That’s how the 13 “dwarfs” came to be. To place this in literary context, there are seven dwarfs in “Snow White,” and 13 in Tolkien’s “The Hobbit.”

Hindriksen continues with his overview of the computational dwarfs of OpenCL. “Each part has a description of the ‘computational dwarf,’ examples of application areas and some words from the OpenCL perspective,” he writes. “It is not intended to be complete, but to be a starting point. You will notice overlap, as some algorithms have aspects of two or more – this also implies some problems have more solutions.”

The 13 computational dwarfs are as follows:

Dense Linear Algebra

Sparse Linear Algebra

Spectral Methods

N-Body Methods

Structured Grids

Unstructured Grids

Map-Reduce & Monte Carlo

Combinational Logic

Graph Traversal

Dynamic Programming

Backtracking

Probabilistic Graphical Models

Finite State Machines

Hindriksen leaves the reader with a note on the importance of categorization. “Understanding which type of applications perform well, makes it easier to decide when to use GPUs and other accelerators or when to use CPUs,” he writes. If the candidate algorithm does not map to one of the 13 camps, then there’s a good chance it’s not suitable for OpenCL.

“[Furthermore] the right hardware can be better selected when you know the right job category,” notes Hindriksen. “So don’t just buy a Tesla when you want to start with GPGPU, as others have done. For example combinational logic is much faster on AMD GPUs. Not all of above algorithms work best on a GPU by definition – OpenCL on a CPU is a good choice when memory-bandwidth is not the bottleneck.”

StreamComputing is an international software development company based in the Netherlands that specializes in speeding up software using the power of GPGPU computing.