Another Look at GPGPU

By Michael Feldman

April 13, 2007

Interest in general-purpose computation on GPUs (GPGPU) is at an all-time high. If you've been reading this publication for the last several months, you've no doubt noticed we've devoted quite a bit of coverage to the topic since the middle of 2006. The event that triggered this upsurge in interest was AMD's acquisition of ATI in July of 2006 and the subsequent announcement of a product strategy that would bring graphics processors into the mainstream of general-purpose computing. In the fall of 2006, NVIDIA revealed its own GPGPU strategy with its CUDA initiative.

The movement of GPUs toward mainstream computing has been under way for some time. Driven by the broader requirements of visualization and game software in recent years, graphics processors have been shifting toward a more general-purpose architecture; they're becoming more programmable and more CPU-like. Now, with both AMD and NVIDIA spinning a compelling tale of graphics processors as high performance parallel processing engines, the promise of cheap HPC has never seemed closer. But not everyone is cheerleading.

Ars Technica's Jon Stokes is one of those keeping his pompoms at his side. In a recent article he wrote: “Anybody's GPU, whether it's from NVIDIA or AMD/ATI, is a big, hot, power-hungry beast of a coprocessor that's designed to do one thing extremely well: real-time 3D rendering for games. In fact, we can be even more specific and call a GPU a 'Microsoft DirectX toaster.' These same DirectX toasters also just happen to offer significant speedups vs. a regular microprocessor for certain types of data-parallel workloads that are important in HPC.”

Speaking of NVIDIA specifically, he adds: “They have a floor wax that happens to taste pretty good, so they're trying to use it to break into the food business by marketing it as a dessert topping.”

OK. So Stokes is obviously not a fan. He doesn't reject the notion of general-purpose computing on GPUs outright; he just thinks the proper place for the current crop of GPUs is on the motherboards of gaming enthusiasts, not in the sockets of HPC servers. He brings up some of the downsides of doing HPC with graphics processors, namely high power usage, programming difficulty, vendor lock-in, and backward compatibility concerns. (He doesn't even mention the current lack of 64-bit floating-point support.) Most of these factors point to the current immaturity of the GPGPU world.

But the same disadvantages existed in x86 designs before competition, standard software libraries and tools, and processor technology advancements made that architecture suitable for supercomputing. Both AMD and NVIDIA understand these disadvantages well, and they're working to address them.

On the other hand, GPUs have to overcome a hurdle the x86 never faced: their reputation as specialized devices for graphics processing. Here, the success of GPUs in the game market cuts both ways. The high-volume chip production driven by huge demand from the game industry keeps prices low, which is an incentive to enter the HPC market. But the market pressure to target GPUs at visualization applications in some cases pushes the designs away from general-purpose computing. It's something of a Catch-22.

Some of this uneasiness is misplaced. All processors, even general-purpose CPUs, devote silicon to certain types of applications; for example, the SSE instructions on the x86 are aimed at (coincidentally) stream processing. Also, the GPU manufacturers will probably end up developing separate lines of GPGPU-oriented offerings as variants of their core graphics devices for gamers. Finding the proper balance between specialized and general-purpose technology will be the key.

There is a continuum of coprocessing specialization from FPGAs, to GPUs and Cell processors, to floating-point coprocessors like ClearSpeed boards. As you go from FPGAs (least specialized) to FP coprocessors (most specialized), prices go up, reflecting shrinking volume demand, but the difficulty of programming the devices decreases. Cell processors and GPUs sit somewhere in the middle and may represent a sweet spot for HPC acceleration, offering a high performance/price ratio and relatively easy, or at least attainable, programmability.

The bigger problem for GPUs may be PR. AMD and NVIDIA are going to have to convince system manufacturers and ISVs that graphics processors will be a mainstream technology. The hardest part will be developing a GPGPU software ecosystem around these devices. Game developers and HPC programmers live in different worlds. To get the HPC crowd interested, you have to stop talking about pixel shaders and DirectX and start talking about stream computing.

This is where companies like PeakStream and RapidMind can help. Their software development platforms are designed to hide the GPU's 'gaminess' from the programmer. In fact, the software interfaces in these platforms are such that the developer need not be concerned with the underlying processor hardware at all. At a somewhat lower level, AMD's CTM (“Close To Metal”) open hardware interface and NVIDIA's CUDA C compiler give programmers more direct access to the graphics processors' capabilities. We're just at the beginning of the software side of GPGPU, so it's too early to say what the best programming model is. But everyone agrees that raising the level of software abstraction will help drive GPUs into the mainstream.
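To make the abstraction argument concrete, here is a minimal sketch of what stream computing looks like to the programmer in CUDA C. The kernel and launch syntax are real CUDA; the SAXPY example itself is illustrative rather than drawn from any vendor's documentation.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Each thread computes one element of y = a*x + y. No pixel shaders,
// no DirectX -- just a C function executed across thousands of threads.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    const int n = 1 << 20;                 // one million elements
    const size_t bytes = n * sizeof(float);

    // Host-side data.
    float *h_x = (float *)malloc(bytes);
    float *h_y = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { h_x[i] = 1.0f; h_y[i] = 2.0f; }

    // Copy the data to the GPU, launch the kernel, copy the result back.
    float *d_x, *d_y;
    cudaMalloc((void **)&d_x, bytes);
    cudaMalloc((void **)&d_y, bytes);
    cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, h_y, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // cover all n elements
    saxpy<<<blocks, threads>>>(n, 2.0f, d_x, d_y);

    cudaMemcpy(h_y, d_y, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f (expected 4.0)\n", h_y[0]);

    cudaFree(d_x); cudaFree(d_y);
    free(h_x); free(h_y);
    return 0;
}
```

The point is not the arithmetic but the register: an HPC programmer sees arrays, threads, and memory copies, and never touches a graphics API. Platforms like PeakStream and RapidMind aim to hide even the kernel launch and data movement behind ordinary-looking array operations.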

As far as the suitability of graphics hardware for HPC servers goes, the biggest problem will be power usage. Since gamers were never that concerned about an extra 100 watts or so in their machines, energy efficiency was never much of a design issue. But if you want to start putting high-powered GPUs into already overheated server nodes, the devices are going to have to run a lot cooler.

Ars Technica's Stokes has something to say on this topic as well. In an article published this week, he posits that GPUs will have to become less energy hoggish to penetrate the HPC market. He believes that getting the devices onto 65 nm process technology may be a good way to start. In general, GPUs run a process technology cycle behind CPUs; the current NVIDIA G80 devices are at 90 nm. The GPGPU trend may create the incentive to bring graphics processors into the same technology cycle as their CPU counterparts. Certainly as AMD starts creating its CPU/GPU Fusion hybrid processors, that process synchronization will have to occur. And if Intel gets into the GPU game, it is almost sure to press its advantage in process technology for its graphics devices. This is just another example of how GPUs are becoming more CPU-like.

But it's not just that GPUs are becoming more like CPUs; it's that the applications are becoming more game-like, that is, more data parallel in nature. Seismic modeling, financial options pricing and computational biology are all examples of workloads that can be greatly accelerated with graphics processors today (a sketch of one such kernel follows below). The next generation of software designed for increasingly sophisticated pattern recognition, data mining, and data analytics is also going to be rather well-suited to the GPU architecture. If, in five years, all the interesting software requires data parallelism, graphics processors are likely to be the commodity hardware solution. So get those pompoms ready.
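Options pricing shows why these workloads map so naturally onto GPUs: every contract can be valued independently, one thread per option. Below is a hedged sketch of a Black-Scholes call-pricing kernel in CUDA C; the cumulative-normal approximation is the standard Abramowitz and Stegun polynomial, and the function names are my own rather than anything from a vendor SDK.

```cuda
#include <math.h>

// Cumulative normal distribution, via the Abramowitz & Stegun
// polynomial approximation (accurate to roughly 1e-7).
__device__ float cnd(float d)
{
    const float A1 =  0.31938153f,  A2 = -0.356563782f,
                A3 =  1.781477937f, A4 = -1.821255978f,
                A5 =  1.330274429f;
    float k = 1.0f / (1.0f + 0.2316419f * fabsf(d));
    float c = rsqrtf(2.0f * 3.14159265f) * expf(-0.5f * d * d) *
              (k * (A1 + k * (A2 + k * (A3 + k * (A4 + k * A5)))));
    return d > 0.0f ? 1.0f - c : c;
}

// One thread prices one European call option -- a purely data-parallel
// computation with no interaction between threads.
__global__ void blackScholesCall(int n, const float *S,  // spot prices
                                 const float *X,         // strike prices
                                 const float *T,         // years to expiry
                                 float r, float v,       // rate, volatility
                                 float *call)            // output prices
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float sqrtT = sqrtf(T[i]);
    float d1 = (logf(S[i] / X[i]) + (r + 0.5f * v * v) * T[i]) / (v * sqrtT);
    float d2 = d1 - v * sqrtT;
    call[i] = S[i] * cnd(d1) - X[i] * expf(-r * T[i]) * cnd(d2);
}
```

Host-side allocation and the kernel launch look just like the earlier sketch. Seismic kernels and sequence-matching codes in computational biology have the same shape: a large array of independent work items, which is exactly what a GPU's stream processors are built to consume.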

—–

As always, comments about HPCwire are welcomed and encouraged. Write to me, Michael Feldman, at [email protected].
