HPCwire

The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing

HPCwire >> Features

Algorithm Leadership


Page:  1  of  4
1 | 2 | 3 | 4   All  »  

Imagine working as a clerk in an electronics store, and a customer comes up to you and says, "I'm very interested in one of your new 8.1 surround sound systems, but it has to be able to play eight-track tapes. I have a lot invested in those tapes, all stacked up in my garage. I'm not buying any of these new audio systems until they play my eight-track tapes in full surround sound."

It sounds too ridiculous to actually happen, but I hear the equivalent almost every day in high performance computing: some customer demands that the most current and advanced hardware runs algorithms that date from the Jimmy Carter administration, and runs them well.

Discarding Dated Algorithms

There have been some amazing developments in computer design (like low-cost vector floating point hardware), but they come with a catch: you'll have to change your idea about "best" algorithm if you really want to stay ahead in this game. In all the talk about how supercomputing helps us compete in the world economy, we often ignore one area: We cling to programming ideas that are decades out of date, and that intellectual inertia holds us back far more than the hardware does. It's not just that we keep legacy software a long time; it's that we hold onto legacy algorithms long past the time when they were a good match for their hardware. Discarding dated algorithms may be one of the most promising ways to advance our productivity.

The thing that makes algorithms go out-of-date is when the relative costs of computing resources changes dramatically. Imagine that the year is 1967. Single processors are very expensive, costing hundreds of dollars per hour. Programmers and users are relatively cheap; they cost a few dollars per hour. Memory is very expensive; you can spend a million dollars for a megabyte of RAM. Memory access is cheap... so cheap that the time it takes is hidden by the time the computer grinds away doing arithmetic. In that era, programming sophisticated algorithms to save memory and FLOPS made a lot of sense. Compare this situation to today, four decades later:

  • Scientists, engineers, and programmers now usually cost much more per hour than the cost per hour of the computer they are using.

  • A megabyte of RAM now costs pennies.

  • A processor now takes hundreds of times longer to read a number from main memory than it does to perform arithmetic on it with fifteen-digit accuracy.

  • Most of the cost in a computer design is in the bandwidth at every level of the memory hierarchy; the actual processing elements have become relatively cheap.

We can point to some specific places where it may be time to change our ideas about those classic "best" algorithms we learned in our computer science courses.

Sparse Equations: Go Ahead, Multiply by Zero

I recently saw a presentation by a researcher struggling to solve a sparse system of linear equations. He assumed that the many weeks he was spending to operate only on the nonzero entries in his system would lead to higher performance. Like all sparse system programmers, his intent was to spare the computer from having to multiply by zero values and add that zero result, an obvious waste of a perfectly good arithmetic unit. To his chagrin and frustration, he discovered he had made his program slower, because now he was spending much more time "pointer chasing"... and he complained bitterly that computer vendors did not know how to build computers correctly because it should not take long to get an index out of memory and use that index to look up the value.

Evidence is mounting that sparse solvers deserve reexamination. A paper published just this month by a prominent HPC researcher studied the use of FPGAs to reduce the time to solve sparse systems. A system of 4000 symmetric equations, 2 percent nonzero elements, took 67.3 seconds to solve on a dual-Xeon server using a sparse solver. After programming the FPGAs in VHDL and using other tools, the author was able to reduce the time to 33.8 seconds. Clever... but I noticed he did not mention how long it would have taken to simply solve it with the Linpack solver on that server, ignoring the sparsity and simply treating it as a dense, nonsymmetric system: Less than three seconds!

Let that soak in for a minute. Even with the unnecessary arithmetic on all those zero elements, even ignoring the symmetry, the "wasteful" dense method is over an order of magnitude faster. The researcher spent thousands of dollars of his time to force-fit an old algorithm into a modern architecture, and wound up with a solution that is over ten times slower and hundreds of times more costly to implement. We are simply optimizing the wrong things. So go ahead, multiply by zero. Complicating the control flow to save scalar operations no longer pays.

Page:  1  of  4
1 | 2 | 3 | 4   All  »  

HPCwire on Twitter

Article Tools

  • Print This Page
  • Bookmark This Article

Share Options

(Digg, Technorati, more)


Subscribe

Discussion

There are 0 discussion items posted.  

HPC in the Cloud Part 2
People to Watch 2010


Top Headlines

Intel Partners See 'Easy' Upgrade Path With Xeon 5600 Chips

Mar 18 | ChannelWeb | Westmere parts already showing up in HPC machines. Read more...

AMD: OEMs primed for Opteron 6100s

Mar 17 | The Register | But what about the tier ones? Read more...

Arrival of the Desktop Supercomputer

Mar 17 | Cadalyst Magazine | A new generation of workstations is changing the nature of technical computing. Read more...

Scheduling HPC In The Cloud

Mar 17 | Linux Magazine | Latest iteration of Sun Grid Engine able to tap into Cloud. Read more...

Tailoring Medicine with Supercomputers

Mar 16 | Bio-IT World | Biotech firm builds genetic models from patient data. Read more...

Featured Whitepapers

Virtualization for Aggregation And The vSMP Architecture™

Jan 12 | | In-depth look at vSMP Foundation server virtualization technology, technical implementation, use cases and capabilities. The technical whitepaper provides an architectural overview and details on the three vSMP Foundation products: vSMP Foundation for SMP, vSMP Foundation for Cluster and vSMP Foundation for Cloud.

Copper Cable Technologies for High Performance Computing

Jan 18 | | This white paper discusses Gore’s copper cable assemblies, and how they continue to exceed the standards for providing reliable, cost-effective solutions for high-performance computer applications.

Multimedia

Webcast: Virtualized Data Center Roundtable

Join this online panel discussion for live Q&A with leading industry experts, analysts, and end-users to discuss the latest innovations, best practices, barriers to implementation, and measurable benefits of server virtualization with a particular focus on today's real world solutions.

Webcast: Watch SC09 Birds of a Feather Video: Scalable Fault-Tolerant HPC Supercomputers

Learn about scalable fault-tolerant architectures and examples of energy efficient and scalable supercomputing clusters using dual QDR InfiniBand to combine capacity computing with network failover capabilities with the help of programming languages such as MPI and a robust Linux cluster management package.

Webcast: High Performance Computing for a Smarter Planet

LIVE@SCO9: The IBM team discusses new innovations in hardware, software and services that help clients better understand their workloads and get insight from their R&D efforts. Technology demonstrations include the soon-to-be-released Power7 HPC processor, the DCS990 system with 2.4 petabytes of storage, the xCAT management tool, secure HPC cloud computing and more. Winners of two HPCwire Readers' and Editors’ Choice Awards! Take the IBM virtual tour at SC09 or more information go online to: http://www-03.ibm.com/systems/deepcomputing/sc09.html

SC09 HPC in the Cloud

Newsletters

Stay informed! Subscribe to HPCwire email Newsletters.






HPC Job Bank


Featured Events

HPC User Forum DICE
2010 High Performance Computing Linux Financial Markets
Cloud Computing Expo
Cloud Lab
ESC
DEISA PRACE Symposium