The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
January 06, 2009
Do It Yourself Hardware Acceleration
For all the benefits claimed for hardware acceleration, from exponential performance improvements to massive power and space savings, most of the discussion focuses on what can be accomplished, with little detail on how to accomplish it. Hardware acceleration always seems to carry the implied acronym DIY (Do It Yourself).
Most of the time, this means one of two things. The first is purchasing someone else's proprietary hardware and software, implementing algorithms at the far end of the system bus, and hoping that the partner's roadmap aligns with your evolving goals. The second is developing your own boards, custom hardware, custom software, custom interfaces, and custom protocols while maintaining expertise in all of these fields. In the second scenario, designers are not just taking advantage of hardware to accelerate their software; they are doing full hardware design, plain and simple.
These have been the obstacles to hardware acceleration for more than a decade, and most software developers have found it best to ride Moore's Law of continual improvement, waiting for the next-generation processor rather than venturing into the realm of hardware acceleration.
As has been well publicized, the door to continual improvement has closed; in turn, however, this has opened the door to hardware acceleration. Hardware acceleration will look vastly different five years from now than it does today. For those who are not watching it closely, it will probably look different next month. Any acceleration path must not only be revolutionary in what it provides, it must also be evolutionary!
From Revolutionary...
Let's start with the revolution! At its forefront has been the willingness of AMD and Intel to open their processor interconnects. This allows hardware to move from being an add-on attachment, isolated at the far end of a PCI bus or some other distant extension, to sitting next to the CPU as an equal. With AMD's Torrenza Initiative and Intel's QuickAssist Technology, hardware accelerators now have a low-latency way to communicate with the processor, as well as direct access to system memory.
On the accelerator side, both AMD and Intel have embraced the idea of in-socket accelerators (ISAs). By taking an FPGA, which is essentially configurable hardware that can be programmed to implement custom functions, and placing it on a board that plugs directly into a processor socket, existing multi-processor systems can now be converted to processor-and-accelerator systems without new board design. XtremeData's XD2000 modules, which use Altera's Stratix FPGAs, are an example of how users can leverage existing boards and systems, whether they are multi-CPU desktops, blade servers, or ATCA cards. Now the entire hardware acceleration platform can be developed with commercial off-the-shelf (COTS) components and boards.
When designing within this revolution, developers may choose to custom build their own bridge for their specific software/hardware interface. Besides requiring extensive work and an understanding of software/hardware co-design, this approach pushes development down to the physical implementation, where the designer's work is locked to a specific processor and in-socket accelerator. This not only limits the designer's flexibility to try different existing architectures but, more importantly, makes it difficult to upgrade as new processors, bridges, and in-socket accelerators become available.
With any newly embraced technology, things change quickly. Intel's processor interconnect is moving from the Front Side Bus (FSB) to the QuickPath Interconnect (QPI). AMD is riding the evolution of the HyperTransport (HT) interconnect standard, which is continually being updated. FPGAs are increasing dramatically in density as they move to 40nm process technology and beyond. In such an environment, any development work risks being too specific to a given technology node and being left behind as new process technologies emerge.
To Evolutionary...
To be evolutionary, a hardware accelerator must not only provide a bridge to the CPU, but one that can evolve with new technology. At the most basic level, hardware accelerators need to cloak the underlying hardware, so the processor talks software, the accelerator talks hardware, and yet they can easily communicate with each other.
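The idea of cloaking the underlying hardware can be illustrated with a small software sketch. This is only an illustration under stated assumptions, not any vendor's actual interface: the names AcceleratorBridge, SimulatedSocketBridge, SimulatedChunkedBridge, and run_application are hypothetical, and both "transports" are simulated in plain Python rather than talking to a real FSB, QPI, or HyperTransport link. The point it shows is that the application codes against one abstract bridge, so the transport underneath can be swapped without touching the application.

```python
from abc import ABC, abstractmethod
from typing import Callable

# A "kernel" stands in for the function implemented in the accelerator.
Kernel = Callable[[bytes], bytes]


class AcceleratorBridge(ABC):
    """Hypothetical interface: the only thing the application ever sees."""

    @abstractmethod
    def submit(self, payload: bytes) -> bytes:
        """Send a payload to the accelerator and return its result."""


class SimulatedSocketBridge(AcceleratorBridge):
    """Simulated transport that hands the whole buffer over in one transfer."""

    def __init__(self, kernel: Kernel) -> None:
        self._kernel = kernel

    def submit(self, payload: bytes) -> bytes:
        return self._kernel(payload)


class SimulatedChunkedBridge(AcceleratorBridge):
    """Simulated transport that moves data in fixed-size chunks.

    Assumes the kernel works element by element, so processing chunks
    independently gives the same result as processing the whole buffer.
    """

    def __init__(self, kernel: Kernel, chunk_size: int = 4) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self._kernel = kernel
        self._chunk_size = chunk_size

    def submit(self, payload: bytes) -> bytes:
        size = self._chunk_size
        chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
        return b"".join(self._kernel(chunk) for chunk in chunks)


def run_application(bridge: AcceleratorBridge, data: bytes) -> bytes:
    """Application code: unaware of which transport sits behind the bridge."""
    return bridge.submit(data)


if __name__ == "__main__":
    # bytes.upper is a toy, byte-wise stand-in for an accelerated function.
    for bridge in (SimulatedSocketBridge(bytes.upper),
                   SimulatedChunkedBridge(bytes.upper, chunk_size=4)):
        print(type(bridge).__name__, run_application(bridge, b"hello world"))
```

In this sketch, moving to a new interconnect would mean adding another AcceleratorBridge subclass while run_application stays unchanged, which mirrors the evolutionary property described above.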
