The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
September 29, 2006
The commoditization of high performance computing has driven the expansion of this market rather dramatically in recent years. IDC reported year-to-year growth of 23 percent in 2005. High-volume x86 processors, memory chips, and open source software are all working to reduce the price of supercomputing. However, other forces are at work that are driving costs in the other direction. One aspect that has gotten much press lately is the increasing amounts of energy and cooling required to support all this "cheap" computing.
A significant portion of that energy is used to compute floating point operations -- the heart of high performance technical computing -- using relatively inefficient general-purpose processors. Hence the recent interest in floating point acceleration from ClearSpeed coprocessors, graphics processing units (GPUs) and the Cell processor. While much attention has been focused on the latter two in recent months, ClearSpeed is seen by some as the dark horse in the race to better floating point performance. According to Stephen McKinnon, ClearSpeed's new COO, the company is uniquely focused on coprocessor floating point acceleration for the HPC marketplace and believes it has the roadmap to keep it ahead of potential rivals for the foreseeable future.
"We're in a position where there are lots of companies sniffing around our technologies," observes McKinnon. "Some of them are doing it because they're tire-kickers, others are doing it because they want to keep their finger on the pulse of what's going on and some are doing it because they have a clear and present need for better technology. I've been working very diligently with my business development organization to make sure we figure out which prospect is which."
ClearSpeed's commodity rivals, GPUs and the Cell processor, represent high-volume, low-cost solutions that offer outstanding levels of floating point performance. ClearSpeed, however, sees its own acceleration technology as fundamentally different from both. Using performance per watt as its mantra, the company is attempting to distance itself from its commodity competition.
McKinnon points out that the fundamental architectures of GPUs and the current Cell implementation are single precision floating point, not double precision, which is the standard in the HPC world. In addition, they are not low-power devices. Some of the latest ATI GPUs consume hundreds of watts of electricity.
"If you want 250 frames per second in 32-bit/pixel graphics so you can see your monsters explode and see the blood fly all over the screen -- gorgeous," McKinnon exclaims. "But if you want to put several thousand of these in a machine room doing seismic data exploration, it's the wrong device. You'll boil up every power supply you've got."
"GPUs are called GPUs because they're graphic processing units," adds McKinnon. "They have been beautifully designed for outstanding graphics acceleration. And you can use them to do floating point acceleration. But they aren't designed to do that and they don't do it half as well as a dedicated device does. In the same way, you wouldn't use a ClearSpeed board to do graphics acceleration."
In the case of the Cell, McKinnon believes it is actually less well-designed for acceleration than GPUs because it's a stand-alone processor. While ClearSpeed provides a dedicated coprocessor for CPU speed-up, the Cell code must be run independently from the host, complicating the software model.
"Cell has a similar legacy," says McKinnon. "Again, a great device for doing what it was designed to do. And obviously you can use it in other environments, but that's not what it was designed for."
ClearSpeed's Advance board, which hosts dual CSX600 coprocessors, represents the company's current HPC accelerator offering. Each CSX600 provides 25 GFLOPS of single or double precision performance while using only 10 watts of power. The coprocessor contains an array of 96 processing elements, each containing multiple processing units that have a high level of internal instruction and data parallelism. The Advance board provides 50 GFLOPS of performance and dissipates approximately 25 watts.
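To put the performance-per-watt argument in concrete terms, the figures quoted above can be reduced to a single ratio. The short Python sketch below is illustrative only: the function and variable names are made up for this example, and the inputs are the vendor-quoted numbers from this article, not independent measurements.

```python
# Performance per watt from the figures quoted in the article.
# The helper name and variables are illustrative, not part of any ClearSpeed tool.

def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return floating point throughput per watt of power."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts

# One CSX600 coprocessor: 25 GFLOPS at 10 watts (as quoted).
csx600 = gflops_per_watt(25.0, 10.0)

# Advance board with two CSX600s: 50 GFLOPS at roughly 25 watts (as quoted).
advance_board = gflops_per_watt(50.0, 25.0)

print(f"CSX600 chip:   {csx600:.1f} GFLOPS/W")
print(f"Advance board: {advance_board:.1f} GFLOPS/W")
```

On these numbers the chip works out to 2.5 GFLOPS per watt and the full board to 2.0. The board-level ratio is lower because its quoted 25 watts exceeds the 20 watts of the two coprocessors alone; the article does not say where the remaining power goes.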