The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
September 29, 2006
The commoditization of high performance computing has driven a dramatic expansion of this market in recent years. IDC reported year-over-year growth of 23 percent in 2005. High-volume x86 processors, memory chips, and open source software are all working to reduce the price of supercomputing. However, other forces are driving costs in the opposite direction. One that has gotten much press lately is the increasing amount of energy and cooling required to support all this "cheap" computing.
A significant portion of that energy is used to compute floating point operations -- the heart of high performance technical computing -- using relatively inefficient general-purpose processors. This explains the recent interest in floating point acceleration from ClearSpeed coprocessors, graphics processing units (GPUs), and the Cell processor. While much attention has been focused on the latter two in recent months, ClearSpeed is seen by some as the dark horse in the race to better floating point performance. According to Stephen McKinnon, ClearSpeed's new COO, the company is uniquely focused on coprocessor floating point acceleration for the HPC marketplace and believes it has the roadmap to stay ahead of potential rivals for the foreseeable future.
"We're in a position where there are lots of companies sniffing around our technologies," observes McKinnon. "Some of them are doing it because they're tire-kickers, others are doing it because they want to keep their finger on the pulse of what's going on and some are doing it because they have a clear and present need for better technology. I've been working very diligently with my business development organization to make sure we figure out which prospect is which."
ClearSpeed's commodity rivals, GPUs and the Cell processor, represent high-volume, low-cost solutions that offer outstanding levels of floating point performance. But ClearSpeed sees its own acceleration technology as quite different from these two solutions. Using the performance-per-watt mantra, the company is attempting to distance itself from its commodity competition.
McKinnon points out that the fundamental architectures of GPUs and the current Cell implementation are single precision floating point, not double precision, which is the standard in the HPC world. In addition, they are not low-power devices. Some of the latest ATI GPUs consume hundreds of watts of electricity.
"If you want 250 frames per second in 32-bit/pixel graphics so you can see your monsters explode and see the blood fly all over the screen -- gorgeous," McKinnon exclaims. "But if you want to put several thousand of these in a machine room doing seismic data exploration, it's the wrong device. You'll boil up every power supply you've got."
"GPUs are called GPUs because they're graphic processing units," adds McKinnon. "They have been beautifully designed for outstanding graphics acceleration. And you can use them to do floating point acceleration. But they aren't designed to do that and they don't do it half as well as a dedicated device does. In the same way, you wouldn't use a ClearSpeed board to do graphics acceleration."
In the case of the Cell, McKinnon believes it is actually less well-designed for acceleration than GPUs because it's a stand-alone processor. While ClearSpeed provides a dedicated coprocessor for CPU speed-up, the Cell code must be run independently from the host, complicating the software model.
"Cell has a similar legacy," says McKinnon. "Again, a great device for doing what it was designed to do. And obviously you can use it in other environments, but that's not what it was designed for."
ClearSpeed's Advance board, which hosts dual CSX600 coprocessors, represents the company's current HPC accelerator offering. Each CSX600 provides 25 GFLOPS of single or double precision performance while using only 10 watts of power. The coprocessor contains an array of 96 processing elements, each containing multiple processing units that have a high level of internal instruction and data parallelism. The Advance board provides 50 GFLOPS of performance and dissipates approximately 25 watts.
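To make the performance-per-watt argument concrete, the figures quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below uses only the numbers stated in this article; the variable and function names are illustrative, and the "board overhead" figure is simply an inference from the approximate 25-watt board total, not a number supplied by ClearSpeed.

```python
# Back-of-the-envelope efficiency figures from the numbers quoted in the article.
# All names here are illustrative; only the four constants come from the text.

CSX600_GFLOPS = 25.0   # per-chip performance quoted for the CSX600
CSX600_WATTS = 10.0    # per-chip power quoted for the CSX600
ADVANCE_GFLOPS = 50.0  # board performance (two CSX600 coprocessors)
ADVANCE_WATTS = 25.0   # board power, stated as "approximately" 25 watts
CHIPS_PER_BOARD = 2


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return performance per watt for a device."""
    return gflops / watts


chip_efficiency = gflops_per_watt(CSX600_GFLOPS, CSX600_WATTS)
board_efficiency = gflops_per_watt(ADVANCE_GFLOPS, ADVANCE_WATTS)

# Inferred, not quoted: power drawn by everything on the board other than
# the two coprocessors, assuming the approximate 25 W total is exact.
board_overhead_watts = ADVANCE_WATTS - CHIPS_PER_BOARD * CSX600_WATTS

print(f"CSX600 chip:   {chip_efficiency:.1f} GFLOPS/W")
print(f"Advance board: {board_efficiency:.1f} GFLOPS/W")
print(f"Implied non-coprocessor board power: {board_overhead_watts:.1f} W")
```

By these figures, each chip delivers 2.5 GFLOPS per watt and the full board roughly 2 GFLOPS per watt, with the difference attributable to the remaining board components.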