HPCwire

ClearSpeed Updates Advance Product Family


Updated Product Line Includes Performance and Functional Enhancements for CSXL Libraries, the Advance e620 PCIe Accelerator and the ClearSpeed Visual Profiler Toolset

BRISTOL, United Kingdom, May 1 -- ClearSpeed Technology, the world leader in acceleration technology for high performance computing (HPC), today announced new software and hardware enhancements to its Advance product family. The new offerings include performance and functionality enhancements to ClearSpeed CSXL software libraries, the Advance e620 PCI Express (PCIe) accelerator and the ClearSpeed Visual Profiler. Benchmarks using these enhanced CSXL libraries consolidate ClearSpeed's leadership in energy efficiency by delivering 20 times the performance per watt compared with industry standard servers when running the high performance LINPACK Benchmark(1).

The new 2.50 release of ClearSpeed's CSXL acceleration libraries introduces native support for Microsoft Windows and simplifies deployment with documentation updates and End User License Agreements. It provides a number of performance enhancements to the core linear algebra routines for matrix multiplication. Also included in the 2.50 release are the new ClearSpeed Vector Math Library and ClearSpeed Random Number Generators, which support additional functionality such as Monte Carlo simulation for option pricing in the financial services industry. Performance comparisons based on benchmark code for European option pricing provided by a major international bank showed a speedup of up to 20 times using a ClearSpeed Advance accelerator compared with an industry standard server(2). The use of multiple Advance accelerators in the system delivered a speedup of up to 100 times.

For scientific applications such as molecular modeling, recent results have demonstrated real-world application speedups of between 3.4 and 9.4 times with AMBER modules and 4.5 times with the Bristol University Docking Engine (BUDE) program(3).

On April 27 Cambridge Healthtech Institute's Bio-IT World announced that ClearSpeed Technology was one of three Best of Show finalists for the Information Technology Infrastructure category. Executive Editor of Bio-IT World John Russell will present the awards at the ceremony at 6:15 p.m. ET on May 1 at the Bio-IT World Conference & Expo in Boston.

"Large consumers of compute power are looking for ways to improve both their system performance and performance per watt," said Steve Conway, research vice president of technical computing systems at IDC. "There is strong and increasing interest in acceleration technologies that could deliver improved performance without exceeding power, cooling and facilities constraints. ClearSpeed's acceleration technology is making advances in this area."

Building on the success of ClearSpeed's current PCI-X-based Advance X620 accelerator, the introduction of the complementary and smaller form factor PCIe-based Advance e620 accelerator brings all the benefits of ClearSpeed's acceleration technology to the latest generation of multi-core industry standard servers that incorporate the PCIe standard. Together the existing Advance X620 and the Advance e620 significantly increase the number of server platforms that can take advantage of ClearSpeed acceleration.

For developers, the new ClearSpeed Visual Profiler toolset provides insight at every level of the system, including the interactions between multiple host processors and one or more ClearSpeed Advance accelerator boards. By delivering a consistent visual representation across the entire system, it provides the best possible environment in which to develop code that will perform optimally in today's multi-core and heterogeneous accelerated systems.

"The world's leading financial institutions and research organizations that depend upon the availability of compute power to maintain their competitive edge are struggling with the constraints of facilities space, power and cooling," said Stephen McKinnon, ClearSpeed's chief operating officer. "The enhancements to our product family are delivering three, five or even twenty times the application performance of unaccelerated systems, while adding less than five percent to the total energy bill. Acceleration technology is causing a radical rethink of datacenter design."

Performance Results

(1) LINPACK performance and performance per watt results

Comparative results

Accelerated cluster: 218.9 percent performance of standard system
Accelerated cluster: 53.6 percent less energy per job
Accelerated cluster: 5.3 percent more power (peak)
Accelerated cluster: 1.6 percent more power (average)

Standard node: 0.07 GFLOPS per watt
Accelerated node: 0.14 GFLOPS per watt, 2x energy efficiency of standard node
ClearSpeed X620: 1.37 GFLOPS per watt, 20x energy efficiency of standard node
ClearSpeed "Top Up": 4.95 GFLOPS per watt, 70x energy efficiency of standard node

ClearSpeed "Top Up" is defined as the additional performance delivered
for the additional average power consumption when compared with an
unaccelerated system.

Measured benchmark results

Standard Cluster: 114.8 GFLOPS, 40.8 minutes runtime
Power: 1900 W peak, 1722 W average, Energy: 0.29 kWh, 0.07 GFLOPS/W

ClearSpeed Accelerated Cluster: 251.3 GFLOPS, 18.7 minutes runtime
Power: 2000 W peak, 1750 W average, Energy: 0.14 kWh, 0.14 GFLOPS/W

Standard Node
Node: 28.7 GFLOPS, 431 W, 0.07 GFLOPS/W - base energy efficiency

ClearSpeed Accelerated Node
Node: 62.8 GFLOPS, 438 W, 0.14 GFLOPS/W - 2x base energy efficiency

ClearSpeed Advance X620 accelerator
X620: 34.1 GFLOPS, 25 W, 1.37 GFLOPS/W - 20x base energy efficiency

ClearSpeed "Top Up" additional performance for additional power
X620: 34.1 GFLOPS, 6.9 W, 4.95 GFLOPS/W - 70x base energy efficiency
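
The efficiency figures above can be re-derived from the published measurements. The following Python sketch is illustrative only and is not part of ClearSpeed's tooling; all variable names are assumptions made for this example. It works from the rounded values printed in this release, so the node power difference comes out at 7 W rather than the 6.9 W quoted, and the "Top Up" figure lands near 4.9 rather than 4.95 GFLOPS per watt. The energy figures reproduce only if the cluster energy is divided by the four nodes, which suggests (but does not confirm) that they are quoted per node.

```python
# Rounded figures copied from the measured benchmark results in this release.
STANDARD_NODE = {"gflops": 28.7, "watts": 431.0}
ACCELERATED_NODE = {"gflops": 62.8, "watts": 438.0}
X620_BOARD_WATTS = 25.0   # quoted power of one Advance X620 board
NODES = 4                 # nodes in each cluster


def gflops_per_watt(gflops, watts):
    """Energy efficiency as sustained GFLOPS divided by power draw."""
    return gflops / watts


def energy_kwh(average_watts, minutes):
    """Energy for one run: average power multiplied by runtime."""
    return average_watts * (minutes / 60.0) / 1000.0


base = gflops_per_watt(**STANDARD_NODE)
accelerated = gflops_per_watt(**ACCELERATED_NODE)

# Performance and power that the accelerator adds to a node.
extra_gflops = ACCELERATED_NODE["gflops"] - STANDARD_NODE["gflops"]
extra_watts = ACCELERATED_NODE["watts"] - STANDARD_NODE["watts"]

board = extra_gflops / X620_BOARD_WATTS   # efficiency of the board alone
top_up = extra_gflops / extra_watts       # "Top Up" as defined above

# Energy per run, divided by node count (assumption: quoted per node).
standard_energy = energy_kwh(1722.0, 40.8) / NODES
accelerated_energy = energy_kwh(1750.0, 18.7) / NODES
energy_saving = 1.0 - accelerated_energy / standard_energy

print(f"standard node:    {base:.2f} GFLOPS/W")
print(f"accelerated node: {accelerated:.2f} GFLOPS/W ({accelerated / base:.1f}x)")
print(f"X620 board:       {board:.2f} GFLOPS/W ({board / base:.0f}x)")
print(f"Top Up:           {top_up:.2f} GFLOPS/W ({top_up / base:.0f}x)")
print(f"energy per node:  {standard_energy:.2f} vs {accelerated_energy:.2f} kWh")
print(f"energy saving:    {energy_saving:.1%}")
```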

The LINPACK Benchmark was introduced by Jack Dongarra. It is used to solve a dense system of linear equations. For the Top500, a version of the benchmark is used that allows the user to scale the size of the problem and to optimize the software in order to achieve the best performance for a given machine. This performance does not reflect the overall performance of a given system, as no single number ever can. It does, however, reflect the performance of a dedicated system for solving a dense system of linear equations. Since the problem is very regular, the performance achieved is quite high, and the performance numbers give a good correction of peak performance. A parallel implementation of the LINPACK Benchmark and instructions on how to run it can be found at http://www.netlib.org/benchmark/hpl/.

System specifications

Base system: HP DL380 G5, CPU: Intel Xeon 5160 (Woodcrest) x 2 @ 3GHz
Memory: 14GB, Operating System: RedHat EL4 64
ClearSpeed Acceleration: Advance X620, CSXL 2.24, BLAS: Intel MKL 8.1.1
LINPACK parameters: Host assist: 25 percent, HPL.dat: N: 75000, NB: 1152

Standard cluster: 4 nodes, 0 ClearSpeed Advance X620
ClearSpeed accelerated cluster: 4 nodes, 4 ClearSpeed Advance X620 accelerator boards
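
The problem size in HPL.dat can be related to the published runtimes and to the memory in the cluster. The Python sketch below is a rough consistency check and not ClearSpeed's methodology. It assumes the usual leading-order operation count for solving a dense N x N system by LU factorization, about (2/3)N^3 floating-point operations with lower-order terms omitted, and it assumes the quoted "14GB" per node means 14 GiB. All names are chosen for this example.

```python
N = 75_000              # HPL problem size from the LINPACK parameters above
NODES = 4
NODE_MEMORY_GIB = 14    # assumption: "14GB" is read as 14 GiB per node

# Leading-order operation count for a dense LU solve (lower-order terms omitted).
approx_flops = (2.0 / 3.0) * N ** 3


def runtime_minutes(sustained_gflops):
    """Runtime implied by a sustained rate for the operation count above."""
    return approx_flops / (sustained_gflops * 1e9) / 60.0


standard_minutes = runtime_minutes(114.8)      # standard cluster rate
accelerated_minutes = runtime_minutes(251.3)   # accelerated cluster rate

# Storage for the N x N matrix in double precision (8 bytes per element).
matrix_gib = N * N * 8 / 2 ** 30
memory_fraction = matrix_gib / (NODES * NODE_MEMORY_GIB)

print(f"standard cluster:    {standard_minutes:.1f} minutes")
print(f"accelerated cluster: {accelerated_minutes:.1f} minutes")
print(f"matrix size:         {matrix_gib:.1f} GiB "
      f"({memory_fraction:.0%} of cluster memory)")
```

Under these assumptions the implied runtimes agree with the 40.8 and 18.7 minutes reported, and the matrix occupies roughly three quarters of the combined node memory.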

(2) Monte Carlo Simulation

Statistical methods such as Monte Carlo simulation are used by financial institutions to derive future prices of complex option models that cannot be easily modeled by algorithmic approaches such as the Black-Scholes model. ClearSpeed chose to demonstrate Monte Carlo simulation for European options so that both the acceleration could be demonstrated as well as the accuracy of the result when compared with the Black-Scholes method. The benchmark code was supplied by a well known global banking organization.
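
The comparison described above can be illustrated with a short Python sketch. This is not the bank's benchmark code and does not use the ClearSpeed libraries; it is a minimal single-threaded example using only the Python standard library. The option parameters (spot 100, strike 100, 5 percent rate, 20 percent volatility, one year to maturity), the sample count and the function names are all assumptions chosen for illustration. It prices a European call by simulating terminal prices under geometric Brownian motion and compares the estimate with the Black-Scholes closed-form price.

```python
import math
import random


def black_scholes_call(spot, strike, rate, vol, maturity):
    """Closed-form Black-Scholes price of a European call option."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def normal_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * normal_cdf(d1) - strike * math.exp(-rate * maturity) * normal_cdf(d2)


def monte_carlo_call(spot, strike, rate, vol, maturity, samples, seed=0):
    """Monte Carlo estimate of the same price from simulated terminal prices."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    total_payoff = 0.0
    for _ in range(samples):
        # One Gaussian draw gives one terminal price under risk-neutral GBM.
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total_payoff += max(terminal - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * total_payoff / samples


# Hypothetical option parameters, chosen only for illustration.
params = dict(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
analytic = black_scholes_call(**params)
estimate = monte_carlo_call(samples=200_000, **params)

print(f"Black-Scholes: {analytic:.4f}")
print(f"Monte Carlo:   {estimate:.4f}")
```

Each sample is independent of the others, which is why this kind of workload divides naturally across many processing elements or multiple accelerator boards.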

Monte Carlo simulation for European option pricing.

1 CPU, no acceleration: 400M samples, 60 seconds, Speedup 1x
1 Advance board: 400M samples, 2.9 seconds, Speedup 20x
2 Advance boards: 400M samples, 1.5 seconds, Speedup 40x
4 Advance boards: 400M samples, 0.8 seconds, Speedup 79x

System specifications

Base System: Dell 2880, CPU: 2 x 3.0GHz Xeon, Memory: 3 GB
ClearSpeed Acceleration: 1 to 4 ClearSpeed Advance X620
Host Compiler: gcc, libraries: Randc, random number generator: CGaussian
ClearSpeed Advance X620 Libraries: CS VML & CS RNG

(3) AMBER and Bristol University Docking Engine (BUDE) Performance Results

AMBER

To demonstrate application-level performance of accelerated systems, we have modified a set of Amber 9 methods to take advantage of ClearSpeed's Advance accelerator board. This includes the effective radius and force calculations of AMBER's Generalized Born (GB) models 1, 2 and 6. Supported options include constant pH7 and analytical linearized Poisson Boltzmann (ALPB) as well as options that do not directly change the force calculation, including NMR restraints.

While the genborn module of Amber is a small part of the sander executable, it typically accounts for 95-97 percent of the CPU compute time for GB simulations. The CPU compute time is mainly spent in three loops: effective radii calculations, diagonal force calculations and off-diagonal force calculations.
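
The share of compute time spent in the accelerated module sets an upper limit on the overall speedup, a relationship commonly known as Amdahl's law. The Python sketch below is a generic illustration and not a model of the Amber port itself; the function names and the kernel speedup of 10 are assumptions. It takes only the 95-97 percent range quoted above as input.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the original
    runtime is accelerated by `kernel_speedup` and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


def speedup_ceiling(fraction):
    """Limit of overall_speedup as the accelerated part becomes free."""
    return 1.0 / (1.0 - fraction)


for fraction in (0.95, 0.97):
    print(f"{fraction:.0%} accelerated: "
          f"ceiling {speedup_ceiling(fraction):.1f}x, "
          f"with a hypothetical 10x kernel {overall_speedup(fraction, 10.0):.1f}x")
```

With 95 percent of the time in the accelerated module the ceiling is 20 times, and with 97 percent it is about 33 times, so the time left on the host matters increasingly as the kernel itself gets faster.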

The overall structure of the code was maintained. A thin layer written in C, using ClearSpeed's CSAPI library, was added to handle the communication between the host and board.

Generalized Born 1: 83.5 minutes (host), 24.6 minutes (Advance X620), Speedup 3.39x
Generalized Born 2: 84.6 minutes (host), 23.5 minutes (Advance X620), Speedup 3.60x
Generalized Born 6: 37.9 minutes (host), 4.0 minutes (Advance X620), Speedup 9.35x

Host: 2.8GHz Pentium 4 EM64T, OS: RHEL4-64, CSXL: version 2.50

Bristol University Docking Engine (BUDE)

1 host CPU, no acceleration: 48.2 seconds, Speedup 1.0x
1 Advance board: 10.6 seconds, Speedup 4.5x
2 Advance boards: 5.8 seconds, Speedup 8.3x
3 Advance boards: 4.4 seconds, Speedup 11.0x

Host: 2 x 2.8 GHz Xeon, OS: RHEL4-64, CSXL: version 2.24
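
The BUDE timings also show how well the workload scales as boards are added. The Python sketch below recomputes the speedups from the published runtimes and adds a scaling efficiency relative to a single board. The efficiency metric and all names are assumptions introduced for this example; they are not figures reported by ClearSpeed.

```python
# Runtimes in seconds, copied from the BUDE results above.
HOST_SECONDS = 48.2
BOARD_SECONDS = {1: 10.6, 2: 5.8, 3: 4.4}

speedups = {}
efficiencies = {}
for boards, seconds in BOARD_SECONDS.items():
    # Speedup over the unaccelerated host CPU.
    speedups[boards] = HOST_SECONDS / seconds
    # Scaling efficiency: speedup over one board, divided by board count.
    efficiencies[boards] = (BOARD_SECONDS[1] / seconds) / boards

for boards in BOARD_SECONDS:
    print(f"{boards} board(s): speedup {speedups[boards]:.1f}x, "
          f"scaling efficiency {efficiencies[boards]:.0%}")
```

On these numbers the second board retains a little over 90 percent of ideal scaling and the third about 80 percent.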

About ClearSpeed

ClearSpeed Technology is a semiconductor company that develops massively parallel coprocessors, accelerator boards and software that deliver unmatched performance per watt for high performance computing applications in financial services, universities and national labs. ClearSpeed has offices in San Jose, California, and Bristol, UK and has 84 patents granted and pending. For more information, visit www.clearspeed.com.

-----

Source: ClearSpeed Technology

