The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
October 27, 2006
HPC and modern computing in general has a seemingly insatiable demand for more performance, better efficiency and scalability. Now with the expansion of computing to practically every commercial and non-commercial endeavor, an additional requirement is to apply these attributes to a much wider range of applications.
"So there's a need there for more types of processing," says Jeff Jussel, vice president of marketing and general manager of the Americas for Celoxica. "There's a number of ways the industry is addressing that -- with massively parallel processing (MPP) and with all sorts of different types of co-processors. At Celoxica, we believe that the FPGA represents a huge opportunity for co-processing, because it can deliver the massive parallelization, with the advantages of custom hardware, but in a way that is programmable."
But there are three things that you need in order for FPGAs to really take off:
"That's where the tools that Celoxica provides comes in," says Jussel. "That's our mission in life -- to make that FPGA programming transparent. And in doing so, enable FPGA use for the high performance computing market."
As part of this strategy, this week Celoxica announced a new off-the-shelf hardware and software compiler design bundle for high performance computing using HyperTransport (HTX) slots. The HTX bundle combines an intellectual property (IP) core for HTX connectivity, an FPGA-based HTX acceleration card and a software programming environment. The solution is designed to allow users to accelerate applications in Opteron-based computing systems with FPGA co-processing and HyperTransport technology. The bundle provides compilers that map C code onto FPGA hardware, a run-time OS (RTOS) for FPGA computing, and FPGA hardware that plugs into a host server system.
The hardware consists of the RCHTX acceleration card, which includes two Xilinx Virtex 4 FPGAs devices (in the future it will support more advanced Virtex 5 devices), 24 MB of QDR SRAM, and a range of I/O. The main co-processor FPGA is a 16 million gate device that is meant to run the user algorithms. The second FPGA is configured as a bridge, containing an HTX IP core developed by Celoxica. The bridge FPGA and IP provide the HyperTransport interconnect between the FPGA co-processor and the host processor system and memory space.
The software component consists of the DK Design Suite, which includes a C compiler for programming the FPGA co-processor, a board support package (BSP) and data communications drivers for the RCHTX card, a basic floating point library (single and double precision) and the software API which provides the interfaces.
The idea is not for the user to port their whole application to the FPGAs, just the compute-intensive algorithms that represent the workload bottlenecks. For example FFT calculations, a Black Scholes algorithm or wave migration calculations can be offloaded to the FPGA to take advantage of the parallel hardware resources.
The user replaces the algorithm loop in the original FORTRAN or C source with a Celoxica API call, which calls the C code that will be compiled into the FPGA. Jussel says the original algorithm needs to be "tweaked" somewhat to insert parallelism, but they've tried to simplify this as much as possible. The FPGA C compiler brings in the appropriate run-time pieces to make it work in its new hardware environment. At execution time, the data communication between the host processor and the FPGA is done across the HyperTransport connection, but this is transparent to the user.
Jussel notes that the product announced this week represents the first FPGA solutions that uses the HTX slot. DRC Computer Corporation has a somewhat similar solution, where its FPGA uses an Opteron socket to directly connect to HyperTransport. As it turns out, DRC is an OEM partner with Celoxica and makes use of the same C compiler technology.
Celoxica's current (beta) customers for the HTX solution are in the financial services, oil & gas, and life sciences industries. With this particular product, users have achieved a 200X performance improvement for the application (offloading a Black Scholes algorithm to the FPGA).
"We've done enough with the finance industry to know that the metric we need to hit is about a 10X price-performance benefit," says Jussel. "If we hit that 10X factor then it's worth it for them to invest in new technology. And we've been able to show quite a bit greater than 10X for all these applications."
Even though this HTX product includes the FPGA card, hardware is not Celoxica's main focus. The company's real goal is to be the leader in compilers and the run-time support for FPGAs. Jussel says that compared to other FPGA compiler companies, Celoxica is quite a bit larger and more established, having developed and matured its software technology over the past 10 years. It provides the compiler for the SGI RASC RC100 system as well as Cray FPGA systems, not to mention its large customer base in the embedded computing space -- still the majority of their business. But because FPGAs have this unique aspect of reconfigurability and high performance, the company believes that these devices will become ubiquitous throughout computing. And Celoxica wants to be there with their software.
Says Jussel: "We really want to be the Microsoft of FPGA computing. We want to provide the compilers and RTOS for that solution."
PSSC Labs PowerWulf Clusters Custom Configured HPC Solutions for Your Needs and Budget
PSSC Labs stands for Professional Service, Super Computers. Our mission bring a superior level of service and support to the HPC community.
FREE Download: "Going Parallel - An Implementation Guide"
Breakthrough performance for MATLAB®, Python and other desktop apps... Get 100X speedups, with less than 10% of the development time. Focus is on enabling familiar desktop tools to virtually execute on parallel servers, clusters, and grids.
The size and diversity of the HPC market in the United States supports a varied set of system providers and integrators. But in Europe, and the United Kingdom in particular, the market has a different shape.
Read More...
PRACE to evaluate petaflops prototypes; Acadamic roundtable discusses the computing industry's talent pool; and WRF benchmark data are released. John West recaps those stories and more in our weekly wrap-up.
Read More...
Since the first patent was issued for a Venetian statue in 1471, 60 million patents have been awarded around the world, with four million patents actively in force today worldwide. And 800,000 new inventions are registered every year. While the data is public, current search tools are inconvenient and inadequate to the needs of professionals. Semantic supercomputing techniques are helping researchers tackle this difficult challenge.
Read More...
Sep 05 | Uppsala University | Swedish researchers are revealing that "intelligent" computer-based methods for classifying patient samples is worthless when it comes to practical problems. Read more...
Sep 03 | Telegraph.co.uk | A new form of three dimensional scans could revolutionise brain surgery within a year, doctors claim. Read more...
Sep 03 | Linux Magazine | In HPC, most attention is paid to utilization and performance, rather than service availability and problem notification. This article focuses on the latter Read more...
Sep 03 | Nature News | What does it take to store bytes by the tens of thousands of trillions? Read more...
Sep 01 | Delaware Online | In the world of supercomputer-powered science, speed is everything, and an open road can lead to the promised land. Read more...
Sep 05 | | The excellent scalability features of Linux, in addition to robust security and performance makes it an excellent choice for server systems, especially in the high performance computing area.
Sep 01 | | The paper outlines the basic steps and tools involved in the process of migrating a desktop application to a parallel environment.
Jun 05 | | As pressure increases on the upstream seismic processing community to deliver ever-higher levels of productivity and efficiency, a new generation of storage solutions will be required that allow the maximum utilisation of high-performance computing (HPC) Linux cluster resources, together with the minimum of management overhead.
BlueArc's Titan architecture represents an evolutionary step in file servers by creating a hardware-based file system that can scale bandwidth, IOPS, and overall data capacity well beyond conventional software-based devices. With its ability to virtualize a massive storage pool of up to four usable petabytes of tiered storage, Titan can scale with growing data requirements, offering a competitive advantage for businesses, researchers, or other enterprises seeking to better manage data growth while still ensuring optimal performance.
Today, HPC organizations are requiring substantially more floating point performance to solve real-world problems. In this podcast, Ben Bennett, ClearSpeed General Manager, discusses how acceleration technology can improve the overall performance of standard x86-based systems...
Get updates and insights on the High Productivity Computing industry delivered driectly to your inbox.