The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
May 26, 2006
ASPEED Software has ported its distributed computing solution, ACCELLERANT, to IBM's Blue Gene platform. With this solution, ASPEED will provide developers a means to introduce parallelism into applications for the Blue Gene architecture, without modifying the underlying algorithms or data structures. In particular, ACCELLERANT will become the first high-level solution that enables compute-intensive financial codes to take advantage of Blue Gene's supercomputing performance.
ASPEED characterizes ACCELLERANT as an application additive. There's no server or control program. You just add ACCELLERANT instrumentation into your source code. The application's algorithms and data structures are unaffected. In this sense, ACCELLERANT acts as a high-level language enhancement for parallelism support. Interfaces to most major programming languages (C, C++, FORTRAN, Java, VBA and VB.NET) are available.
Kurt Ziegler, ASPEED's executive vice president of Marketing and Product Management, says that many Wall Street applications are written as SMP codes and were not designed to run in a distributed computing environment like a cluster or a Blue Gene supercomputer, which in many ways is just a tightly wound cluster. But a lot of these applications -- everything from payroll processing to derivative trading analysis -- have plenty of untapped parallelism that can be exploited on a distributed processor architecture. And according to Ziegler, ACCELLERANT is particularly well suited to distributing compute-intensive financial services applications efficiently because of the way it supports loop parallelization.
"What we do is take single-thread or multi-thread code and transparently parallelize the loops such that the computation can be distributed across multiple CPUs within the box or across multiple boxes," says Ziegler.
To demonstrate how ACCELLERANT could scale a financial application, ASPEED used a single-thread portfolio pricing application, called AMBook. The application performs a pricing exercise on each portfolio option -- derivatives in this case -- to determine risk and pricing choices. In the end, you get the valuation of the whole portfolio, which may contain hundreds of thousands of options.
AMBook essentially loops through the options and performs a stochastic pricing/risk calculation on each one. The non-deterministic nature of the stochastic calculation requires a great deal of computational power. This technique is used extensively in options pricing, Monte Carlo analysis, and portfolio valuation/risk management applications.
So how does ACCELLERANT do it? Since individual options are independent of one another, the entire portfolio calculation can be parallelized by partitioning the options loop across multiple CPUs. To do this, you add ACCELLERANT instrumented code (what they call "adapters") around the loop. The adapters get turned into ACCELLERANT library calls that slice up the loop's iterations and distribute them across multiple CPUs. So if you have a thousand iterations and want to run them across ten physical CPU cores, the ACCELLERANT software automatically spawns ten copies of that loop, each with different loop-range parameters.
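The loop-range idea can be sketched in a few lines of Python. This is only an illustration of static range partitioning, not ACCELLERANT's actual API, which the article does not show: `split_range`, `price_option`, and `price_slice` are hypothetical names, and the pricing function is a trivial stand-in for the real stochastic calculation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n_iterations, n_workers):
    """Split 0..n_iterations into n_workers contiguous half-open ranges."""
    base, extra = divmod(n_iterations, n_workers)
    ranges = []
    start = 0
    for w in range(n_workers):
        # The first `extra` workers take one additional iteration each.
        stop = start + base + (1 if w < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def price_option(i):
    # Hypothetical stand-in for the per-option pricing/risk calculation.
    return float(i % 7)


def price_slice(bounds):
    """Run the original loop body over one sub-range of the options."""
    start, stop = bounds
    return sum(price_option(i) for i in range(start, stop))


# A thousand iterations spread over ten workers, as in the example above.
ranges = split_range(1000, 10)
with ThreadPoolExecutor(max_workers=10) as pool:
    portfolio_value = sum(pool.map(price_slice, ranges))
```

Each worker runs the same loop body over its own `(start, stop)` range, so the per-option logic is untouched. Threads are used here only to keep the sketch short. For CPU-bound work in Python, a process pool or separate machines would be the more realistic choice.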
"This is not a rewrite tool," says Ziegler. "We're not restructuring. Whatever your math or iteration was before, logically it will remain the same. We don't want to alter the guy's thinking. The only difference is that at execution time, we'll be running it across multiple CPUs, transparently and adaptively."
But it's not just a static scheduling algorithm. The code distribution across the CPUs also gets rebalanced while the program is running.
"The magic is that we do this dynamically," says Ziegler. "We're constantly monitoring the progress of the calculations. If one of the machines happens to be slower or the calculation happens to be more complex and is not getting as many [completed], it will rebalance the work so that all of it ends at the same time. So we're more like a real-time dispatcher."