A New Engine for Financial Analysis

By Bryce Mackin

May 25, 2007

Ten years ago, symmetric multiprocessing (SMP) and massively parallel processing (MPP) systems were the most common architectures for high performance computing. The popularity of these architectures has decreased with the emergence of a more cost-effective approach: cluster computing. According to the Top500 Supercomputer Sites project, cluster systems are now the most common architecture among the world’s highest performing computer systems.

Cluster computing has achieved this prominence largely because of its cost-effectiveness, and it is now being widely applied worldwide, including in financial analysis. Rather than relying on the custom processor elements and proprietary data pathways of SMP and MPP architectures, cluster computing employs commodity standard processors, such as those from Intel and AMD, and industry-standard interconnects such as Gigabit Ethernet and InfiniBand.

Applications suited to this cluster architecture are those that can be “parallelized,” or broken into sections and handled independently by one or more program threads running in parallel on multiple processors. Such applications are widespread in many areas, including the finance sector, where application software is now routinely delivered in “cluster aware” forms that can take advantage of high performance computing (HPC) architectures.

The Rise of HPC Cluster Computing

Clusters are becoming the preferred HPC architecture because of their cost effectiveness; however, these systems are starting to face challenges. The single-core processors used in these systems are becoming denser and faster, but they are running into memory bottlenecks and dissipating ever-increasing amounts of power. The memory bottleneck has two components: a limited number of I/O pins on the processor package that can be dedicated to memory access, and increasing latency due to multi-layered memory caches. Higher power consumption is a direct result of rising clock speeds, and it forces extensive cooling of the processors.

The increasing power demand of processors is of particular concern in financial datacenters, where calculations per watt are increasingly important. System cooling requirements are limiting cluster sizes, and therefore limiting achievable performance levels. Existing cooling systems are simply running out of capacity, and increasing that capacity carries a high price.

The industry’s current solution to these growing problems is a move towards multicore processors. Increasing the number of processor cores in a single package offers increased node performance at somewhat lower power dissipation than that of an equivalent number of single-core processors. But the multicore approach does not address the memory bottlenecks inherent in the packaging.

An alternative approach has arisen: using FPGA-based coprocessors to accelerate the execution of key steps in the application software. This approach is similar to coding an inner loop of a C++ application in assembly language to increase overall execution speed.

FPGAs typically run at slower clock speeds than the latest CPUs, yet they can make up for this with superior memory bandwidth, a high degree of parallelization, and extensive customization. An FPGA coprocessor programmed to execute key application tasks in hardware can typically provide a 2X to 3X system performance boost while reducing power requirements by 40 percent, as compared to adding a second processor or even a second core. Fine-tuning the FPGA to application needs can achieve performance increases greater than 10X.

FPGA Acceleration Opportunities

HPC applications span a wide spectrum with widely differing computational needs, and any acceleration solution must cover that spectrum. Each application demands a particular combination of mathematical and logical operators, coupled with efficient memory access.

It is difficult for general-purpose CPUs or specialized processors such as graphics processing units (GPUs) or network processors to provide an optimal solution across this broad spectrum of HPC applications. FPGAs, however, are reconfigurable engines: they can be optimized under software control to meet the particular requirements of an HPC application. This allows a single hardware solution to address many HPC applications with equal efficiency.

FPGAs accelerate HPC applications by exploiting the parallelism inherent in the algorithms employed. There are several levels of parallelism to address. A starting point is to structure the HPC application for multi-threaded execution suitable for parallel execution across a grid of processors. This is task-level parallelism, exploited by cluster computing. There are software packages available that can take legacy applications and transform them into a structure suitable for parallel execution.
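To make task-level parallelism concrete, here is a minimal C++ sketch (using standard threads rather than any particular grid or cluster middleware) in which independent scenario evaluations are farmed out across worker threads; the simulate_scenario workload is a hypothetical stand-in, not any vendor’s code:

#include <cmath>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for one independent unit of work; a real task
// would price one scenario or one instrument.
double simulate_scenario(int scenario_id) {
    return std::sqrt(static_cast<double>(scenario_id) + 1.0);
}

// Task-level parallelism: disjoint slices of independent tasks run on
// separate threads, mirroring how a cluster spreads work across nodes.
double run_tasks(int num_tasks, unsigned num_threads) {
    std::vector<double> results(num_tasks, 0.0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&results, t, num_tasks, num_threads] {
            for (int i = static_cast<int>(t); i < num_tasks;
                 i += static_cast<int>(num_threads))
                results[i] = simulate_scenario(i);
        });
    }
    for (auto& w : workers) w.join();
    return std::accumulate(results.begin(), results.end(), 0.0) / num_tasks;
}

Because each task writes only its own slice of the results, no locking is needed, and the same decomposition scales from threads on one node to processes across a grid.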

A second level of parallelism lies at the instruction level. Conventional processors support the simultaneous execution of a limited number of instructions. FPGAs offer deeper pipelining, and therefore can support a larger number of simultaneously executing “in-flight” instructions.
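The effect of deeper pipelining can be illustrated even in ordinary C++: rewriting a reduction with several independent accumulators breaks the serial dependency chain so that more additions can be in flight at once. This is a minimal sketch of the principle, not FPGA code:

// Instruction-level parallelism: the four partial sums are independent,
// so a pipelined execution unit can keep several additions in flight.
double sum_unrolled(const double* x, int n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; ++i) s0 += x[i];  // leftover elements
    return (s0 + s1) + (s2 + s3);
}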

Data parallelism is a third level that FPGAs can exploit. The devices have a fine-grained architecture designed for parallel execution and thus can be configured to perform a set of operations on a large number of data sets simultaneously. This parallel execution performs the equivalent work of numerous conventional processors in a single device.
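A data-parallel kernel applies the same arithmetic independently to every element, which is exactly what an FPGA fabric (or a vectorizing compiler) can replicate across many lanes at once. A minimal sketch, with a hypothetical rescaling operation standing in for real analytics:

#include <cstddef>
#include <vector>

// Data parallelism: each iteration is independent of every other, so
// the operation can be replicated across parallel hardware lanes.
void scale_and_offset(std::vector<double>& values, double scale, double offset) {
    for (std::size_t i = 0; i < values.size(); ++i)
        values[i] = values[i] * scale + offset;
}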

By exploiting all three levels of parallelism, an FPGA operating at 200 MHz can outperform a 3 GHz processor by an order of magnitude or more, while requiring only a quarter of the power. Commonly used signal processing algorithms, such as FFTs (Fast Fourier Transforms), show performance increases of 10X over the fastest CPUs.

Opportunities in Financial Analytics

One of the major markets where computing speed is an extremely important asset is financial analytics. A key application within this market is the analysis of “derivatives”: financial instruments such as options, futures, forwards, and interest-rate swaps. Derivatives analysis is a critical ongoing activity for financial institutions, allowing them to manage pricing, hedge risk, and identify arbitrage opportunities. The worldwide derivatives market has tripled in size in the last five years.

A common numerical method for derivatives analysis is Monte Carlo simulation in a Black-Scholes framework. The algorithm makes heavy use of floating-point operations such as logarithms, exponentials, square roots, and division, and these computations must be repeated over millions of iterations. Within the Monte Carlo simulation, the value of a derivative is estimated by computing the expected value, or average, of the values from a large number of different scenarios, each representing a different market condition.
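As a minimal sketch of the computation described above (not any institution’s production model), the following C++ routine prices a European call option by Monte Carlo under Black-Scholes dynamics; the parameter values in main are illustrative only:

#include <algorithm>
#include <cmath>
#include <iostream>
#include <random>

// Monte Carlo estimate of a European call price under Black-Scholes
// dynamics. Note the heavy use of exp, sqrt, and division per path.
double mc_european_call(double S0, double K, double r, double sigma,
                        double T, long num_paths, unsigned seed = 42) {
    std::mt19937_64 gen(seed);
    std::normal_distribution<double> z(0.0, 1.0);
    const double drift = (r - 0.5 * sigma * sigma) * T;
    const double vol = sigma * std::sqrt(T);
    double payoff_sum = 0.0;
    for (long i = 0; i < num_paths; ++i) {
        double ST = S0 * std::exp(drift + vol * z(gen));  // one scenario
        payoff_sum += std::max(ST - K, 0.0);              // call payoff
    }
    // Expected value: the average of the discounted scenario payoffs.
    return std::exp(-r * T) * payoff_sum / static_cast<double>(num_paths);
}

int main() {
    // Illustrative inputs: spot 100, strike 105, 5% rate, 20% vol, 1 year.
    std::cout << mc_european_call(100.0, 105.0, 0.05, 0.2, 1.0, 1000000) << "\n";
}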

The key point is the need for “a large number.” Because Monte Carlo simulation is based on the generation of a finite number of realizations using a series of random numbers (to model the movement of key market variables), the value of an option derived in this way will vary each time the simulations are run. The error between the Monte Carlo estimate and the correct option price is of the order of the inverse square root of the number of simulations. To improve the accuracy by a factor of 10, 100 times as many simulations must be performed.
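That 1/sqrt(N) convergence can be tracked directly while a simulation runs. Below is a minimal sketch using Welford’s running mean-and-variance update; feeding it each discounted payoff yields both the price estimate and its standard error, which shrinks by 10X only when the number of simulations grows by 100X:

#include <cmath>

// Running mean and standard error of a Monte Carlo estimate
// (Welford's algorithm). std_error() shrinks as 1/sqrt(n).
struct MCStats {
    long n = 0;
    double mean = 0.0, m2 = 0.0;
    void add(double x) {
        ++n;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }
    double std_error() const {
        return (n > 1) ? std::sqrt(m2 / (n - 1)) / std::sqrt(double(n)) : 0.0;
    }
};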

With so many iterations of intense computation required, derivatives analysis is clearly a prime candidate for HPC acceleration. Performance is not the only concern for this application, however; the ideal solution must also address some critical operational issues.

Accurate price estimates are critical for financial institutions, because inaccurate or compromised models create arbitrage opportunities for other players in the market. The algorithms and parameters used by analysts can thus vary widely across financial instruments and are constantly being tweaked and refined. For this reason, and for reasons of maintainability, financial analysts (“quants”) typically develop their algorithms in a high-level language such as C, Java, or MATLAB.

Because the accuracy of the analysis represents an edge in the market, a high degree of secrecy shrouds the exact algorithms employed by the quants. Disclosure of the algorithm details could expose billions of dollars to arbitrage risk. In addition, there are regulatory (SEC) requirements for verification and validation of risk-return claims made on financial instruments. It is therefore often not practical or advisable to modify, transform, re-factor, or optimize the application codes in order to speed execution.

The requirements for financial analytics are notably stringent: high-precision, math-intensive computation over millions of iterations, programmed in a high-level language that should not be re-factored or altered. Can FPGA coprocessors meet these challenges? Absolutely, but only if a complete solution is provided: an FPGA hardware platform, a high-level programming environment, and a library of key FPGA functions. These are now available as a new tool for accelerating and improving financial analysis.

About the Author

Bryce Mackin is strategic marketing manager for Altera’s computer and storage business unit, focusing on FPGA co-processing for the high performance computing market. In that role he has investigated and developed methods for accelerating HPC applications. Bryce has been in the computer and storage market for over 10 years, responsible for both product and technical marketing. Previously he worked for three years in a similar capacity at Xilinx. Prior to that, he held several product marketing roles at Adaptec Inc. and was marketing chairman for the Storage Networking Industry Association (SNIA) IP Storage Forum. Bryce has spent his career focused on achieving optimum performance for computing and storage applications.
