With OpenPOWER activity ramping up and IBM’s prominent role in the upcoming DOE machines Summit and Sierra, it’s a good time to look at how the IBM POWER CPU stacks up against the x86 Xeon Haswell CPU from Intel.
Citing the compute-intensive nature of today’s financial applications, the acceleration specialists at Xcelerit sought to compare IBM’s powerhouse with the Xeon Haswell on a popular financial workload. While Haswell touts about twice the peak performance of the POWER8 chip, putting chips through their real-world paces is a useful exercise that can reveal new insights.
The test was carried out using Intel’s Xeon E5-2698 v3 (Haswell) and IBM’s ISeries 8286-42A (POWER8) processors in the following system configuration:
CPU: 2x Intel Xeon E5-2698 v3 (HT on) and 2x POWER8 8286-42A (SMT on)
OS: RedHat Enterprise Linux 6.6 and Ubuntu 14.10
Compiler: Intel Compiler 15.0 and IBM Compiler 13.1
The test application, well known in the financial industry, uses Monte Carlo simulation to price a portfolio of LIBOR swaptions and simultaneously calculate the sensitivity of option prices to initial forward rates. These first-order derivatives, known as the “Greeks,” are essential for hedging and risk analysis. Thousands of possible paths are simulated and then the price and the Greeks are averaged across all paths to obtain the final results (mean price and mean Greeks).
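The "simulate paths, then average price and Greeks" pattern can be sketched in a few lines of Python. This is not Xcelerit's LIBOR swaption code: as a loudly simplified stand-in, it prices a single European call under geometric Brownian motion and estimates one Greek (delta) with a pathwise derivative. The function name price_and_delta and all parameter values are hypothetical, chosen only for illustration.

```python
import math
import random


def price_and_delta(s0, strike, rate, vol, maturity, n_paths, seed=0):
    """Monte Carlo mean price and mean delta of a European call (toy model).

    Assumption: geometric Brownian motion stands in for the LIBOR market
    model used in the actual benchmark.
    """
    rng = random.Random(seed)
    discount = math.exp(-rate * maturity)
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)

    price_sum = 0.0
    delta_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + diffusion * z)
        if s_t > strike:
            # Discounted payoff on this path.
            price_sum += discount * (s_t - strike)
            # Pathwise derivative of the payoff with respect to s0:
            # d(s_t)/d(s0) = s_t / s0 whenever the option ends in the money.
            delta_sum += discount * s_t / s0

    # Average across all paths to get the mean price and the mean Greek.
    return price_sum / n_paths, delta_sum / n_paths


if __name__ == "__main__":
    price, delta = price_and_delta(100.0, 100.0, 0.05, 0.2, 1.0, 100_000, seed=1)
    print(f"price={price:.3f} delta={delta:.3f}")
```

The benchmark computes 80 such sensitivities per run rather than one, but the structure is the same: each path contributes a payoff and its derivatives, and the final results are the averages.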
The engineering team compiled the application “with maximum optimization settings, fast math mode, and tuning for the target processor architecture.” They noted that the generated code leveraged low-level processor features, including vector extensions, fused multiply-add instructions and cache optimizations.
Tasked with simulating 15 trades and 80 forward LIBOR rates, each test machine calculated the portfolio’s price along with 80 Greek values. The engineering team recorded total runtimes, excluding the random number generation step, for both single and double precision modes.
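One way to time a simulation while excluding random number generation is to draw all the random inputs first and start the clock only afterward. The Python sketch below shows that measurement pattern under stated assumptions: the function timed_simulation and its placeholder payoff are hypothetical and are not taken from Xcelerit's test harness.

```python
import random
import time


def timed_simulation(n_paths, seed=0):
    """Return (mean result, seconds spent in the path loop only)."""
    rng = random.Random(seed)

    # Step 1 (not timed): generate every random number up front.
    normals = [rng.gauss(0.0, 1.0) for _ in range(n_paths)]

    # Step 2 (timed): the per-path computation itself.
    start = time.perf_counter()
    total = 0.0
    for z in normals:
        total += max(z, 0.0)  # hypothetical stand-in for a per-path payoff
    elapsed = time.perf_counter() - start

    return total / n_paths, elapsed


if __name__ == "__main__":
    mean, seconds = timed_simulation(100_000)
    print(f"mean={mean:.4f} time={seconds:.4f}s (RNG excluded)")
```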
This chart from Xcelerit plots the relative performance of Haswell versus POWER8:
Intel CPUs performed up to 3.7x faster at single precision; however, this disparity dropped to just 1.5x for double precision, the more common requirement for financial computing.
The team addresses the results, noting: “The rather large performance difference between single and double precision modes can be explained by the latency-bound nature of this application on Haswell. In single precision, twice the data can fit into the caches. This results in a higher utilization of Haswell’s vector units, hence the greater performance.”
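The "twice the data" point is simple arithmetic: a single-precision value takes 4 bytes and a double-precision value takes 8, so any fixed-size cache or vector register holds twice as many of the former. The short Python sketch below works through the numbers, assuming Haswell's 256-bit AVX2 registers and a 32 KiB per-core L1 data cache; the function name capacity is hypothetical.

```python
import struct

VECTOR_BITS = 256        # AVX2 register width on Haswell
L1D_BYTES = 32 * 1024    # assumed 32 KiB L1 data cache per core


def capacity(fmt):
    """Count how many values of a struct format fit in a vector and in L1D."""
    size = struct.calcsize(fmt)  # "=f" is 4 bytes, "=d" is 8 bytes
    return {
        "bytes_per_value": size,
        "values_per_vector": VECTOR_BITS // (8 * size),
        "values_in_l1d": L1D_BYTES // size,
    }


if __name__ == "__main__":
    print("single:", capacity("=f"))
    print("double:", capacity("=d"))
```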
Update: If this micro-benchmark test from Xcelerit left you wanting a more comprehensive evaluation of the Power architecture, the industry-standard STAC benchmark shows POWER8 outperforming x86 on a number of important financial workloads.