If you’ve been waiting for a formal evaluation of the IBM Power8 architecture on common financial workloads, look no further. According to results shared at the STAC Summit on June 4, an IBM Power8-based server delivered more than twice the performance of the best-in-class x86 counterpart when running a set of standard financial industry benchmarks.
Sumit Gupta, vice president of HPC and OpenPOWER operations at IBM, filled us in on the details. The certified STAC report, which was published in March, marked the first time the IBM Power8 architecture has gone through STAC-A2 testing. Developed by the user community, the STAC-A2 benchmark set represents a class of financial risk analytics workloads characterized by Monte Carlo simulation and “Greeks” computations.
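To give a sense of what a Monte Carlo “Greeks” computation involves, here is a minimal Python sketch. It is not the STAC-A2 workload or any vendor's implementation: the single-asset European call, the geometric Brownian motion model, the parameter values, and the function names `mc_call_price` and `mc_delta` are all assumptions chosen for illustration. The sketch estimates an option price by simulation, then estimates delta (the sensitivity of the price to the underlying asset's price) by re-pricing with a bumped spot on the same random draws.

```python
import math
import random


def mc_call_price(spot, strike, rate, vol, maturity, normals):
    """Monte Carlo price of a European call under geometric Brownian motion.

    `normals` is a list of standard normal draws, one per simulated path.
    """
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)
    discount = math.exp(-rate * maturity)
    total_payoff = 0.0
    for z in normals:
        terminal = spot * math.exp(drift + diffusion * z)
        total_payoff += max(terminal - strike, 0.0)
    return discount * total_payoff / len(normals)


def mc_delta(spot, strike, rate, vol, maturity, normals, bump=0.01):
    """Central finite-difference delta, reusing the same draws for both bumps."""
    up = mc_call_price(spot * (1 + bump), strike, rate, vol, maturity, normals)
    down = mc_call_price(spot * (1 - bump), strike, rate, vol, maturity, normals)
    return (up - down) / (2 * spot * bump)


# Illustrative (assumed) inputs: at-the-money call, 2% rate, 20% vol, 1 year.
rng = random.Random(42)
normals = [rng.gauss(0.0, 1.0) for _ in range(50_000)]

price = mc_call_price(100.0, 100.0, 0.02, 0.2, 1.0, normals)
delta = mc_delta(100.0, 100.0, 0.02, 0.2, 1.0, normals)
print(f"price ~ {price:.3f}, delta ~ {delta:.3f}")
```

Each Greek requires re-pricing under perturbed inputs, so the amount of simulation grows quickly with the number of assets and sensitivities. That is why workloads of this kind reward high thread counts and memory bandwidth.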
Compared to other publicly released results of warm runs on the Greeks benchmark (STAC-A2.β2.GREEKS.TIME.WARM), the two-socket Power8 server, outfitted with two 12-core 3.52 GHz Power8 processor cards, achieved:
- 2.3 times the performance of the comparable x86 setup, an Intel white box with two Intel Xeon E5-2699 v3 processors (Haswell EP) @ 2.30 GHz.
- 1.7 times the performance of the best-performing x86 solution, an Intel white box with two Intel Xeon E5-2699 v3 processors (Haswell EP) @ 2.30 GHz and one Intel Xeon Phi 7120A coprocessor.
- Only 10 percent less performance than the best-performing solution, a Supermicro server with two 10-core Intel Xeon E5-2690 v2 processors (Ivy Bridge) @ 3.0 GHz and one NVIDIA K80 GPU accelerator.
The Power server also set new records for path scaling (STAC-A2.β2.GREEKS.MAX_PATHS) and asset capacity (STAC-A2.β2.GREEKS.MAX_ASSETS). Compared to the best four-socket x86-based solution, a server with four Xeon E7-4890 v2 (Ivy Bridge EX) processors running at 2.80 GHz, the Power8 server delivered:
- 2.1 times the throughput.
- A 16 percent increase in asset capacity.
STAC’s test system was an IBM Power System S824 server with two 12-core 3.52 GHz POWER8 processor cards, equipped with 1 TB of DRAM and running Red Hat Enterprise Linux version 7. The solution stack consisted of the IBM-authored STAC-A2 Pack for Linux on Power Systems (Rev A), which used IBM XL, a suite for C/C++ developers that includes the C++ compiler and the Mathematical Acceleration Subsystem (MASS) libraries, along with the Engineering and Scientific Subroutine Library (ESSL).
In a blog post, Gupta writes that “STAC-A2 gives a much more accurate view of the expected performance as compared to micro benchmarks or simple code loops.”
Gupta used the occasion to go over some of the fundamental advantages of Power8. “First, every core in a POWER8 can be multithreaded eight ways so you can run 8 threads on a single core, enabling 96 threads on a 12-core CPU,” he told HPCwire. “Application scalability is also very good because of the way the processor is architected,” he added, “and memory bandwidth is much higher, allowing dramatically higher performance on a range of applications. Even single-thread performance can be higher compared to x86.”
In the blog piece, he reiterated that each core of the Power System S824 server runs up to eight simultaneous threads at 3.5 GHz, while the server’s memory bandwidth of 192 GB/s per socket “is almost three times the speed of a typical x86 processor.”
“These factors along with a balanced system structure including a large internal 8MB per core L3 are the primary reasons why financial computing workloads run significantly faster on POWER8-based systems than alternatives,” Gupta concludes.
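The figures Gupta cites can be combined with simple arithmetic. The short Python sketch below uses only numbers quoted in this article. The variable names are ours, and the system-wide bandwidth total assumes that per-socket bandwidth simply adds across both sockets, which is an idealized peak rather than a measured result.

```python
# Figures quoted in the article for the two-socket Power System S824.
SOCKETS = 2
CORES_PER_SOCKET = 12
THREADS_PER_CORE = 8            # eight-way simultaneous multithreading
MEM_BW_PER_SOCKET_GB_S = 192    # quoted memory bandwidth per socket
L3_PER_CORE_MB = 8              # quoted L3 cache per core

# Hardware threads available per socket and across the whole server.
threads_per_socket = CORES_PER_SOCKET * THREADS_PER_CORE
system_threads = SOCKETS * threads_per_socket

# Total L3 cache per socket.
l3_per_socket_mb = CORES_PER_SOCKET * L3_PER_CORE_MB

# Assumption: bandwidth adds linearly across sockets (idealized peak).
system_mem_bw_gb_s = SOCKETS * MEM_BW_PER_SOCKET_GB_S

# Bandwidth share per hardware thread if all threads on a socket are busy.
bw_per_thread_gb_s = MEM_BW_PER_SOCKET_GB_S / threads_per_socket

print(f"threads per socket:   {threads_per_socket}")
print(f"threads per server:   {system_threads}")
print(f"L3 per socket (MB):   {l3_per_socket_mb}")
print(f"peak bandwidth, both sockets (GB/s): {system_mem_bw_gb_s}")
print(f"bandwidth per thread (GB/s):         {bw_per_thread_gb_s}")
```

The threads-per-socket value reproduces the 96 threads Gupta mentions for a 12-core CPU. The other totals are derived, not quoted, and should be read as back-of-the-envelope estimates.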