Continuing our coverage of the new Intel “Haswell” Xeon E5 v3 series chips, we turn our attention to a recent blog post from software acceleration specialists Xcelerit to see how the Haswell fares on a popular Monte-Carlo financial application.
The Haswell E5 family offers significant performance gains over the previous Ivy Bridge processors, but gauging how those gains carry over to real-world applications requires benchmark testing. The Xcelerit team ran its tests on the popular Monte-Carlo LIBOR Swaption Portfolio pricer.
Jörg Lotze, Xcelerit technical lead and co-founder, writes that “from a hardware perspective, the main changes between the two processor generations are the new AVX-2 instructions (including a fused multiply-add instruction), higher memory and cache bandwidths, and more processor cores.”
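For readers unfamiliar with the fused multiply-add Lotze mentions: it evaluates a*b + c as a single operation with one rounding step, rather than rounding the product before the addition. The short Python sketch below only illustrates that numerical behaviour and says nothing about speed. The helper `fma_emulated` is a hypothetical name that emulates the single rounding with exact rational arithmetic; it does not call the hardware instruction.

```python
from fractions import Fraction

def fma_emulated(a, b, c):
    """Emulate a fused multiply-add: form a*b + c exactly, then round once.

    Hypothetical helper for illustration; it does not use the CPU's FMA unit.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Inputs chosen so the exact product (1 - 2**-60) cannot be held in a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

separate = a * b + c           # product rounds to 1.0 first, so the sum is 0.0
fused = fma_emulated(a, b, c)  # one rounding at the end keeps the small term

print(separate)                # 0.0
print(fused == -2.0**-60)      # True
```

On hardware, the same fusion also replaces two instructions with one, which is where the throughput benefit for multiply-then-add code comes from.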
The table below gives a run-down on the two SKUs used in this experiment: Xeon E5-2697 v2 and the Xeon E5-2697 v3.
As Lotze explains, the application uses a Monte-Carlo simulation to price a portfolio of LIBOR swaptions, a very common type of financial derivative.
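To give a flavour of the technique, here is a deliberately simplified Python sketch of Monte-Carlo swaption pricing. It is not Xcelerit's benchmark code: it assumes each swaption's forward swap rate is lognormal (Black's model) rather than simulating a full LIBOR curve, and the portfolio values, function names and path count are all made up for illustration. The closed-form Black price is included only as a sanity check on the simulated result.

```python
import math
import random

def black_payer_swaption(annuity, forward, strike, vol, expiry):
    """Closed-form Black price, used here only as a reference value."""
    std = vol * math.sqrt(expiry)
    d1 = (math.log(forward / strike) + 0.5 * std * std) / std
    d2 = d1 - std
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return annuity * (forward * cdf(d1) - strike * cdf(d2))

def mc_payer_swaption(annuity, forward, strike, vol, expiry, n_paths, rng):
    """Average the discounted payoff over simulated terminal swap rates."""
    std = vol * math.sqrt(expiry)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # Lognormal terminal swap rate (assumption: Black dynamics).
        rate = forward * math.exp(-0.5 * std * std + std * z)
        total += max(rate - strike, 0.0)
    return annuity * total / n_paths

# Hypothetical portfolio: (annuity, forward swap rate, strike, vol, expiry in years)
portfolio = [
    (4.5, 0.030, 0.030, 0.20, 1.0),
    (8.0, 0.035, 0.040, 0.25, 2.0),
    (2.8, 0.025, 0.020, 0.30, 0.5),
]

rng = random.Random(42)  # fixed seed so runs are repeatable
mc_total = sum(mc_payer_swaption(*s, n_paths=100_000, rng=rng) for s in portfolio)
ref_total = sum(black_payer_swaption(*s) for s in portfolio)

print(f"Monte-Carlo portfolio value: {mc_total:.5f}")
print(f"Closed-form reference value: {ref_total:.5f}")
```

The inner loop is dominated by multiplications followed by additions over many independent paths, which is the kind of workload that parallelises across cores and vector lanes.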
Xcelerit ran the identical pricer on both the Intel Ivy Bridge and Haswell server CPUs with the following configuration:
+ CPU: 2 sockets, Ivy-Bridge (Xeon E5-2697 v2) & Haswell (Xeon E5-2697 v3)
+ HT: Hyper-threading enabled
+ OS: Red Hat Enterprise Linux 6 (64-bit)
+ RAM: 64GB
+ Development Tools: Xcelerit SDK 3.0.0a / ICC 15.0
Xcelerit recorded the full application time for each run; the results are detailed in the following chart:
As illustrated above, Haswell delivered significant speedups, as high as 2.42x in single precision and 1.63x in double precision.
“This is in line with the increase in the number of cores from 12 to 14 per chip, and the new fused multiply-add instruction. Many financial applications can benefit significantly from this instruction as multiplications followed by additions are very common. In single precision, a key Ivy-Bridge performance limiter of the test application is the cache bandwidth. The higher bandwidth of Haswell overcomes this limitation, explaining the high single-precision speedups.”
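As a rough back-of-the-envelope check on that explanation, the Python sketch below divides the reported peak speedups by the core-count ratio to see how much is left for the other factors Lotze cites. It assumes the application scales linearly with core count, which the source does not state, and the function name `residual_speedup` is hypothetical.

```python
def residual_speedup(total_speedup, cores_old, cores_new):
    """Speedup remaining after dividing out the core-count ratio.

    Assumes perfect linear scaling with cores (an assumption, not a measurement).
    """
    return total_speedup / (cores_new / cores_old)

core_ratio = 14 / 12                         # cores per chip, v3 vs v2
single = residual_speedup(2.42, 12, 14)      # reported single-precision peak
double = residual_speedup(1.63, 12, 14)      # reported double-precision peak

print(f"core-count ratio:          {core_ratio:.3f}x")  # 1.167x
print(f"single-precision residual: {single:.2f}x")      # 2.07x
print(f"double-precision residual: {double:.2f}x")      # 1.40x
```

Under that assumption, the extra cores alone account for only about 1.17x, so most of the gain, especially in single precision, would have to come from the fused multiply-add and the higher cache bandwidth.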