The Securities Technology Analysis Center (STAC) issued a report Friday comparing the performance of Intel’s Cascade Lake processors with previous-gen Skylake under the STAC-A2 benchmarking suite, used by the financial services industry to test and evaluate computing platforms. In comparing benchmarking results of the 28-core Xeon Platinum 8280 (Cascade Lake) processors and the 28-core Xeon Platinum 8180 (Skylake) processors, the Cascade Lake solution computed Greeks 23-84 percent faster.
In results highlighted by Intel and certified by STAC, the Cascade Lake solution:
• Had 32% higher throughput and space efficiency (STAC-A2.β2.HPORTFOLIO.SPEED and STAC-A2.β2.HPORTFOLIO.SPACE_EFF, respectively)
• Was 41% faster in warm runs of the large problem size (STAC-A2.β2.GREEKS.10-100k-1260.TIME.WARM)
• Was 84% faster in cold runs and 23% faster in warm runs of the baseline problem size (STAC-A2.β2.GREEKS.TIME.COLD and STAC-A2.β2.GREEKS.TIME.WARM, respectively)
• Was able to handle 16% more assets in the max assets test (105 assets vs 90 assets) (STAC-A2.β2.GREEKS.MAX_ASSETS)
Both the Cascade Lake and the Skylake servers had patches applied for the purpose of mitigating Spectre and Meltdown security threats.
Key elements of the Cascade Lake test machine:
• Intel Parallel Studio XE 2018 update 3
• Red Hat Enterprise Linux Server 7.6
• 2 x Intel Xeon Platinum 8280 processor @ 2.70GHz
• 768GB DRAM: 12 x 64GB DDR4 DIMMs @ 2933MHz
• Intel Purley 2S Software Development Platform
Key elements of the Skylake test machine:
• Intel Parallel Studio XE 2018 update 3
• Red Hat Enterprise Linux Server 7.6
• 2 x Intel Xeon Platinum 8180 processor @ 2.50GHz
• 192GB DRAM: 12 x 16GB DDR4 DIMMs @ 2666MHz
• Intel Purley 2S Software Development Platform
While the Cascade Lake server has considerably more DRAM, Intel commented, “We do not believe the fact that the Cascade Lake system had more memory than the Skylake system contributed materially to the performance delta between the solutions. The workloads in these benchmarks were not bottlenecked on memory capacity.”
You can see how the same Skylake hardware performed without the Meltdown/Spectre patches in Sept. 2017 testing (SUT ID: INTC170920), but with software changes over the last 19 months, it’s not an apples-to-apples comparison (the newer test stack performs better). The April 2019 benchmark implementation (SUT ID: INTC190401) used Intel Parallel Studio XE 2018 Update 3, which includes Intel Threading Building Blocks 2018 Update 3, Intel MKL 2018 Update 3, and Intel MPI 2018 Update 3.
According to STAC, STAC-A2 is the technology benchmark standard based on financial market risk analysis. “Designed by quants and technologists from some of the world’s largest banks, STAC-A2 reports the performance, scaling, quality, and resource efficiency of any technology stack that is able to handle the workload (Monte Carlo estimation of Heston-based Greeks for a path-dependent, multi-asset option with early exercise).”
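For readers unfamiliar with the workload, the sketch below gives a rough flavor of what “Monte Carlo estimation of Heston-based Greeks” involves. It is a heavily simplified illustration, not the STAC-A2 specification or any vendor’s implementation: it prices a single-asset European call (no path dependence, no early exercise, one asset rather than several) and estimates only delta by bumping the spot price. The function names, the full-truncation Euler discretization, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def heston_call_price(s0, v0, strike, rate, kappa, theta, xi, rho,
                      n_paths, n_steps, horizon, seed):
    """Monte Carlo price of a European call under the Heston model.

    Uses a full-truncation Euler scheme (an illustrative choice): negative
    variance values are floored at zero wherever the variance is used.
    """
    rng = random.Random(seed)  # fixed seed: bumped runs reuse the same random draws
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    ortho = math.sqrt(1.0 - rho * rho)
    payoff_sum = 0.0
    for _ in range(n_paths):
        log_s, v = math.log(s0), v0
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)                      # drives the asset price
            z2 = rho * z1 + ortho * rng.gauss(0.0, 1.0)   # drives variance, correlated with z1
            v_pos = max(v, 0.0)
            log_s += (rate - 0.5 * v_pos) * dt + math.sqrt(v_pos) * sqrt_dt * z1
            v += kappa * (theta - v_pos) * dt + xi * math.sqrt(v_pos) * sqrt_dt * z2
        payoff_sum += max(math.exp(log_s) - strike, 0.0)
    return math.exp(-rate * horizon) * payoff_sum / n_paths


def delta(s0, bump=0.01, **params):
    """Central-difference delta: reprice with the spot bumped up and down."""
    up = heston_call_price(s0 * (1.0 + bump), **params)
    down = heston_call_price(s0 * (1.0 - bump), **params)
    return (up - down) / (2.0 * s0 * bump)


# Hypothetical parameters; 252 steps mirrors one step per trading day in a year.
params = dict(v0=0.04, strike=100.0, rate=0.02, kappa=1.5, theta=0.04,
              xi=0.3, rho=-0.7, n_paths=1000, n_steps=252, horizon=1.0, seed=42)
```

Calling `delta(100.0, **params)` returns an estimate of the option’s sensitivity to the spot price. Every Greek computed this way costs additional full simulations, which gives some sense of why the real benchmark, with multiple assets, more Greeks, and early exercise, is so compute-intensive.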
The baseline benchmark (STAC-A2.β2.GREEKS.TIME) computes Greeks with five assets, 25,000 paths, and 252 timesteps, one for each trading day over the course of one year. The test is executed five times, resulting in one cold run and four warm runs. As STAC literature explains, “a cold run simulates a deployment situation in which a risk engine starts up in response to a request [while a] warm run simulates a case in which an engine is already running, with sufficient memory allocated to handle the request.” Another way to look at this is that the cold run stresses the entire application (including initialization and memory allocation), while a warm run exercises only the computationally intensive portion of the application.
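The cold/warm distinction can be illustrated with a small timing harness. This is only a sketch of the general idea, not STAC’s test harness: the names `run_benchmark`, `allocate_buffers`, and `fill_paths` are hypothetical, and the workload is a trivial stand-in for the actual Greeks computation.

```python
import time


def run_benchmark(setup, workload, runs=5):
    """Time `runs` executions of `workload`.

    The first execution also pays for `setup` (the cold run); later
    executions reuse the already-initialized state (the warm runs).
    Assumes `setup` returns something other than None.
    """
    timings = []
    state = None
    for _ in range(runs):
        start = time.perf_counter()
        if state is None:
            state = setup()  # initialization and allocation happen only once
        workload(state)
        timings.append(time.perf_counter() - start)
    return {"cold": timings[0], "warm": timings[1:]}


def allocate_buffers(n_paths=2_500, n_steps=252):
    """Stand-in for engine start-up: allocate one row per simulated path."""
    return [[0.0] * n_steps for _ in range(n_paths)]


def fill_paths(buffers):
    """Stand-in for the compute-heavy portion: overwrite every timestep."""
    for path in buffers:
        for t in range(len(path)):
            path[t] = t * 0.5
```

With five runs, `run_benchmark(allocate_buffers, fill_paths)` returns one cold timing and a list of four warm timings, matching the one-cold, four-warm structure described above.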
The full report, which delivers nearly 200 results in total, is accessible to STAC members.