Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

March 5, 2013

Intel Xeon Phi Versus ‘Sandy Bridge’

Tiffany Trader

The Intel Xeon Phi has drawn comparisons to its accelerator-class brethren from NVIDIA (Kepler) and AMD (FirePro), but how does the Phi coprocessor measure up to its Xeon “Sandy Bridge” brand-mate? That is the topic of a recent blog from Xcelerit Senior Solution Architect Paul Sutton. The Phi coprocessor is tested against a pair of “Sandy Bridge” E5-2670 server processors, using the Monte-Carlo LIBOR Swaption Portfolio Pricing application as the benchmark.

Sutton starts with a rundown of the pertinent Xeon Phi 5110P stats. This x86-based manycore coprocessor has 60 cores, each supporting four hardware threads, for a total of 240 logical cores. The chip boasts a peak double-precision performance of one teraflop.
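
The arithmetic behind those headline numbers can be checked in a few lines. The following is a rough sketch, not something taken from Sutton's blog: the core and thread counts come from the paragraph above, while the clock speed, vector width and fused multiply-add factor are assumptions added here for illustration.

```python
# Back-of-the-envelope check of the Xeon Phi 5110P figures quoted above.
CORES = 60               # from the article
THREADS_PER_CORE = 4     # from the article
CLOCK_GHZ = 1.053        # assumption: clock speed is not stated in the article
DP_LANES = 8             # assumption: 512-bit vector unit / 64-bit doubles
FLOPS_PER_LANE = 2       # assumption: one fused multiply-add per lane per cycle


def logical_cores(cores, threads_per_core):
    """Number of hardware threads the chip exposes."""
    return cores * threads_per_core


def peak_dp_gflops(cores, clock_ghz, lanes, flops_per_lane):
    """Theoretical double-precision peak in GFLOPS."""
    return cores * clock_ghz * lanes * flops_per_lane


if __name__ == "__main__":
    print("logical cores:", logical_cores(CORES, THREADS_PER_CORE))
    peak = peak_dp_gflops(CORES, CLOCK_GHZ, DP_LANES, FLOPS_PER_LANE)
    print("peak DP GFLOPS: %.2f" % peak)
```

Under those assumptions the product lands just above 1,000 GFLOPS, which is consistent with the one-teraflop figure. It is a theoretical ceiling, not a measured rate.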

The benchmark algorithm comes from the world of quantitative finance. It’s a Monte-Carlo simulation that is used to price a portfolio of LIBOR swaptions (options on interest rate swaps). Sutton explains that “thousands of possible future development paths for the LIBOR interest rate are simulated using normally-distributed random numbers.” Each development path represents one Monte-Carlo path.
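
To give a feel for what one Monte-Carlo path involves, here is a minimal sketch in Python. It is not the benchmark code and not a full LIBOR market model: it assumes a single forward rate following driftless lognormal dynamics and a simple call-style payoff, and every name and parameter in it (`simulate_path`, `mc_price`, the 5% rate, the 20% volatility) is hypothetical.

```python
import math
import random


def simulate_path(f0, sigma, t, steps, rng):
    """Evolve one forward rate along a single Monte-Carlo path.

    Assumption: driftless lognormal dynamics, a big simplification
    of the multi-rate model a real swaption pricer would use.
    """
    dt = t / steps
    f = f0
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # normally distributed random number
        f *= math.exp(-0.5 * sigma * sigma * dt + sigma * math.sqrt(dt) * z)
    return f


def mc_price(f0, strike, sigma, t, discount, n_paths, steps=10, seed=42):
    """Average the discounted payoff max(F_T - K, 0) over n_paths paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        f_t = simulate_path(f0, sigma, t, steps, rng)
        total += max(f_t - strike, 0.0)
    return discount * total / n_paths


if __name__ == "__main__":
    # Hypothetical inputs: 5% forward rate, 5% strike, 20% volatility, 1 year.
    print(mc_price(0.05, 0.05, 0.2, 1.0, 0.95, n_paths=20000))
```

Because each path depends only on its own random draws, the outer loop over paths can be split across threads and vector lanes. That independence is the property that makes this kind of workload a natural fit for manycore hardware.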

The test is performed on an HP ProLiant SL250 server configured with two Intel Xeon E5-2670 processors (with eight cores each and hyperthreading disabled) and the Intel Xeon Phi 5110P coprocessor. The server has 64GB of RAM and runs Red Hat Enterprise Linux 6.2 (64-bit) with Intel Composer XE 2013.

The benchmark compares the performance of two Xeon E5-2670 processors to a single Xeon Phi. The application is run once on the two Sandy Bridge host CPUs (multi-threaded) and then again on the Xeon Phi coprocessor in offload mode, where the main executable runs on the host CPU and the Monte-Carlo computation is handled by the Phi chip.

Execution times are recorded for each target, and a chart depicts the speedup of the Phi over the Sandy Bridge pair in both single and double precision.

At 100K paths, the Intel Xeon Phi begins to surpass the performance of the two Sandy Bridge CPUs. At one million paths, the Phi is three times faster than the pair of E5s. Sutton observes that the slower Phi performance at lower path counts can be explained by “the added data transfers and the comparably low level of parallelism for a low number of paths (considering both vectorization and multi-threading).”
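
The crossover behavior can be illustrated with a toy cost model. This is only a sketch: it assumes the offload run pays a fixed overhead (standing in for the data transfers) on top of a per-path cost, and it ignores the reduced parallelism at low path counts that Sutton also mentions. The throughput and overhead values are hypothetical placeholders, not measurements from the benchmark.

```python
def offload_speedup(n_paths, host_rate, device_rate, fixed_overhead_s):
    """Speedup of the offloaded run over the host-only run.

    host_rate / device_rate: paths per second (hypothetical values).
    fixed_overhead_s: one-off cost of the offload, e.g. data transfers.
    """
    host_time = n_paths / host_rate
    device_time = fixed_overhead_s + n_paths / device_rate
    return host_time / device_time


def break_even_paths(host_rate, device_rate, fixed_overhead_s):
    """Path count at which both runs take the same time."""
    return fixed_overhead_s / (1.0 / host_rate - 1.0 / device_rate)


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the shape of the curve.
    HOST_RATE = 1.0e6      # paths per second on the host CPUs
    DEVICE_RATE = 3.0e6    # paths per second on the coprocessor
    OVERHEAD_S = 0.05      # fixed offload cost in seconds
    for n in (1_000, 10_000, 100_000, 1_000_000, 10_000_000):
        s = offload_speedup(n, HOST_RATE, DEVICE_RATE, OVERHEAD_S)
        print("%10d paths: %.2fx" % (n, s))
    print("break-even:", break_even_paths(HOST_RATE, DEVICE_RATE, OVERHEAD_S))
```

In this model the speedup starts below 1x, crosses 1x once the fixed overhead is amortized, and then approaches the ratio of the two throughputs as the path count grows. The shape matches the trend described above, but the specific numbers should not be read as a fit to the published chart.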

Interestingly, the speedup is more pronounced in double precision. For example, at 128K paths, single precision puts the Phi at 1.05x faster, and double precision puts it at 1.24x faster.
