The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
June 18, 2009
Wondering how the new quad-core Intel Nehalem (Xeon 5500 series) and six-core AMD Istanbul (Opteron 2400 series) stack up against each other on HPC-style codes? The folks at Advanced Clustering Technologies, a company that builds customized HPC clusters from standard components, have been putting the latest high-end x86 silicon through its paces, and have generated some interesting results. Company engineers ran the High Performance Linpack (HPL) benchmark on comparable Nehalem- and Istanbul-based machines, and reported their findings on the firm's Web site.
Linpack, of course, is an artificial benchmark, but it is a decent measure of peak HPC performance on a given architecture and is the basis of the popular TOP500 list of supercomputers. Benchmarks in general are easy to misuse, though, so the HPC system buyer has to be aware of how they apply. (Our friend, Andy Jones, vice-president of HPC at the Numerical Algorithms Group, spells out how to use benchmarking to good effect in Thursday's ZDNet article.) Serious HPC buyers tend to use a variety of benchmarks to make procurement decisions, but Linpack is often the starting point.
For their HPL tests, the engineers at Advanced Clustering Technologies took some pains to match up the systems so as to provide an apples-to-apples comparison of CPUs. According to the post, written by cluster engineer Shane Corder:
All of the testing showed we could achieve the highest performance when using both the Intel Compilers and Intel Math Library -- even on the AMD system -- so these were used ... as the base of our benchmarks. The benchmarks were run on an Opteron 2435 Istanbul system (6 core 2.6GHz processor with 16GB of 800MHz DDR2) and a X5550 Nehalem system (quad core 2.66GHz processor with 12GB of 1333MHz DDR3). An attempt was made to keep the systems identical in every other way.
They did adjust the HPL problem size to compensate for the larger memory capacity on the Istanbul platform (16GB versus 12GB on the Nehalem system), such that the code would approach 100 percent of memory usage on each system.
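To make the sizing step concrete, here is a minimal sketch of how an HPL problem size N is commonly derived from installed memory: the benchmark factors a dense N x N matrix of double-precision values, so N scales with the square root of usable memory. The function name `hpl_problem_size`, the 90 percent `fraction`, and the block size of 128 are illustrative assumptions, not values reported by Advanced Clustering Technologies.

```python
import math

BYTES_PER_DOUBLE = 8  # HPL factors a dense N x N matrix of 64-bit floats


def hpl_problem_size(mem_gib, fraction=0.9, block=128):
    """Return the largest N, rounded down to a multiple of `block`,
    such that an N x N double-precision matrix fits in `fraction` of memory.

    `fraction` and `block` are hypothetical tuning values, not from the post.
    """
    usable_bytes = mem_gib * 2**30 * fraction
    n = int(math.sqrt(usable_bytes / BYTES_PER_DOUBLE))
    return n - n % block


# Memory sizes are the ones quoted in the post for each test system.
for name, mem_gib in [("Opteron 2435 (16GB)", 16), ("Xeon X5550 (12GB)", 12)]:
    print(name, hpl_problem_size(mem_gib))
```

Because N grows only with the square root of memory, the 16GB system ends up with a somewhat larger matrix than the 12GB system, not one that is a third larger.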
In a nutshell, Istanbul beat out Nehalem, 99.38 gigaflops to 74.03 gigaflops. It might not be too surprising that the six-core beat out the quad-core, but since Intel supports two threads per core with its so-called "hyperthreading" technology, one might surmise that Intel has the overall advantage in parallel computation. In practice though, a speed boost from hyperthreading is highly application-dependent. According to the engineers at Advanced Clustering Technologies, they actually noticed a decrease in performance when using hyperthreading while running HPL. They told me that Linpack is one of the few codes that does not benefit from this kind of technology.
Nehalem did turn out to be more computationally efficient (measured HPL performance as a fraction of theoretical peak), which they attributed to the higher memory bandwidth of DDR3 -- Istanbul uses DDR2 -- and less cache snooping. Users are not usually concerned with such metrics, but it does point to a better system balance in the Intel design.
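The efficiency comparison can be reproduced roughly from the figures in the post. The sketch below assumes dual-socket systems (a single socket's theoretical peak would fall below either measured result) and 4 double-precision floating point operations per core per clock cycle for both chips; neither assumption is stated in the post, and the function names are illustrative.

```python
def theoretical_peak_gflops(sockets, cores_per_socket, ghz, flops_per_cycle=4):
    """Peak double-precision gigaflops = sockets x cores x clock (GHz) x flops/cycle."""
    return sockets * cores_per_socket * ghz * flops_per_cycle


def hpl_efficiency(measured_gflops, peak_gflops):
    """Fraction of theoretical peak actually delivered on HPL."""
    return measured_gflops / peak_gflops


# Clock speeds, core counts and HPL results are from the post;
# the socket count of 2 is an assumption.
istanbul_peak = theoretical_peak_gflops(sockets=2, cores_per_socket=6, ghz=2.6)
nehalem_peak = theoretical_peak_gflops(sockets=2, cores_per_socket=4, ghz=2.66)

istanbul_eff = hpl_efficiency(99.38, istanbul_peak)
nehalem_eff = hpl_efficiency(74.03, nehalem_peak)

print(f"Istanbul: peak {istanbul_peak:.2f} GF, efficiency {istanbul_eff:.1%}")
print(f"Nehalem:  peak {nehalem_peak:.2f} GF, efficiency {nehalem_eff:.1%}")
```

Under these assumptions the Istanbul system lands at roughly 80 percent of peak and the Nehalem system at roughly 87 percent, which is consistent with the engineers' observation.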
The more telling metric is price-performance, which the AMD platform won hands down: $35.21/gigaflop for the Istanbul-based system versus $52.33/gigaflop for the Nehalem system. When you're talking teraflops, that difference adds up quickly.
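As a quick back-of-the-envelope check on how that gap scales, the sketch below converts the published dollars-per-gigaflop figures to dollars per teraflop and backs out the system price each figure implies. The implied prices are derived here from the post's numbers, not quoted by the vendor, and the variable names are illustrative.

```python
GIGAFLOPS_PER_TERAFLOP = 1000

# (dollars per gigaflop, measured HPL gigaflops), both taken from the post
systems = {
    "Istanbul": (35.21, 99.38),
    "Nehalem": (52.33, 74.03),
}

cost_per_teraflop = {}
implied_price = {}
for name, (usd_per_gflop, hpl_gflops) in systems.items():
    cost_per_teraflop[name] = usd_per_gflop * GIGAFLOPS_PER_TERAFLOP
    # Derived value: price = ($/gigaflop) x (gigaflops delivered)
    implied_price[name] = usd_per_gflop * hpl_gflops
    print(f"{name}: ${cost_per_teraflop[name]:,.0f} per teraflop, "
          f"implied system price ${implied_price[name]:,.0f}")

gap = cost_per_teraflop["Nehalem"] - cost_per_teraflop["Istanbul"]
print(f"Difference: ${gap:,.0f} per teraflop of HPL performance")
```

At these rates the gap works out to about $17,120 for every teraflop of Linpack performance purchased.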
As I mentioned before, the results here are all based on Linpack, so they won't necessarily reflect real-world HPC codes. It's quite likely that a quad-core Nehalem will outperform the six-core Istanbul on many applications, especially the ones that are memory-constrained or can benefit from Intel's hyperthreading architecture. Advanced Clustering Technologies says it hopes to run more HPC benchmarks in the future and intends to publish the results.
Posted by Michael Feldman - June 18 @ 3:09PM
There is 1 discussion item posted.
The benchmark is completely wrong.
Submitted by Patrick LEE on 06/18/2009 - 7:14PM
The benchmark results shown on http://www.advancedclustering.com/company-blog/high-performance-linpack-on-xeon-5500-v-opteron-2400.html are clearly incorrect.
The Linpack benchmark result gets better as the problem size becomes larger. As the Istanbul 2435 system has 4GB more memory, it runs a larger problem size N. One should take away the 4GB of extra memory and run the Linpack benchmark with the same problem size on both systems.
Conclusion: The benchmark is completely wrong.
Michael Feldman is the editor of HPCwire.