HPCwire

Since 1986 - Covering the Fastest Computers
in the World and the People Who Run Them

Intel, AMD Gear Up for 2011 Server Chip Battle


Although 2010 still has a few months left to go, the competition in the x86 server processor arena for 2011 is already setting up to be a knock-down, drag-out fight. Both AMD and Intel are introducing new high-end server chips with revamped microarchitectures next year, and, at the same time, upping the core counts over their previous generation products. At a time when AMD is looking to make up lost market share, Intel is hoping to expand its dominance in the x86 server market.

This week at the Intel Developer Forum (IDF) extravaganza in San Francisco, Intel had the opportunity to provide some more tidbits about its next generation "Sandy Bridge" server processors, but chose to concentrate mostly on the client-side products and applications. This was a practical choice, given that the chipmaker is planning to launch two of its most interesting products later this year: the new "Tunnel Creek," Atom E600 SoC processors for embedded apps and the first "Sandy Bridge" processors with integrated graphics for PCs.

Sandy Bridge, which represents the 32nm-based microarchitecture upgrade from Nehalem, will end up in Xeon server parts as well, but these chips are not expected to ship until well into 2011. They'll be meeting AMD's 32nm "Interlagos" Opteron CPUs in roughly the same timeframe.

The first Sandy Bridge chips, which Intel talked up at IDF, are destined for desktop and laptop platforms and will sport two or four cores along with an integrated graphics engine. The new design will include a new high bandwidth, low latency "ring" interconnect that enables the integrated graphics unit to share cache with the CPU cores. In general, Intel's CPU-GPU design mimics AMD's Fusion processor architecture, also initially targeted to the PC market.

The idea is to bring at least low-end and mid-range graphics support on-chip, eliminating the need for an external GPU on the motherboard. The integrated graphics is being aimed at a rapidly growing set of applications for client platforms, including HD video, 3D visualization, mainstream gaming, multi-tasking and online socializing and multimedia.

However, despite the growing popularity of GPUs for technical computing, the next generation of server Xeons and Opterons is not going to have integrated graphics. Instead, the extra silicon real estate will be used for CPU cores. In the case of Sandy Bridge Xeons, expect to see up to 8 cores per chip, at least for the dual-socket version. AMD's Interlagos Opteron, meanwhile, will come in 12-core and 16-core flavors.

At IDF, Intel demonstrated a next-generation 8-core Xeon processor (presumably the Sandy Bridge EP, or equivalent) in a two-socket server, referring to it as the "Romley" platform. According to Intel, this was the first public showing for this platform since it booted up last month. Intel went on to say that those chips were on schedule for production in the second half of 2011.

The particular application being demonstrated on Romley was decrypting and encrypting three video conference streams simultaneously. Since they had HyperThreading enabled, the app had 32 threads to play with, which Intel chief Paul Otellini remarked was "pretty amazing" for a two-socket server.

Keep in mind that AMD's upcoming Interlagos chip will also support 32 threads in a two-socket box, but won't need anything like HyperThreading to pull it off. Interlagos is based on AMD's new Bulldozer core architecture, which doubles up on integer units inside a module. Interlagos has 8 Bulldozer modules, thus 16 cores per chip and 32 per 2P server.
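The thread arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration of the counts quoted above; the `Chip` class and `threads_per_server` helper are hypothetical names introduced here, not anything from Intel or AMD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Hypothetical record holding only the counts quoted in the article."""
    name: str
    cores: int             # physical cores per socket
    threads_per_core: int  # 2 with HyperThreading enabled, 1 without


def threads_per_server(chip: Chip, sockets: int = 2) -> int:
    """Hardware threads visible to software in a server with `sockets` chips."""
    return sockets * chip.cores * chip.threads_per_core


# 8-core Sandy Bridge Xeon with HyperThreading, as demonstrated on Romley.
sandy_bridge_xeon = Chip("Sandy Bridge Xeon", cores=8, threads_per_core=2)

# Interlagos: 8 Bulldozer modules, 2 integer cores per module, no SMT.
interlagos = Chip("Interlagos Opteron", cores=8 * 2, threads_per_core=1)

for chip in (sandy_bridge_xeon, interlagos):
    print(f"{chip.name}: {threads_per_server(chip)} threads in a 2P server")
```

Both chips arrive at 32 threads per two-socket box, but by different routes: the Xeon doubles 16 physical cores with HyperThreading, while the Opteron exposes 32 physical integer cores.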

AMD's John Fruehe noted that even though the company is moving up to Bulldozer, the Interlagos processors have the same thermal envelope and snap into the same socket (G34) as the previous generation Magny-Cours chip, providing an easy upgrade path for Opteron customers. (Sandy Bridge Xeons will almost certainly require a socket change.) AMD will be sampling Interlagos with its partners before the end of this year and launching it in 2011.

Not surprisingly, Fruehe believes his company has the edge in next year's CPU server battle, mainly because the Opterons will out-core the Xeons in a head-to-head match-up. That's true even today, where the 12-core Magny-Cours chip is dueling with the 6-core Westmere EP and 8-core Nehalem EX.

In AMD's own testing for Linpack performance, a two-socket Magny-Cours server easily outruns a two-socket Nehalem box. And although that benchmark matches up a previous generation quad-core Nehalem with a current generation 12-core Opteron, Fruehe said Magny-Cours would outperform the newer 6-core Westmere processors as well.

In fact, for a company that seemed rather unenthusiastic about multiplying cores just a few years ago, AMD can't seem to get enough of them now. And that seems to reflect customer demand too. According to Fruehe, customers, and especially HPC customers, are selecting systems with the 12-core version of Magny-Cours over the 8-core variant.

The company had anticipated that more users would opt for higher clock speeds and a better ratio of cores to memory/cache bandwidth, and so would naturally gravitate toward the 8-core version. As it turns out, a fair number of mainstream business customers did just that. But in HPC and elsewhere, there is a heavy preference for additional cores over clock speed.

"That bodes well for us as we get into 2011 because core counts go up again, from the 8 and 12 we have today to 12 and 16," said Fruehe. "It really feels like customers are dying for more cores, so that puts us in a real good position as we bring out the Bulldozer products."

That said, the core advantage for AMD's top-of-the-line server chips might not result in better floating point performance compared to their Intel counterparts. Both Sandy Bridge and Bulldozer will support expanded 256-bit floating point operations, accessible through the new AVX (Advanced Vector Extensions) instructions. The wider vectors will allow for up to two times the peak FLOPS throughput. But since each two-core Bulldozer module shares a single 256-bit floating point unit (as an aggregation of two 128-bit units), the Opterons will need twice as many cores to keep up with the Xeons when the application is using these extra-wide FP operations.
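A rough way to see why the core advantage evaporates is to count 256-bit floating point units per chip rather than cores. The sketch below does only that. It assumes one full 256-bit unit per Xeon core, which the comparison above implies but does not state outright, and it ignores clock speed, instruction mix and memory bandwidth. The function name is hypothetical.

```python
def fp256_units_per_chip(cores: int, cores_per_fp256_unit: int) -> int:
    """Number of 256-bit-wide FP units available on one chip."""
    return cores // cores_per_fp256_unit


# Assumption: each Sandy Bridge Xeon core has its own 256-bit FP unit.
xeon_units = fp256_units_per_chip(cores=8, cores_per_fp256_unit=1)

# From the article: two Bulldozer cores (one module) share one 256-bit unit.
interlagos_units = fp256_units_per_chip(cores=16, cores_per_fp256_unit=2)

# A 256-bit vector holds four 64-bit (double precision) values.
DOUBLES_PER_VECTOR = 256 // 64

print(f"8-core Xeon:        {xeon_units} units, "
      f"{xeon_units * DOUBLES_PER_VECTOR} doubles per vector op across the chip")
print(f"16-core Interlagos: {interlagos_units} units, "
      f"{interlagos_units * DOUBLES_PER_VECTOR} doubles per vector op across the chip")
```

Under these assumptions, a 16-core Interlagos and an 8-core Xeon end up with the same number of 256-bit units, which is the sense in which the Opteron needs twice the cores to keep pace on AVX-heavy code.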

Since none of these processors, not even the client versions, have been released into the wild yet, no specific performance data is available. AMD is promising 50 percent better performance on Interlagos compared to Magny-Cours. But that refers to absolute peak throughput; your application mileage will almost certainly vary. Intel has been mum on any performance numbers for Sandy Bridge, other than stating the obvious FP throughput boost for the 256-bit AVX instructions. In any case, 2011 will be here soon enough and we'll let the benchmarkers have at it.
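For a sense of what AMD's 50 percent figure would mean per core, here is a back-of-the-envelope calculation. It assumes the claim compares the 16-core Interlagos with the 12-core Magny-Cours, which AMD has not spelled out, so treat the result as illustrative only.

```python
from fractions import Fraction

# AMD's promised chip-level peak throughput gain (from the article).
chip_gain = Fraction(3, 2)

# Assumption: the comparison is 16-core Interlagos vs. 12-core Magny-Cours.
core_ratio = Fraction(16, 12)

# Whatever part of the gain is not explained by extra cores must come per core.
per_core_gain = chip_gain / core_ratio

print(f"Core count increase:   {float(core_ratio - 1):.1%}")
print(f"Implied per-core gain: {float(per_core_gain - 1):.1%}")
```

On that reading, about a third of the improvement comes from the extra cores alone, leaving a per-core peak gain of roughly one eighth.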
