January 27, 2009

AMD Flexes Its Quads

Michael Feldman

As promised, AMD has added a raft of new 45nm quad-core “Shanghai” Opterons to its product line.
The new chips include five energy-sipping HE processors, which run at speeds from 2.1 to 2.3 GHz and draw just 55 watts. The company also released two new high-performance SE processors at 2.8 GHz, consuming 105 watts. The new chips come in both two-socket and multi-socket (four- and eight-socket) flavors and can be plugged into existing motherboards with just a BIOS upgrade.
This is all pretty much standard operating procedure for AMD chip rollouts: launch a new architecture with mid-range processors and then backfill the line with low-power and high-performance parts. Typically the SE chips are geared towards high-performance apps, but according to Brent Kerby, senior product manager for the Opteron, that’s not always the case. “We’re seeing more and more of these HPC type environments driving towards the lower power bands,” he told me. For example, the Chinese Dawning 5000A supercomputer is built using low-power Opteron parts. Overall, the SE product band represents a pretty small portion of AMD’s total Opteron unit sales, says Kerby: only about five percent.
In any case, AMD’s main push in the server processor space these days is focused on power efficiency. The five new HE Opterons offer a range of power-performance tradeoffs at 55 watts ACP (Average CPU Power, AMD’s own power metric). For extra power savings, AMD has extended its advanced clock gating technology, aka CoolCore, to the L3 cache. All the new models, plus all the standard chips from the initial Shanghai launch, will now have this enabled via a new BIOS feature. The hardware logic automatically shuts down megabyte-sized chunks of L3 cache that aren’t being used. Given that Shanghai devotes a good chunk of the die to its 6MB L3 cache, the power savings could be significant.
On that same green theme, AMD has developed something it calls PowerCap. It’s a BIOS-selectable option that lets the IT manager or system admin set a ceiling on the speed and voltage of the processor. The theory is that a lot of workloads run fine at moderate speeds and only occasionally need the top frequency of the processor. But putting the pedal to the metal for even short periods draws a disproportionate amount of power, so enforcing a speed limit saves energy with little loss of performance. A single step down from the default clock speed can yield a 30 percent power savings, while the most conservative setting can provide a 65 percent savings.
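To see why a speed limit can pay off disproportionately, here is a rough back-of-the-envelope sketch in Python. It assumes the textbook approximation that dynamic power scales with voltage squared times frequency, and it ignores leakage and everything else on the die. The frequency and voltage pairs are hypothetical values chosen for illustration, not AMD's actual P-state tables, so the output will not reproduce the 30 and 65 percent figures quoted above. The names `PState`, `relative_dynamic_power` and `cap_savings` are likewise made up for this example.

```python
# Illustrative only: a textbook dynamic-power model (P ~ C * V^2 * f).
# The P-state table below is hypothetical, not AMD data.
from dataclasses import dataclass


@dataclass(frozen=True)
class PState:
    freq_ghz: float  # core clock at this performance state
    volts: float     # core voltage at this performance state


def relative_dynamic_power(state: PState, top: PState) -> float:
    """Dynamic power of `state` as a fraction of the power at `top`."""
    return (state.volts / top.volts) ** 2 * (state.freq_ghz / top.freq_ghz)


def cap_savings(states: list[PState]) -> list[tuple[float, float, float]]:
    """For each possible cap, return (freq_ghz, speed_loss, power_saved).

    The first entry in `states` is treated as the uncapped top state.
    """
    top = states[0]
    rows = []
    for s in states:
        speed_loss = 1.0 - s.freq_ghz / top.freq_ghz
        power_saved = 1.0 - relative_dynamic_power(s, top)
        rows.append((s.freq_ghz, speed_loss, power_saved))
    return rows


# Hypothetical frequency/voltage pairs, fastest first.
HYPOTHETICAL_PSTATES = [
    PState(2.3, 1.20),
    PState(2.0, 1.10),
    PState(1.5, 1.00),
    PState(1.0, 0.90),
]

if __name__ == "__main__":
    for freq, speed_loss, power_saved in cap_savings(HYPOTHETICAL_PSTATES):
        print(f"cap at {freq:.1f} GHz: {speed_loss:5.1%} slower, "
              f"{power_saved:5.1%} less dynamic power")
```

Because voltage drops along with frequency and enters the formula squared, each step down in this model saves a larger share of power than it costs in clock speed. That is the general shape of the tradeoff PowerCap is built to exploit, whatever the real numbers are for a given chip.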
Maybe all this power pinching is not for the traditional high performance computing crowd. The big targets for low-power CPUs are cloud computing, Web hosting and other mainstream datacenter applications. Even so, HPC workloads that are highly scalable, core-wise, can often take advantage of setups with larger numbers of slower processors to get the needed performance. Of course, as systems grow to tens of thousands of cores and beyond, energy-efficient computing is no longer optional.