The Shape of Chips to Come
In information technology, there’s really no such thing as a product unveiling anymore. Preparing the market for new hardware or software starts way before the products are rolled out. It’s a drawn-out process that begins with PowerPoint presentations and continues up to the point of commercial release. I call this process “unveilation” (literally, the process of unveiling). It can take years, and often does. I’m not saying this is a bad thing. Given the complexity of technology, it’s almost a necessity.
Nowhere was the process more evident than at this week’s Intel Developer Forum (IDF) in San Francisco, where company execs extolled the virtues of chips not yet born. There they talked up a number of Intel’s upcoming microprocessors, all in various stages of unveilation. Of particular interest to the HPC crowd were the first demo of the GPU-ish Larrabee chip, an update on Nehalem-EX, a refinement of the Westmere roadmap, and the invention of ultra-low-power Xeons for a new “microserver” category.
First up is Larrabee, a chip that is in the midst of an extended unveilation. After first floating the idea of a high-performance CPU-GPU hybrid chip back in 2007, Intel finally gave Larrabee followers their first demo of the silicon in action. Although Intel insists that the initial product line is strictly geared for traditional graphics and visualization apps, I’m convinced that later versions, or derivatives thereof, are being groomed for general-purpose HPC. The first products are expected to hit the streets sometime next year.
The Larrabee demo at IDF showed the chip running a real-time ray-tracing application.
Almost at the end of its unveilation is Nehalem-EX, the Xeon that will go into servers with four, eight, or more sockets. It’s really the first time Intel will have a competitive multi-socket (i.e., more than two sockets) offering for x86 servers. Nehalem-EX will support a number of RAS features, including Machine Check Architecture (MCA) recovery, which allows the CPU to right itself after encountering certain kinds of system errors. The chip is expected to go into production later this year.
Speaking about Nehalem-EX, Sean Maloney, executive VP and GM of the Intel Architecture Group, said they currently know of over 15 eight-socket-plus designs from eight different OEMs. Some of these are certainly destined for HPC duty. Even a relatively modest four-socket machine will support up to 64 threads and a terabyte of memory. A couple of these four-socket systems have already been announced: one, the IBM BladeCenter EX; the other, a Supermicro 1U box, specifically targeted at HPC. To hammer home the HPC theme, Maloney pointed to a quote from Mark Seager, who leads the advanced computing group at Lawrence Livermore National Lab: “Nehalem EX represents a new SMP on a chip super-node that can help us improve our predictive science and simulation capabilities without having to invest in a vast rewrite of our applications.”
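As a rough check on the four-socket figures, the arithmetic can be sketched as below. The core and thread counts per socket are assumptions that the text above does not state (eight cores per Nehalem-EX socket, two hardware threads per core), and the function names are purely illustrative.

```python
# Back-of-the-envelope check of the four-socket figures quoted above.
# ASSUMPTIONS (not stated in the article): 8 cores per Nehalem-EX socket
# and 2 hardware threads per core. The terabyte is taken as 1024 GB.
SOCKETS = 4
CORES_PER_SOCKET = 8   # assumed
THREADS_PER_CORE = 2   # assumed
TOTAL_MEMORY_GB = 1024


def total_threads(sockets, cores_per_socket, threads_per_core):
    """Hardware threads available across all sockets."""
    return sockets * cores_per_socket * threads_per_core


def memory_per_socket_gb(total_memory_gb, sockets):
    """Memory each socket must address if capacity is spread evenly."""
    return total_memory_gb / sockets


threads = total_threads(SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE)
per_socket = memory_per_socket_gb(TOTAL_MEMORY_GB, SOCKETS)
print(f"{threads} threads, {per_socket:.0f} GB per socket")
```

Under those assumptions the total comes to 64 threads, matching the figure quoted, with 256 GB of memory behind each socket.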
Meanwhile, Westmere, the 32 nm shrink of the Nehalem microarchitecture, is apparently running a little ahead of schedule. The first Xeon implementation (for dual-socket servers), Westmere-EP, is poised for release in the first half of 2010. And in late 2010, Westmere-EX will take the hand-off from Nehalem-EX for multi-socket platforms. Due to the process shrink from 45 to 32 nanometers, lower power consumption and/or faster clocks are in the offing, although no specific numbers were forthcoming at IDF. For the security-minded, Intel has added Advanced Encryption Standard (AES) instructions to enable faster encryption and decryption.
One of the most interesting announcements had to do with the entry-level Xeon 3400 processors. Low-power variants of these chips have been developed for what Intel is calling “microservers,” essentially mini-blades that take up much less space and use much less power than standard hardware. The chipmaker has come up with a reference design that packs 16 hot-swappable microserver modules into a 5U chassis. Intel is planning to release a 45-watt version of the 3400 later this year and a 30-watt model in early 2010.
The idea, of course, is to be able to build extremely dense machines that are inexpensive to both buy and run. The big target market is large-scale and “containerized” datacenters, where power consumption and floor space are enemies number one and two, respectively.
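To put the density claim in rough numbers, here is a minimal sketch based on the reference design figures (16 modules per 5U). The 42U rack height, the one-processor-per-module layout, and the function names are assumptions for illustration. The wattage counts only the processors' rated power, not memory, storage, networking, or cooling.

```python
# Rough density and CPU-power estimate for the microserver reference design.
# From the article: 16 modules per 5U, 45 W and 30 W processor variants.
# ASSUMPTIONS: a 42U rack, one processor per module, CPU power only.
MODULES_PER_CHASSIS = 16
CHASSIS_HEIGHT_U = 5
RACK_HEIGHT_U = 42  # assumed standard rack height


def modules_per_rack(rack_u, chassis_u, modules_per_chassis):
    """Modules that fit when only whole chassis are installed."""
    return (rack_u // chassis_u) * modules_per_chassis


def cpu_power_watts(modules, watts_per_cpu):
    """Aggregate rated processor power, one CPU per module."""
    return modules * watts_per_cpu


modules = modules_per_rack(RACK_HEIGHT_U, CHASSIS_HEIGHT_U, MODULES_PER_CHASSIS)
for watts in (45, 30):
    print(f"{modules} modules at {watts} W: {cpu_power_watts(modules, watts)} W")
```

With those assumptions a rack holds eight chassis, or 128 single-socket servers. Moving from the 45-watt part to the 30-watt part cuts rated processor power from 5,760 W to 3,840 W per rack.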
If the performance-per-watt numbers prove out for technical computing apps, these low-power 3400s could make their way into HPC. SGI will almost certainly make these Intel parts available in its newly announced Octane III personal super and in its CloudRack product line. Other HPC OEMs may follow suit.
Keep in mind that these single-socket 3400 chips are the antithesis of the multi-socket EX Xeon processors. But sometimes scaling out is preferable to scaling up. (A number of HPC system architects think hyperscale designs using extremely low-power CPUs are the way to go if exascale computing is to be made practical.) In any case, Intel is making sure it is covering all its bases, and is willing to let the applications decide which computing model fits best.