As expected, Intel dominated the IT news cycle this week with its semi-annual developer forum (IDF). The company’s upcoming Nehalem processor family was the star of the show, but Intel talked about everything from parallel programming to the next-generation Internet. Here, I’m going to confine my remarks mostly to HPC-related topics and leave the descriptions of desktop processors and mobile computing to the PC rags.
When Intel senior VP Pat Gelsinger took the stage at IDF on Tuesday, he started with the company’s grand vision of ubiquitous computing. Apparently, Intel’s version of utopia involves billions of embedded computing devices, which are part of a future super-network. In many cases, these devices will be invisible, built into our clothes, houses, cars, and so on. “This, we believe, is a powerful aspect of the Internet of the future,” said Gelsinger, “the embedded Internet where every human on the planet is connected to the Internet 7-by-24 in every modality of life: how they work, how they play, how they learn and even when they rest.”
Well, I’m already experiencing the 7-by-24 connection part, and I’ve got to tell you, I’m not getting the utopia vibe yet. But I think Intel has more advanced applications in mind. All of these devices will supposedly help run basic infrastructure: highways, power grids, health networks, etc. With the billions of chips required to run this uber-Internet, Gelsinger notes that “Intel believes this is a great market opportunity for us.” No kidding.
Now to more immediate concerns. Before delving into the Nehalem story, Gelsinger briefly mentioned that Tukwila, the quad-core Itanium, is on track to ship by the end of the year to OEMs, with systems showing up in 2009. He also talked a little bit about the six-core Dunnington, the last chip of the Penryn processor family. This one is aimed mostly at ERP and transaction processing in enterprise datacenters, and will be shipped next month.
But Nehalem was the big news at IDF. Well, not exactly big news; Intel’s been singing the praises of Nehalem for about a year now. But with the chips set to start shipping in Q4, this is the company’s last chance to increase the buzz.
The Nehalem processor family is being characterized as the biggest change in x86 architecture in a decade. Gone are the front-side bus and off-chip memory controller. Nehalem processors will include the new high-performance QuickPath interconnect (also in Tukwila) and integrated memory controllers, in a design very reminiscent of AMD’s HyperTransport setup.
With the first Nehalem chips so close to delivery, the company revealed a few more specifics on the processor architecture. Besides the new QuickPath technology, Nehalem will use DDR3 memory, with support for three channels and three DIMMs per channel. A Level 3 cache has also been added to deal with the extra data bandwidth and latency requirements of more cores (up to eight per CPU). To realize some power savings, Intel ditched the traditional six-transistor design on the cache SRAM and went with eight transistors.
One of the biggest changes that Nehalem brings is a new power management system, along with a new feature called turbo mode. According to Intel Fellow Rajesh Kumar, the power management is accomplished by a million-transistor power controller module, which optimizes energy efficiency across the cores.
“Well, the key idea in power management is actually quite simple, [it] is to shut things off when they are not in use,” said Kumar. “And we’ve been doing that for quite some time with something we call ‘power gate.’ The issue is that power gate only removes switching power. [It doesn’t] take care of leakage power, which has become a dominant source of power in modern process technologies. So, that’s what we have done. We have now invented something which can take care of all power. When things are not being used, power goes to zero.”
The other cool part of the technology is that when the system detects inactive cores, it shuts them off and cranks up the voltage and frequency on the active cores to allow them to run faster. The new scheme basically eliminates the power penalty for unused cores when the application doesn’t have enough threads to go around. This level of power management was bound to arrive eventually, since unused cores would become too much of a power burden as core counts continue to grow. It’s not clear how much of this is under OS control, but the idea is that at the application level, it’s all transparent.
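To make the turbo idea concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not a description of Intel's actual controller: the function name `turbo_frequency`, the cubic power rule (per-core power roughly proportional to frequency cubed, since voltage tends to rise with frequency), and the 25 percent turbo cap are all hypothetical choices made for illustration. The one element taken from the description above is that gated-off cores draw no power, so their share of the budget can go to the active cores.

```python
def turbo_frequency(total_cores: int, active_cores: int,
                    base: float = 1.0, cap: float = 1.25) -> float:
    """Return the frequency (in units of the base clock) that the active
    cores could run at without exceeding the package power budget.

    Assumptions (hypothetical, for illustration only):
      * per-core power scales with frequency cubed;
      * idle cores are fully gated and draw zero power;
      * the budget equals all cores running at the base frequency;
      * turbo is capped at `cap` times the base frequency.
    """
    if not 0 < active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")
    budget = total_cores * base ** 3             # power with every core at base
    uncapped = (budget / active_cores) ** (1 / 3)
    return min(uncapped, cap * base)


# With all four cores busy there is no headroom; with fewer active cores,
# the same budget supports a higher clock on the ones still running.
for active in (4, 3, 2, 1):
    print(active, round(turbo_frequency(4, active), 3))
```

In this model the speedup comes entirely from power reclaimed from idle cores, which is why an application with too few threads pays no penalty for the silicon it is not using.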
Since the first Nehalems are getting released into the wild in Q4 2008, Intel let loose with some of its impending delivery schedules. In addition to releasing the Nehalem desktop chips (Core i7) in the next month or so, it looks like Intel plans to ship quad-core processors for two-socket servers in Q4. The latter processors are code-named Nehalem-EP and are geared for workstations and HPC servers.
Another server chip, named Nehalem-EX, is scheduled for the second half of 2009. The EX can sport up to eight cores and is meant for four-socket systems. Given that Nehalem supports two threads per core, a four-socket Nehalem-EX box could run 64 threads (4 sockets × 8 cores × 2 threads) in SMP fashion. Intel is targeting the EX for what it calls “expandable servers,” which might just be Intel’s euphemism for “fat nodes.”
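The thread count is simple multiplication, and the same arithmetic applies to the two-socket Nehalem-EP parts. A small Python sketch, using only the socket, core, and thread figures quoted in this article (the helper name `hardware_threads` is made up for the example):

```python
def hardware_threads(sockets: int, cores_per_socket: int,
                     threads_per_core: int) -> int:
    """Total hardware threads visible to the OS on one SMP node."""
    return sockets * cores_per_socket * threads_per_core


# Four-socket Nehalem-EX box: up to eight cores per chip, two threads per core.
nehalem_ex = hardware_threads(sockets=4, cores_per_socket=8, threads_per_core=2)

# Two-socket Nehalem-EP server: quad-core chips, two threads per core.
nehalem_ep = hardware_threads(sockets=2, cores_per_socket=4, threads_per_core=2)

print(nehalem_ex)  # 64
print(nehalem_ep)  # 16
```

By this count, an EX node would expose four times as many hardware threads as an EP node, which is where the “fat node” label comes from.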
The competition between AMD and Intel for quad-core server mindshare could heat up this fall. Last week, AMD announced it’s aiming for a fourth quarter release of its “Shanghai” processors, the 45nm follow-on to the ill-fated Barcelona. Since Nehalem mimics AMD’s HyperTransport and integrated memory controller design, the two x86 architectures have never been so closely aligned. So if everyone’s schedule holds, in late 2008 or early 2009 we could see a match-up between Shanghai and Nehalem-EP on dual-socket servers running HPC benchmarks.
Not everything at IDF was about hardware, though. Intel also announced its Parallel Studio, a plug-in to Microsoft’s Visual Studio for multicore/manycore programming. Our intrepid reporter, John West, gives us the scoop about the new offering in this week’s feature article.