The x86 CPU festivities are over for now, but the party’s just getting started. The debut of the latest Intel Xeon and AMD Opteron processors over the last few weeks marks something of a turning point for server makers. For one thing, the introduction of Intel’s 6-core Westmere EP and 8-core Nehalem EX CPUs, and AMD’s 12-core Magny-Cours processor marks the beginning of the end of the quad-core era. Given that, HPC servers with fewer than double-digit core counts will soon be the exception rather than the rule.
AMD and Intel are attacking the high-end server space somewhat differently, though. With Westmere EP (now the Xeon 5600), Intel is continuing its traditional 2P server business. But with Nehalem EX (now the Xeon 7500), Intel is charting new territory: big shared-memory SMP machines. Intel also introduced the Xeon 6500, a 2P-only variant of Nehalem EX, ostensibly aimed at the HPC market. Meanwhile AMD has consolidated its performance-oriented 2P and 4P products into a single Opteron 6000 product line, starting with Magny-Cours (now the Opteron 6100).
From a price-performance perspective AMD has a good story. With the 6100 Opterons, AMD will go head-to-head with the 2P 5600 Xeons, which have faster cores, but fewer of them. The mid-range Opteron 6174, which sports 12 cores and runs at 2.2 GHz, costs $1,165 in quantity. A Xeon with comparable performance is the 6-core X5680, which is clocked at 3.33 GHz and costs $1,664. Although the individual Xeon cores run faster, for many types of parallel workloads, the additional six cores on the Opteron will make up the difference, and then some. Intel’s HyperThreading, which handles two threads per core, narrows the gap only slightly: it boosts performance by just 10 to 20 percent, and in some cases, such as Linpack, it doesn’t help at all. Since the 6100 Opterons have four channels of memory and support up to 12 DIMM slots per socket, compared to three channels and nine DIMMs for the Xeon 5600s, the AMD CPUs have an additional advantage on memory-loving apps.
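To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python using only the core counts, clock speeds and prices quoted above. The "core-GHz" figure (cores multiplied by clock speed) is an assumption made for illustration: it is a crude proxy for aggregate throughput, not a benchmark, and it ignores HyperThreading, memory bandwidth and per-clock differences between the two architectures. The `Cpu` class and its property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """Hypothetical container for the figures quoted in the article."""
    name: str
    cores: int
    clock_ghz: float
    price_usd: float

    @property
    def price_per_core(self) -> float:
        # Dollars paid per physical core.
        return self.price_usd / self.cores

    @property
    def core_ghz(self) -> float:
        # Crude aggregate-throughput proxy: cores x clock (an assumption).
        return self.cores * self.clock_ghz

    @property
    def core_ghz_per_dollar(self) -> float:
        # Proxy throughput bought per dollar.
        return self.core_ghz / self.price_usd


opteron_6174 = Cpu("Opteron 6174", cores=12, clock_ghz=2.2, price_usd=1165)
xeon_x5680 = Cpu("Xeon X5680", cores=6, clock_ghz=3.33, price_usd=1664)

for cpu in (opteron_6174, xeon_x5680):
    print(
        f"{cpu.name}: ${cpu.price_per_core:.2f} per core, "
        f"{cpu.core_ghz:.2f} core-GHz, "
        f"{cpu.core_ghz_per_dollar * 1000:.1f} core-GHz per $1,000"
    )
```

By this measure the Opteron works out to roughly $97 per core against roughly $277 for the Xeon. How much of that advantage survives on a real workload depends on how well the application scales across cores.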
The 6100 Opterons will also go up against the 6500 Xeons in the 2P arena, as well as the 7500 Xeons in the 4P space. Here the Xeons go up to eight cores, the memory channel differential has been equalized at four apiece, and the memory capacity advantage is now with Intel at 16 DIMMs per socket. But the EX-class parts are even more expensive than Xeon 5600 chips. For example, the 8-core 6500 and 7500 products cost between $2,461 and $3,692, which is more than two and three times the price, respectively, of the Opteron 6174 mentioned above. Even the least expensive 6-core EX, which is the 1.86 GHz Xeon x7530, costs $200 more than the 6174.
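The price multiples and memory-slot counts in this comparison can be checked with a few lines of Python. This is only a sketch built on the list prices and per-socket DIMM counts quoted above; the variable and function names are hypothetical, and actual system pricing will vary by vendor and configuration.

```python
# List prices quoted in the article, in US dollars.
OPTERON_6174_USD = 1165
EX_8CORE_PRICES_USD = {"cheapest 8-core EX": 2461, "priciest 8-core EX": 3692}

# DIMM slots per socket quoted in the article.
DIMMS_PER_SOCKET = {"Opteron 6100": 12, "Xeon 6500/7500": 16}


def price_ratio(price_usd: float, baseline_usd: float = OPTERON_6174_USD) -> float:
    """Return how many times more a part costs than the baseline CPU."""
    return price_usd / baseline_usd


def dimm_slots(per_socket: int, sockets: int) -> int:
    """Total DIMM slots in a server with the given socket count."""
    return per_socket * sockets


ratios = {name: price_ratio(price) for name, price in EX_8CORE_PRICES_USD.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the price of an Opteron 6174")

for family, per_socket in DIMMS_PER_SOCKET.items():
    print(f"{family}: {dimm_slots(per_socket, sockets=4)} DIMM slots in a 4P server")
```

The ratios come out slightly above two and slightly above three, and a 4P box ends up with 48 DIMM slots on the Opteron side against 64 on the Xeon side.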
The bottom line is that the new Magny-Cours processors look like a very competitive solution for 2P and 4P servers. But the 4P story is particularly interesting. AMD is pushing this 6000 series as a platform that does away with the “4P tax.” The tax refers to the traditional premium vendors charged for CPUs and chipsets that support 4-socket servers. Since the 6000 hardware can be used in both 2P and 4P boxes, you can actually save money by consolidating dual-socket servers (as long as you don’t need to spread out the processors over more boxes to get at more I/O). “The only reason 4P processors have been priced like they have is because there’s a guy in the business who owns a large chunk of the market and has been pricing that way for 20 years,” says John Fruehe, who heads AMD’s Product Marketing of the Server and Workstation Division. “It’s more tradition than technology that has forced that price.”
The “guy” Fruehe is referring to is, of course, Intel. Prior to Magny-Cours, though, AMD also priced its 4P/8P Opteron 8000 CPUs at a premium in relation to its 2P Opteron 2000 parts. According to Fruehe, the company eventually came to the conclusion that the demand for 4P servers was being inhibited by this pricing model. In fact, he says, the quad-socket Opteron-based supercomputers on the TOP500 list came about because AMD gave the system vendors a nice volume discount on Opteron 8000 CPUs. “Generally speaking those were deals where an 8000 processor was priced like a 2000,” he told me. “Suddenly the economics made sense.”
Although he wouldn’t point to any specific systems, the half-petaflop “Ranger” Sun Constellation cluster at TACC, which uses quad-socket Opteron-based blades, almost certainly fits in this category. Fruehe maintains AMD still turned a profit on these supercomputer deals, and they gave the company the idea that it could move a lot more product by pricing 4P parts like 2P parts. AMD believes that this strategy will unleash the 4P market in HPC and across enterprise computing.
On the other hand, AMD has decided to leave the 8P (and above) market alone, at least for the time being. At 60K or so processors per year, the company has calculated this is too small a market to give special consideration to. One might ask, though: If the 4P servers are such a good idea, why not 8P, 16P and so on? As you keep adding processors, or cores for that matter, memory bandwidth and capacity become the limiting factor. As AMD and Intel keep pouring on the cores, they’re forced to rebalance the memory subsystem.
The idea behind the new Xeon 7500 line is to max out both compute and memory in a familiar x86 package. As of this week, OEMs can build 8-socket commodity boxes with 1 TB of memory. With this approach, not only does Intel think it can edge out proprietary RISC CPUs in SMP servers used for mission-critical computing, it also believes it can grow the SMP market overall.
According to David Kanter at Real World Technologies, that might indeed come to pass. Although in the past there were multiple reasons that 8P servers represented a specialty market, a confluence of commodity technologies, including the new Xeons themselves, is changing the economics. In a recent article, Kanter writes:
The primary barriers to adoption for large x86 servers are software, maturity and cost/benefit. Scalable applications that would benefit from 8S servers are not common. Some classic examples include I/O heavy workloads like ERP, transactional or analytic databases and also select HPC workloads that favor shared memory rather than message passing. More recently, server consolidation using virtualization has emerged as an important workload. In 2010, there are simply more scalable workloads than were previously available.
Kanter goes on to analyze how the different pieces of the enterprise ecosystem are evolving, and how they could favor a shift to commodity 8P servers. For now, AMD seems content to play it conservative and let Intel test the SMP waters. If successful, perhaps the junior member of the x86 franchise will jump in after Intel has built the market. In the meantime, AMD is focused on rebuilding its server mojo in the 2P and 4P sweet spots. Magny-Cours looks like a fine start.