Server maker SeaMicro has unveiled the SM10000-XE, a new microserver aimed squarely at the burgeoning ultra-scale datacenter market. The company is best known for pioneering the microserver space using Intel’s power-sipping Atom CPUs, but in this latest offering, SeaMicro has opted for high-performance, low-wattage Sandy Bridge Xeons, a move that expands the application horizons of microservers considerably.
Microservers were originally invented to drastically shrink the power and space associated with large-scale computing. Until now, though, microservers have been powered by relatively low-performance processors, such as the Atom CPU and 32-bit ARM chips. By necessity, that meant the application set was limited to lightweight workloads that could be highly parallelized, such as web serving and batch analytics.
With the addition of low-wattage but more performant Xeons into the mix, SeaMicro is looking to expand the microserver business into what it calls “brawny applications.” That includes more traditional enterprise workloads like Java, PHP, MemCacheD, and NoSQL, as well as web-based database processing. SeaMicro CEO Andrew Feldman characterized the new Xeon-powered SM10000-XE as “the mainstreaming of the microserver.”
The SM10000-XE, which lists for $138,000, is a chassis that houses 64 single-socket compute nodes. Each node consists of a 45W quad-core Xeon (E3-1260L) and up to 32 GB of Samsung’s power-efficient DRAM (1.35V, 30nm process technology). A SATA slot is available for an optional hard disk or SSD, and Ethernet uplinks of either the GigE or 10GigE variety connect the box to the outside world.
The microserver nodes are strung together in a 3D torus with SeaMicro’s high-bandwidth, low-latency “Freedom Supercompute Fabric.” It provides a whopping 10 GigE of bandwidth to each socket, for 1.28 terabits per second across the whole chassis. As such, it replaces around 1,000 GigE switches, which saves hundreds of thousands of dollars in up-front cost, as well as substantial energy costs over the system’s lifetime.
The 64-node chassis fits in a 10U form factor and draws a modest 3.5 kW. According to Feldman, that’s about three times the density and half the power of competing x86 solutions. And thanks to the interprocessor fabric, the CPUs have access to 12 times the external bandwidth of a conventional server. Feldman says 20 of these SM10000-XE chassis have enough computational muscle to run Amazon’s entire web e-retail business. “This is quite simply the most efficient Xeon server ever built,” he claims.
Such density is achieved with the help of SeaMicro’s own Freedom ASIC, the technology that distinguishes the company’s microserver from its competitors. The ASIC encapsulates not only the Freedom fabric interconnect, but also I/O virtualization logic which SeaMicro says replaces 90 percent of the motherboard components, including external I/O and network interface chips. Also included on the ASIC is something called TIO (Turn It Off), which can shut down unused logic blocks on the CPU, further reducing the power draw.
Because of all this consolidation, only three components remain on the motherboard: the CPU (plus CPU chipset), the DRAM chips, and SeaMicro’s ASIC. The entire chipset fits onto an 11-by-5.5-inch card, but that doesn’t include a SATA drive or SSD if the customer opts for such storage.
Although Feldman claims that the SM10000-XE will propel the microserver into “every nook and cranny of the scale-out datacenter,” at no time did he mention high performance computing, an application area that is also becoming space and power limited. But many embarrassingly parallel applications, scientific or otherwise, are actually well suited to this architecture. That’s assuming the code can be sliced up in such a way that its memory requirements per node don’t exceed the relatively modest 32GB limit. Unfortunately, there is no cache coherency across nodes, so the nodes cannot be treated as one shared-memory system.
Applications fitting this profile would be things like genomic analysis, certain types of seismic analysis, large-scale image rendering, and all sorts of scientific data mining. The fact that the low latency Freedom fabric can feed each CPU with 10GigE (2.5 gigabits/second per core) suggests MPI-based applications should fare rather well on this architecture.
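The bandwidth figures quoted for the fabric can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the article, plus one labeled assumption: that the chassis-wide 1.28 terabit figure counts traffic in both directions on each 10GigE link. The variable names are illustrative, not SeaMicro terminology.

```python
# Figures quoted in the article.
NODES_PER_CHASSIS = 64
FABRIC_GBPS_PER_SOCKET = 10   # 10GigE to each socket
CORES_PER_SOCKET = 4          # quad-core Xeon E3-1260L

# Assumption (not stated in the article): the chassis-wide figure
# counts both directions of every link.
DIRECTIONS_COUNTED = 2

# Bandwidth available to each core if the socket's link is shared evenly.
per_core_gbps = FABRIC_GBPS_PER_SOCKET / CORES_PER_SOCKET

# Aggregate bandwidth, first one-way, then under the bidirectional assumption.
one_way_chassis_gbps = NODES_PER_CHASSIS * FABRIC_GBPS_PER_SOCKET
chassis_tbps = one_way_chassis_gbps * DIRECTIONS_COUNTED / 1000

print(f"per core:         {per_core_gbps:.1f} Gb/s")
print(f"chassis, one-way: {one_way_chassis_gbps} Gb/s")
print(f"chassis, two-way: {chassis_tbps:.2f} Tb/s")
```

Under that assumption the per-core and chassis-wide numbers both line up with the quoted 2.5 gigabits/second and 1.28 terabits; counted one-way, the aggregate would be half the chassis figure.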
Keep in mind that the low-wattage Xeon E3-1260L used in the SM10000-XE provides quite respectable performance. Since the chip is part of the Sandy Bridge family, it supports the new AVX floating point instructions, which means each core can execute 8 double precision FP operations per clock cycle. So at 2.4GHz, the quad-core E3-1260L delivers a peak performance of 76.8 gigaflops (4.9 teraflops for the entire chassis). At the chassis’s 3.5 kW draw, that works out to about 1,400 megaflops/watt of peak performance, which could place an SM10000-XE system in the top ten of the latest Green500 list.
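The peak-performance arithmetic is easy to reproduce. The following is a minimal sketch using the figures given in the article; the only assumption is how the 8 operations per cycle break down (one 4-wide AVX add plus one 4-wide AVX multiply per core per clock). These are theoretical peak numbers, whereas the Green500 ranks systems on measured Linpack performance, so a real result would come in lower.

```python
# Figures quoted in the article.
CORES_PER_CPU = 4
CLOCK_GHZ = 2.4
NODES_PER_CHASSIS = 64
CHASSIS_POWER_WATTS = 3500

# Assumption: 8 double-precision operations per cycle per core,
# i.e. one 4-wide AVX add and one 4-wide AVX multiply issued together.
DP_FLOPS_PER_CYCLE = 8

# Peak = cores x clock x operations per cycle.
node_gflops = CORES_PER_CPU * CLOCK_GHZ * DP_FLOPS_PER_CYCLE
chassis_gflops = node_gflops * NODES_PER_CHASSIS
chassis_tflops = chassis_gflops / 1000

# Efficiency in megaflops per watt (1 gigaflop = 1000 megaflops).
mflops_per_watt = chassis_gflops * 1000 / CHASSIS_POWER_WATTS

print(f"per node:    {node_gflops:.1f} gigaflops")
print(f"per chassis: {chassis_tflops:.2f} teraflops")
print(f"efficiency:  {mflops_per_watt:.0f} megaflops/watt")
```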
The nice thing about the SeaMicro fabric and I/O virtualization technology is that it is designed to be chip agnostic. The company achieves that by hooking the fabric into the standard PCIe interface on the host processor. If other low-power processors come along (think 64-bit ARM), SeaMicro should be able to build microservers around those chips in fairly short order.
Because the Xeon is the mainstream chip in commercial clusters today, SeaMicro expects to sell more of these boxes than it did with its Atom-based offerings. Even without the SM10000-XE though, the company has been doing “phenomenal,” according to Feldman. Although he didn’t disclose how much revenue his company collected during its first year of business (2011), Feldman says it was more than the combined first-year sales of Riverbed, 3PAR, Aruba Networks and Data Domain. “It looks pretty bright out there right now,” he says.