Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

October 18, 2012

Penguin Joins Microserver ARMs Race

Michael Feldman

Penguin Computing has launched its first ARM-based server platform. Known as the UDX1, the Penguin box is based on Calxeda’s latest ARM server chip, and is aimed at cloud computing, Web hosting, and, especially, data analytics – UD stands for Ultimate Data. The move puts Penguin into the front ranks of computer makers who are testing the waters for the burgeoning microserver market.

Although Penguin is best known for its HPC cluster offerings, it also sells into the enterprise space, from which it currently collects half its revenue. With established customers like Digg and Yelp, the company is looking to expand its footprint even further in the commercial arena. One of the ways it intends to do that is via the “big data” market, an application domain that spans genomic sequencing, risk analysis for stock portfolios, retail analytics and everything in between. Conveniently, that domain covers both the company’s HPC and enterprise customer bases.

The idea behind the UDX1 is to offer a less costly and more energy-efficient platform for these data-intensive applications. In general, x86 Xeon and Opteron servers offer more computational power than needed for applications that tend to be I/O bound. Therefore, rejiggering the compute-I/O balance by cutting back on thread/core performance can, at least in theory, offer a much more efficient solution.

That’s the premise of the microserver architecture, which uses less performant, but much lower power processors, such as ARM SoCs and low-power Intel Xeons and Atoms, to drive these throughput applications. In Penguin’s case, the UDX1 uses Calxeda’s latest EnergyCore ECX-1000 ARM server SoC, a quad-core chip that tops out at 5 watts. Each 4U enclosure houses up to 12 Calxeda modules, each holding four of those SoCs.

Note that the current crop of Calxeda server chips is based on 32-bit ARM, so each node is stuck with the annoying limitation of a 4 GB memory reach. But for Hadoop-type workloads that can slice up datasets into bite-sized chunks, and scale out appropriately, this is a manageable problem.

Since each ARM chip comprises a complete server node, the UDX1 chassis offers 48 servers in aggregate, for a total of 192 cores. Each node can hook into 4 GB of DRAM and 36 1GB storage drives. Network switching is provided in the form of an on-chip network fabric supporting 10GbE connectivity between nodes, obviating the need for an external switch. In addition to on-chip Ethernet, the SoC includes integrated controllers for memory, PCIe, and SATA drives, as well as system management logic.

Since each of the servers draws 5 watts at full load, the whole chassis draws only 240 watts. Not bad for 192 cores. Obviously these are not Xeon cores; the ECX-1000 chip tops out at 1.4 GHz, which is less than half the speed of a top-end x86 server CPU. But in its intended space of divide-and-conquer computing, far fewer cycles are wasted waiting for I/O to catch up. At just a little over a watt per thread, energy efficiency is an order of magnitude better than conventional server platforms.
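For readers who want to check the math, the chassis figures quoted above reduce to a few multiplications. The following is a minimal sketch in Python, not anything supplied by Penguin or Calxeda; the class and field names are made up for illustration, and it assumes the 240-watt figure counts only the SoCs at their 5-watt ceiling, since the article does not say whether drives, fans, or power supply losses are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChassisConfig:
    """Hypothetical description of a fully populated UDX1 chassis.

    Defaults are the figures quoted in the article; the names are
    illustrative only and do not come from any vendor tool.
    """
    modules: int = 12            # Calxeda modules per 4U enclosure
    socs_per_module: int = 4     # ECX-1000 SoCs per module
    cores_per_soc: int = 4       # the ECX-1000 is a quad-core chip
    watts_per_soc: float = 5.0   # quoted full-load ceiling per SoC

    @property
    def nodes(self) -> int:
        # Each SoC is a complete server node.
        return self.modules * self.socs_per_module

    @property
    def cores(self) -> int:
        return self.nodes * self.cores_per_soc

    @property
    def soc_watts(self) -> float:
        # Assumption: SoC power only, nothing else in the chassis.
        return self.nodes * self.watts_per_soc

    @property
    def watts_per_core(self) -> float:
        return self.soc_watts / self.cores


udx1 = ChassisConfig()
print(f"nodes: {udx1.nodes}")
print(f"cores: {udx1.cores}")
print(f"SoC power: {udx1.soc_watts:.0f} W")
print(f"power per core: {udx1.watts_per_core:.2f} W")
```

With the quoted figures, this works out to 48 nodes, 192 cores, 240 watts, and 1.25 watts per core, which is the "little over a watt per thread" cited above.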

According to Arend Dittmer, Penguin’s director of product marketing, a fully populated UDX1 chassis will run about $30K to $35K. He says they already have a trio of orders for the new platform: one from a financial services firm, and the other two from national labs – all for data analytics work. At this point, the systems are being targeted for experimentation, rather than production, as customers kick the tires to see how well the Penguin box works under their analytics loads.

While the volume market for such microservers is going to be in the commercial space, Dittmer sees such systems filling a comfortable niche in HPC shops. He says that for mainstream science computation, where FLOPS are king, this is not the right platform (and it doesn’t try to be). But since there is a finite amount of power and real estate in a datacenter, it makes sense to offload the data analytics work of science to more efficient hardware like the UDX1.

Penguin is not the only server maker utilizing Calxeda silicon. UK-based Boston Limited offers a system very similar to the UDX1, which it calls Viridis. The Boston box is a 2U chassis that houses up to 48 Calxeda nodes and is aimed at essentially the same application space that Penguin is targeting. According to David Power, Boston’s Head of HPC, the company has a 36-bay, 4U platform in the works, based on the same Calxeda SoCs.

Both vendors are already looking ahead to Calxeda’s plans for its 64-bit ARM SoC, which the company has code-named “Lago.” No one has committed to a date, but it’s reasonable to think that these chips should start to appear in the 2014 timeframe, with server implementations to follow shortly thereafter.

By that time, Penguin and Boston should have plenty of company. HP has been flirting with Calxeda for some time with its Project Moonshot development platform, but opted to go with Intel Atom CPUs for its initial microserver line. Dell has been dipping its toes into the microserver space as well, but gave the nod to Marvell’s quad-core Armada XP 78460 chip. IBM has yet to choose sides, but if these initial microserver platforms start to gain traction, you can bet Big Blue will figure out a way to get into the game.