When designing a high-performance computing installation, two questions come to mind: how much processing power will I get, and how big a footprint will it take to achieve that power? One component that can vastly change the answers to those questions is the type of processor used in the installation and how many cores can be placed on that processor.
Earlier this month, Virginia Tech unveiled its newest supercomputer, dubbed “HokieSpeed,” a 209-node installation that ranked number 96 on the latest TOP500 list. The most interesting detail about HokieSpeed is that it is 22 times faster and 75 percent smaller than VT’s previous low-power supercomputer, known as “System X.”
VT was able to achieve these advances in power and footprint through sheer processing density, using two six-core Xeon processors along with two 448-core Tesla GPUs in each node. That comes out to a total of 2,508 CPU cores and 187,264 GPU cores, compared to System X’s 2,200 PowerPC cores. Something else HokieSpeed can be proud of: it landed at number 11 on the most recent Green500 list.
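Those totals follow directly from the per-node configuration; a quick sanity check on the arithmetic (node and core counts are taken from the figures quoted above):

```python
nodes = 209
cpu_cores_per_node = 2 * 6   # two six-core Xeon processors per node
gpu_cores_per_node = 896     # GPU cores per node implied by the 187,264-core total

total_cpu_cores = nodes * cpu_cores_per_node
total_gpu_cores = nodes * gpu_cores_per_node
print(total_cpu_cores, total_gpu_cores)  # 2508 187264
```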
It seems that the case for future evolution in processing power must incorporate core density. Right now, companies like Tilera and Kalray are working on vastly improving current processor core counts. Tilera calls its new processor family the “TILE-Gx,” with processors ranging anywhere from 16 to 100 cores, which interconnect using a high-performance on-chip network.
According to Tilera, each core (tile) is a fully-featured, cache-coherent processor that can run an OS. Still keeping the focus on reduced footprint, the TILE-Gx integrates a set of memory and I/O controllers, reducing the need for North and South bridges, which in turn reduces system board size requirements.
Kalray’s new processor, the Multi-Purpose Processor Array (MPPA), is initially aimed at the embedded computing space. It uses 16-core clusters that are also interconnected through an on-chip network. This family of processors has as few as 16, and as many as 64, of these built-in 16-core clusters. According to Kalray, the top-of-the-line 1,024-core processor can deliver more than two teraops of performance.
On-chip memory amounts to 16 MB in the 256-core offering. Finally, these processors also integrate two DDR3 memory controllers, two 40 Gb/s Ethernet controllers, two 8x PCIe Gen 3 interfaces, GPIO, and four 4-to-8-lane Interlaken interfaces used for multi-MPPA chip integration.
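Taken together, the cluster and memory figures above imply a fairly modest per-core memory budget; a quick back-of-the-envelope sketch:

```python
cores_per_cluster = 16

small_cores = 16 * cores_per_cluster   # smallest part: 16 clusters
top_cores = 64 * cores_per_cluster     # top-of-the-line part: 64 clusters
print(small_cores, top_cores)          # 256 1024

on_chip_mem_kb = 16 * 1024             # 16 MB shared across the 256-core part
per_core_kb = on_chip_mem_kb // small_cores
print(per_core_kb)                     # 64 KB of on-chip memory per core
```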
In the future, with this amount of core density, a supercomputer with dual-processor nodes, like HokieSpeed, could potentially pack up to 428,032 CPU cores into the same footprint, one quarter the size of System X’s.
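That 428,032 figure follows from keeping HokieSpeed’s node count and dual-processor layout while swapping in the top-end MPPA part; a hypothetical projection, for illustration only:

```python
nodes = 209                  # HokieSpeed's node count
processors_per_node = 2      # dual-processor nodes
cores_per_processor = 1024   # Kalray's top-of-the-line MPPA

total_cpu_cores = nodes * processors_per_node * cores_per_processor
print(total_cpu_cores)  # 428032
```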