April 12, 2012
An announcement from MIT discusses research that proposes to replace the traditional communication bus on processors with an on-chip network. The report explains why such an arrangement is much better suited to multicore, and especially manycore, architectures:
Today, a typical chip might have six or eight cores, all communicating with each other over a single bundle of wires, called a bus. With a bus, however, only one pair of cores can talk at a time, which would be a serious limitation in chips with hundreds or even thousands of cores.
Li-Shiuan Peh, an associate professor of electrical engineering and computer science at MIT, delivered more dismal news about the scalability of the bus architecture. Her research shows that a bus scales only to around eight cores; she points to 10-core chips that already have to resort to a second bus. She explains that buses also consume a lot of power, because they must drive data across long wires to many cores at the same time.
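The serialization problem described above can be illustrated with a back-of-the-envelope comparison: a shared bus permits one transfer at a time regardless of core count, while a 2-D mesh network lets every link carry a different transfer concurrently. The sketch below is purely illustrative arithmetic under those assumptions, not numbers from Peh's research.

```python
import math

def bus_concurrent_transfers(n_cores: int) -> int:
    # A single shared bus serializes everything: only one
    # sender/receiver pair communicates at a time, no matter
    # how many cores hang off the bus.
    return 1

def mesh_concurrent_transfers(n_cores: int) -> int:
    # In a k x k 2-D mesh there are 2*k*(k-1) neighbor links,
    # each of which can carry an independent transfer, so peak
    # concurrency grows roughly linearly with the core count.
    k = int(math.isqrt(n_cores))
    return 2 * k * (k - 1)

for n in (16, 64, 256, 1024):
    print(n, bus_concurrent_transfers(n), mesh_concurrent_transfers(n))
```

For 64 cores the mesh admits 112 simultaneous link transfers versus the bus's one, which is the scaling gap driving the move to on-chip networks.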
Last summer, Peh and her colleagues presented a paper at the Design Automation Conference in which they discussed the efficiency of an on-chip network and demonstrated the performance using a test processor. Instead of using an all-to-all connection, each core only connects to its nearest neighbors using on-chip routers, thereby reducing power requirements and increasing the scalability of the architecture.
The downside is that data must pass through a router at every intermediate core on the way to its final destination. Also, if two packets arrive at a particular router at the same time, one packet has to be buffered while the other is being processed.
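The hop-by-hop traversal described above can be sketched with dimension-order (XY) routing, a common textbook scheme for 2-D mesh networks in which a packet travels first along the X dimension, then along Y, visiting a router at each intermediate core. This is a generic illustration and not necessarily the routing used in Peh's test chip.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers a packet visits, inclusive,
    under dimension-order (XY) routing on a 2-D mesh."""
    x, y = src
    path = [(x, y)]
    # Route along the X dimension first...
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # ...then along the Y dimension.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

path = xy_route((0, 0), (3, 2))
print(path)           # [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]
print(len(path) - 1)  # hop count: 3 + 2 = 5
```

Each intermediate tuple is a router the packet must transit, which is exactly where two packets arriving simultaneously force one of them to be buffered.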
Despite such challenges, some manufacturers have already hopped off the bus. San Jose-based chipmaker Tilera, for example, employs an on-chip network in its manycore architecture. The company currently offers 32- and 64-core processors and plans to scale beyond 100 cores in the near future.
Intel also seems to be in on the trend. The company’s research lab has produced an experimental, 48-core processor named the “Single-chip Cloud Computer” (SCC). Although that’s just a plaything for researchers, the commercialization of Intel’s manycore MIC architecture, along with the recent acquisition of QLogic’s InfiniBand assets, could mean an on-chip network will be showing up on an x86 processor in the not-too-distant future.
Peh’s research suggests the bus architecture may be on its way out as processors delve into double-digit core territory. If research like that spurs chip vendors to design and build viable on-chip networks, it could usher in a new era of highly scalable processors.