A team of engineers from North Carolina State University and Intel has joined forces to address on-chip communication bottlenecks that hamper performance scaling for workloads that share data frequently. Their paper, “CAF: Core to Core Communication Acceleration Framework,” describes a novel core-to-core Communication Acceleration Framework (CAF).
The technique could provide relief for many HPC applications and other communication-dependent workloads such as pipelined packet processing, which is widely used in software-based networking.
The team’s approach was to take the software instructions responsible for communicating requests between cores and hardcode them into a small logic module, called a queue management device (QMD), which is attached to the network on chip.
CAF achieves 2X or better performance compared with traditional software queue implementations. Most impressively, the speedup grew as cores were added: at 16 cores, the QMD was 20X faster, according to an IEEE article. The QMD can also be used to offload basic computational functions, providing additional efficiencies. The technique was even shown to benefit MapReduce.
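For context, the kind of software queue that passes work between cores can be sketched as a single-producer, single-consumer ring buffer feeding a two-stage pipeline. This is only an illustration of the general pattern, not code from the CAF paper; the names `RingQueue` and `run_pipeline` are hypothetical, the doubling step is a stand-in for real packet work, and `None` is assumed never to be a valid packet.

```python
import threading
import time


class RingQueue:
    """Fixed-size single-producer/single-consumer ring buffer (illustrative only)."""

    def __init__(self, capacity):
        # One slot is left unused so a full queue can be told apart from an empty one.
        self.buf = [None] * (capacity + 1)
        self.head = 0  # next slot to read; updated only by the consumer
        self.tail = 0  # next slot to write; updated only by the producer

    def enqueue(self, item):
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False  # queue is full; the caller has to retry
        self.buf[self.tail] = item
        self.tail = nxt  # publish the item to the consumer
        return True

    def dequeue(self):
        if self.head == self.tail:
            return None  # queue is empty
        item = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return item


def run_pipeline(packets, capacity=4):
    """Stage 1 (producer thread) hands packets to stage 2 (consumer thread)."""
    q = RingQueue(capacity)
    out = []

    def producer():
        for p in packets:
            while not q.enqueue(p):
                time.sleep(0)  # yield until the consumer frees a slot

    def consumer():
        while len(out) < len(packets):
            p = q.dequeue()
            if p is None:
                time.sleep(0)  # nothing to do yet; yield to the producer
            else:
                out.append(p * 2)  # stand-in for per-packet processing

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Every hand-off in this sketch costs both sides index checks, retries and shared-memory updates performed in software. That bookkeeping is the sort of work the article describes the QMD taking over in hardware.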
Having seen what the offloading device can do, the researchers are identifying more opportunities for accelerating multicore computations. “The challenge is to figure out which software is used frequently enough that we could justify implementing it in hardware. There is a sweet spot,” said Yan Solihin, a professor of electrical and computer engineering and paper co-author.
The paper, “CAF: Core to Core Communication Acceleration Framework,” is being presented at the 25th Annual Conference on Parallel Architectures and Compilation Techniques (PACT 2016), in Haifa, Israel, this week (Sept. 11-15). The lead author is Yipeng Wang, a former Ph.D. student at NC State. Co-authors include Yan Solihin of North Carolina State University and Ren Wang, Andrew Herdrich and James Tsai of Intel Corporation.