The third annual Big Data and Extreme Computing conference convened in Barcelona last month, bringing together more than 100 experts from around the world to report on ground-breaking research at the intersection of big compute and big data. As with last year, the event played host to a wide range of relevant and timely presentations, including one from the nearby Barcelona Supercomputing Center on the RETHINK big initiative and the significance of co-design.
Barcelona Supercomputing Center’s Osman Ünsal spoke as a representative of RETHINK big, which was funded by the European Union’s Seventh Framework Programme with a budget of over 1.9M€. The two-year project launched nearly a year ago on March 1, 2014, to set a roadmap for big data hardware and networking technologies in Europe. By bringing together key European hardware, networking, and system architects with the key producers and consumers of big data, the project seeks to create a path to advanced data analytics with the goal of maximizing European competitiveness over the next decade.
At the BDEC for Europe workshop, a Europe-centric meeting held right before the main BDEC event, Ünsal reported on the progress RETHINK big has made as it approaches the one-year mark, including developments from the Working Group Meeting in September 2014 and the Synthesis Workshop in December 2014, as well as collaborative activities with the European Union’s Big Data Value Public Private Partnership.
Ünsal also spoke about a core element of RETHINK big, co-design, and the necessity of taking hardware into account when designing software and vice versa. After all, the goal of RETHINK big is to advance the hardware and networking side of data analytics while making sure the applications, algorithms and systems evolve in tandem.
In a position paper published in the event proceedings, Ünsal says the Moore’s law slowdown is driving the push to new hardware technologies and software designs, but it’s a mistake not to pursue the two in tandem; disregarding either side of this paradigm is a recipe for trouble. A hardware example that Ünsal cites is the Cell processor: the master-slave co-processor model made famous for powering the world’s first petascale supercomputer was notoriously difficult to program. He also cites the Itanium processor, with its Very Long Instruction Word (VLIW) design and a compiler that could not extract sufficient parallelism.
There have been similar mismatches on the software side when the software isn’t “hardware conscious.” Here Ünsal cites the top two TeraSort platforms for sorting 100TB of data. The number one platform runs on vanilla Hadoop and is thus easy to program, but it needs 57X more cores and 100X more memory to achieve only 2X the performance of the number two platform, TritonSort, which is hardware-optimized and written in C.
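A rough back-of-the-envelope calculation on those figures (a sketch using only the ratios quoted above, not numbers taken from the paper) shows how large the per-core gap really is:

```python
# Back-of-the-envelope using the ratios quoted above (illustrative only).
hadoop_cores_ratio = 57.0   # ~57x more cores than TritonSort
hadoop_speedup = 2.0        # ~2x the sorting performance

per_core_throughput = hadoop_speedup / hadoop_cores_ratio
print(f"Per-core throughput vs. TritonSort: ~{per_core_throughput:.2f}x")
# => roughly 0.04x, i.e. each Hadoop core does about 28x less useful work
```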
Ünsal’s paper goes on to discuss the hardware advances that BSC is pursuing to support data-intensive workloads, with a focus on emerging distributed/dataflow programming models.
“On one side, we foresee much more frequent use of 3D stacking to help solve the bandwidth and capacity issues,” he writes. “Stacking memory on top of logic will bring us closer to a processor-in-memory (PIM) type of architecture, at least in philosophy, and PIM architectures are a good fit for programming models which exploit locality and runtime-managed movement of data as well as migrating computation to data.”
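A minimal sketch of that philosophy, in plain Python with hypothetical names (this is not BSC’s runtime or any particular API): the task is dispatched to the partition that already holds its data, rather than the data being pulled to the task.

```python
# Hypothetical sketch of "migrating computation to data", the pattern that
# PIM-style architectures and locality-aware task runtimes favor.
partitions = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]}  # data resident "near" three memory stacks

def owner_of(key):
    # Hypothetical locality lookup: which partition holds this key's data?
    return key % len(partitions)

def run_near_data(key, task):
    # A real runtime would schedule the task on the compute logic stacked
    # under that memory; here we simply apply it to the local partition.
    return task(partitions[owner_of(key)])

print(run_near_data(4, sum))  # executes "where" partition 1 lives -> 15
```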
With the extra bandwidth enabled by 3D stacking, BSC researchers are looking at ways to exploit this easing of the memory I/O bottleneck, perhaps with vector processors, which are known to use bandwidth very efficiently.
Ünsal says BSC researchers also see a path forward for non-volatile memory technologies, such as memristors and STT-MRAM, which are attractive for their density, persistence and low leakage. “These densities are higher than current magnetic disk technology,” he adds, “and it implies that in the future all storage including disk and main memory might be composed of these non-volatile memories. This makes complicated file systems unnecessary and increases the relevance of data-driven task based programming models.”
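To illustrate the point about file systems, here is a sketch under the assumption of byte-addressable persistent memory (a memory-mapped file stands in for the NVM region): the application updates durable data in place instead of serializing it through a file-system stack.

```python
import mmap
import os
import struct

path = "counter.pmem"
if not os.path.exists(path):
    with open(path, "wb") as f:
        f.write(b"\x00" * 8)        # initialize one 64-bit slot

# Illustrative only: a memory-mapped file plays the role of byte-addressable
# non-volatile memory; real NVM would bypass the block/file-system layer entirely.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 8) as region:
        (value,) = struct.unpack("<Q", region[:8])
        region[:8] = struct.pack("<Q", value + 1)  # update durable data in place
        region.flush()                             # on real NVM: a cache-line flush, not an fsync
```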
BSC researchers are also exploring the use of emerging non-volatile memories in tandem with 3D stacking. As Ünsal points out, current 3D stacking implementations are limited by the thermal dissipation caused by increased leakage. Because emerging non-volatile technologies are virtually leakage-free, they would solve the thermal issue while preserving the added memory capacity. “This non-volatile 3D stacked multi-core processor-in-memory will facilitate a new architecture where object-based storage is a first-class citizen,” he writes.