Sick of big data? Not so fast. The age of the data-centric system has just begun. Tackling this subject in a recent blog post is Tilak Agerwala, Vice President of Data Centric Systems at IBM Research.
Agerwala observes: “Since the 1950s, the models studied by HPC systems have increased both in scale and in detail, with ever more sophisticated users calling for – and planning on – increased computational power. This increase is expressed in the form of Floating Point Operations per Second, or FLOPS. Aggregate installed FLOPS, as measured by the Top500, have increased exponentially since tracking began in 1993, going from less than 60 gigaflops to nearly 300 petaflops today. And the demand for increased FLOPS is not likely to abate for the foreseeable future.
“However, we are now seeing a new trend emerge that will dramatically change how HPC system design moves forward – the emergence of data as the world’s newest, and arguably largest, natural resource.”
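For a sense of how steep that curve is, here is a quick back-of-the-envelope sketch using the figures Agerwala cites. The 1993–2015 window is our own assumption; he dates the start of Top500 tracking but not the “today” endpoint.

```python
# Back-of-the-envelope: implied annual growth of aggregate Top500 performance,
# using the figures cited above (~60 gigaflops in 1993 to ~300 petaflops today).
start_flops = 60e9    # ~60 gigaflops, 1993
end_flops = 300e15    # ~300 petaflops, "today" (assumed here to be 2015)
years = 2015 - 1993   # assumed 22-year window

growth = end_flops / start_flops   # total growth factor, ~5,000,000x
annual = growth ** (1 / years)     # implied compound annual growth rate

print(f"total growth: {growth:,.0f}x")  # -> total growth: 5,000,000x
print(f"annual growth: {annual:.2f}x")  # -> annual growth: 2.02x
```

In other words, aggregate installed performance has roughly doubled every year since the list began.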
But HPC has always had two sides: problems have traditionally been divided into compute-centric and data-centric camps. If big data isn’t new to HPC, then why all the fuss?
Agerwala continues: “Today’s businesses, academic institutions and government agencies continue to have a multitude of applications for HPC, from fraud detection in transaction-based systems like credit cards serving millions of users simultaneously, to computational simulations of the human heart beating in real time, cell by cell. But unlike 60 years ago, these models now support a trillion-fold more data – leaping from kilobytes to petabytes, and growing.”
Adding a trillion-fold more of anything to a system is sure to upend the status quo; a petabyte is, after all, a trillion kilobytes.
“This onslaught of structured and unstructured data on HPC requires a flexible computing architecture capable of addressing the growing needs of the workloads this data scale demands,” he says. “In order to prepare for the future, IBM has adopted Data Centric Systems (DCS) design as our new paradigm for computing.”
In 2011, IBM supplied Lawrence Livermore National Laboratory with the Blue Gene/Q supercomputer Sequoia. Under Project Cardioid, Sequoia enabled the simulation of an entire human heart. Whole-system modeling is having a remarkable impact on science, medicine, and industry alike, but IBM is looking even farther ahead, to a time when data-centric systems will be “Systems of Insight.”
Data is the key to higher-resolution models that draw on multiple simultaneous, coupled simulations. But getting to this next stage will require 100x more compute power than the best systems of today, as well as a data-centric approach.
“True Data Centric Systems will be realized when we can consolidate modeling, simulation, analytics, Big Data, and machine learning and cognitive computing as ‘Systems of Insight,’” contends Agerwala. “They will provide these capabilities faster, with more flexibility and lower energy costs, effectively ushering in a new category of computers.”