Expect a lot of the talk at SC14 this year to revolve around big data. Ari E. Berman, Ph.D., Director of Government Services and Principal Investigator at BioTeam, Inc., attempts to cut through the noise with a recent post on the Intel blog page reflecting on how big data has upped the computational ante for life sciences.
“The biggest issue right now is the computational infrastructure needed to get to that mythical Big Data discovery place everyone talks about,” writes Berman.
He describes that place as “…the ability to take the sum total of data that’s out there for any particular subject, pool it together, and perform a meta-analysis on it to more accurately create a model that can lead to some cool discovery that could change the way we understand some topic.”
But with petabytes or more of data, realizing this vision requires properly converged infrastructure, without which “most people will spend all of their time just figuring out how to store and process the data, without ever reaching any conclusions,” according to Berman.
Like many other disciplines, the life science community has seen a sharp rise in data that shows no signs of slowing. The proliferation of laboratory equipment, including next-generation sequencers (NGS) and high-throughput, high-resolution imaging systems, brings with it a need for HPC technologies to process the enormous data streams. The coming era of personalized medicine will further strain the computational limits of the life science and biomedical fields. Despite the difficulty of hitting this moving target, Berman contends that life science research is up to the challenge.
As evidence of this, he cites three positive trends:
1. Science DMZs – specialized research-only networks that prioritize fast and efficient data flow.
2. A balance of local compute with hybridized public cloud infrastructures.
3. Low-cost commodity HPC/storage.
“We are at the stage now where most researchers spend a lot of their time just trying to figure out what to do with their data in the first place, rather than getting answers,” Berman states. “However, I feel that the field is at an inflection point where discovery will start pouring out as the availability of very powerful commodity systems and reference architectures come to bear on the market.”