IBM, which has bet big on Apache Spark as a kind of analytics operating system (a $300 million investment), yesterday announced the first cloud-based development environment for near real-time, high-performance analytics using Apache Spark and a variety of tools from IBM and others. According to IBM, the new set of capabilities, available on the IBM Bluemix cloud platform, lets data scientists access and ingest data and deliver insight-driven models to developers.
Spark is usually thought of as an “enterprise” tool, but IBM has also worked with traditional HPC users as the lines between HPC and data analytics continue to blur. For example, IBM, NASA, and the SETI Institute are using Apache Spark and IBM analytics to analyze more than six terabytes of complex deep space radio signals, hunting for patterns that might indicate the presence of intelligent extraterrestrial life.
According to IBM, SETI has been able to begin a new Stellar Pair Eavesdropping campaign, which enables the organization to look for potential communications between planets that might be orbiting in double star systems. “By extracting new features from millions of observations, researchers are able to use machine learning to classify signals and sharpen their focus for subsequent deep analysis on clusters of signals which are anomalous or outliers,” says IBM.
The new platform provides 250 curated data sets, open source tools, and a collaborative workspace, making it easier to rapidly develop applications that are infused with intelligence. Link to the full press release: http://www.datanami.com/this-just-in/ibm-launches-development-environment-apache-spark/