HPCwire


IBM Unveils Enterprise Stream Processing System



On Tuesday at the Securities Industry and Financial Markets Association (SIFMA) Technology Management Conference in New York, IBM announced System S, a software framework that uses a stream processing model to support a new class of applications. The result of a $5 million initiative at IBM Research, System S is designed to perform real-time analytics on high-throughput data streams.

The company will initially aim this technology at Wall Street trading applications, but the system is generally applicable to all kinds of real-time intelligence gathering. Relevant domains include surveillance, manufacturing, inventory management, public health, and biological research, among others. At this point, the System S technology is more than a prototype, but less than a product. This week's announcement is aimed at garnering interest from Wall Street firms that might want to partner with IBM to develop commercial applications.

The System S software is designed to run in a heterogeneous hardware environment, taking advantage of x86, Cell, Blue Gene, or even Power-based servers. Cell-based systems, in particular, appear to be well suited to these types of applications because of that processor's natural abilities as a stream computing platform. Suitable platforms can range from a single CPU up to 10,000 servers. The initial version of System S is targeted to IBM BladeCenters running Red Hat or SUSE Linux. According to IBM, in larger configurations, System S is capable of processing in the neighborhood of a million messages per second, depending on the application behavior and the nature of the data streams.

The intention of the framework is to host applications that turn heterogeneous data streams into actionable intelligence. The source of such streams could be manufacturing sensors, television broadcasts, market exchange streams, phone conversations, video feeds, email traffic, and so on. Essentially, the system works by enabling different types of software processing elements (PEs), or modules, to be strung together to act on data streams. The system exposes the profile of each processing element to the others in the framework so they can interoperate. The software contains an "Omniscient Scheduler" that ensures the data pipelines between the PEs operate efficiently. A user hypothesis or query drives the application and specifies the kinds of data correlations to be performed.
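To make the idea of exposed PE profiles concrete, here is a minimal Python sketch. It is not IBM's API: the `ProcessingElement` class, its `consumes` and `produces` fields, and the `chain` function are all hypothetical names invented for illustration. The sketch assumes a profile is nothing more than a declared input kind and output kind, and that PEs may be strung together only when those declarations line up.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class ProcessingElement:
    """Hypothetical PE: a name, a declared profile, and the work it performs."""
    name: str
    consumes: str              # kind of data this PE accepts (assumed profile field)
    produces: str              # kind of data this PE emits (assumed profile field)
    fn: Callable[[Any], Any]   # the processing step itself


def chain(pes: List[ProcessingElement], source_kind: str) -> Callable[[Any], Any]:
    """Verify that adjacent profiles are compatible, then compose the PEs."""
    kind = source_kind
    for pe in pes:
        if pe.consumes != kind:
            raise TypeError(
                f"{pe.name} expects {pe.consumes!r} but upstream produces {kind!r}"
            )
        kind = pe.produces

    def run(item: Any) -> Any:
        # Pass one stream item through every PE in order.
        for pe in pes:
            item = pe.fn(item)
        return item

    return run


# Two toy PEs: split text into tokens, then count the tokens.
tokenize = ProcessingElement("tokenize", "text", "tokens", str.split)
count = ProcessingElement("count", "tokens", "number", len)

pipeline = chain([tokenize, count], source_kind="text")
print(pipeline("oil prices rise"))
```

Checking profiles when the pipeline is assembled, rather than when data arrives, means a mismatched pair of PEs fails immediately instead of partway through a live stream. The scheduling side of System S is not modeled here.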

For example, if one were searching for a certain subject matter in conversations being conducted over a secure telephone line, this would require a number of stream processing elements. The first step would be to pass the communication feed into a data decryption PE, which would produce decrypted audio. Then, using a speech recognition PE, the audio stream would be converted into text. Next, the text data would pass through a semantic analyzer PE to identify those conversations that contained content of interest. If one were processing many such conversations, the system could automatically focus on those that met the specified criteria and drop the remainder. A more complex application with additional data feeds could be accommodated by plugging in the appropriate PEs.
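The three-stage example above can be sketched with Python generators, which pass items downstream one at a time and so mimic a stream. Everything here is a toy stand-in and an assumption, not part of System S: ROT13 takes the place of real decryption, the "audio" is already text that only gets normalized, and the semantic analyzer is a plain keyword match. All function names are hypothetical.

```python
import codecs
from typing import Iterable, Iterator, Set


def decrypt_pe(feed: Iterable[str]) -> Iterator[str]:
    """Toy decryption PE: ROT13 stands in for a real cipher."""
    for message in feed:
        yield codecs.decode(message, "rot13")


def speech_to_text_pe(audio: Iterable[str]) -> Iterator[str]:
    """Toy speech recognition PE: the 'audio' is already text, so just normalize it."""
    for clip in audio:
        yield clip.lower().strip()


def semantic_filter_pe(texts: Iterable[str], topics: Set[str]) -> Iterator[str]:
    """Toy semantic analyzer PE: keep conversations mentioning a topic, drop the rest."""
    for text in texts:
        if topics & set(text.split()):
            yield text


def run_pipeline(feed: Iterable[str], topics: Set[str]) -> Iterator[str]:
    """String the three PEs together in the order described in the text."""
    return semantic_filter_pe(speech_to_text_pe(decrypt_pe(feed)), topics)


# Simulated encrypted feed of three conversations.
conversations = ["Lunch at noon", "Crude oil shipment delayed", "See you tomorrow"]
feed = [codecs.encode(c, "rot13") for c in conversations]

matches = list(run_pipeline(feed, {"oil", "pipeline"}))
print(matches)
```

Because each stage is a generator, a conversation that fails the filter is discarded as soon as it is examined and nothing needs to be stored first. Adding another feed or analysis step amounts to writing another generator and inserting it into `run_pipeline`.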

According to Nagui Halim, director of high performance stream computing at IBM, System S represents a significant departure from current intelligence extraction, which traditionally relies on fixed-format data that has been stored on a disk somewhere. This model can only provide a retrospective look at a problem. By contrast, System S applications are able to take unstructured raw data and process it in real time. And rather than performing simple data mining or recreating a simulation of some well-defined structure or process, System S applications attempt to make correlations and generate some type of prediction. In addition, the system is supposedly capable of refining its behavior over time by learning from the successes and failures of past correlations.

"This is about what's going to happen," explains Halim. "The thesis is that there are many signals that foreshadow what will occur if we have a system that is smart enough to pick them up and understand them. We tend to think it's impossible to predict what's going to happen; and in many cases it is. But in other cases there is a lot of antecedent information in the environment that strongly indicates what's likely to be occurring in the future."

To Halim's surprise, his research showed that streaming data analytics was a much better tool than he had expected for many classes of applications. He discovered that events are often very predictable if one examines the correct data. For example, a "perfect storm" is the result of a number of subtler conditions that build up over time and interact to produce a big event.

If successfully implemented, predictive systems would have high value for a range of enterprises and government organizations. This is especially true in the financial services industry, where accurate forecasts of options and derivatives pricing can translate directly into profits. Being able to correlate market activity with the effects of qualitative data, like news events, would open up some interesting avenues for financial trading applications. IBM envisions algorithmic trading engines connected to media feeds such as CNN and Al Jazeera to correlate news reports with financial market behavior. For example, an application could be set up to look for events that could precipitate an oil price spike in the next ten minutes.

An application could also be devised to search for rogue traders or money laundering activities. Traditionally this is accomplished by examining account histories and manually inspecting suspicious transactions. But this sort of retrospective analysis may allow the perpetrator to get away.
