September 03, 2008
Netezza is a six-year-old company that's been on the edge of my radar screen for a while. That's mostly because they sell data warehousing appliances -- not exactly my idea of mainstream high performance computing. But what the company really does is marry data warehousing with streaming analytics. And it does it in a sort of sexy way, geek-wise.
In a conventional data warehousing setup, you have a database stored on a SAN, which is connected to a mainframe or, more likely, a compute cluster. Processing takes place after the data is loaded from the storage hardware onto the computing hardware. In a transactional database app, this is fine and dandy, since the data volumes and the amount of processing usually don't push against the limits of the network's bandwidth and latency.
In streaming applications, lots of data must be processed in real time or close to it. And when I say lots of data here, I'm talking terabytes. This type of software is most commonly associated with "business intelligence," but streaming apps encompass an even wider range -- everything from data mining to financial analytics to intelligence gathering. In this environment, the compute-storage links can easily become a communication bottleneck. Netezza appliances attempt to rectify this by placing the storage and compute pieces in close proximity and by providing a streaming framework for the applications. Here's how the company describes it:
Rather than shuttling data between disk and memory for processing once a query comes in, which creates the bottleneck, data streams off the disk and through query logic loaded into an FPGA (field programmable gate array). The FPGA and processor (a PowerPC chip), together with 400 GB of disk storage, reside on each of the massively parallel nodes that Netezza calls snippet processing units (SPUs). Each of our Netezza racks contains 112 of these SPUs. Queries are optimized across the SPUs for maximum performance and power efficiency. A Linux host server aggregates SPU results and manages query workload and the results are returned to the user.
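To make that description a little more concrete, here is a toy Python sketch of the general pattern: each node filters its own slice of the data as it streams by and returns only a small partial result, which a host then combines. The function names, the row layout and the sample query are my own inventions for illustration, not Netezza's API. Real SPUs do the filtering in FPGA hardware and run in parallel, where this sketch uses a plain sequential loop. The only figures taken from the description are the 112 SPUs per rack and 400 GB of disk per SPU.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]

SPUS_PER_RACK = 112    # figure from the vendor description
DISK_GB_PER_SPU = 400  # figure from the vendor description


def rack_raw_capacity_gb(spus: int = SPUS_PER_RACK,
                         gb_per_spu: int = DISK_GB_PER_SPU) -> int:
    """Raw disk capacity of one rack, ignoring any overhead."""
    return spus * gb_per_spu


def stream_filter(rows: Iterable[Row],
                  predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Stand-in for the FPGA stage: drop rows as they stream off disk."""
    for row in rows:
        if predicate(row):
            yield row


def spu_partial_sum(rows: Iterable[Row],
                    predicate: Callable[[Row], bool],
                    column: str) -> Tuple[float, int]:
    """Work done on one (hypothetical) SPU: filter locally, return a partial."""
    total, count = 0, 0
    for row in stream_filter(rows, predicate):
        total += row[column]
        count += 1
    return total, count


def host_aggregate(partials: Iterable[Tuple[float, int]]) -> Tuple[float, int]:
    """Stand-in for the host server: combine the per-SPU partial results."""
    total, count = 0, 0
    for part_total, part_count in partials:
        total += part_total
        count += part_count
    return total, count


def run_query(shards: List[List[Row]],
              predicate: Callable[[Row], bool],
              column: str) -> Tuple[float, int]:
    """Run the same filter-and-sum on every shard, then merge on the host."""
    partials = [spu_partial_sum(shard, predicate, column) for shard in shards]
    return host_aggregate(partials)


if __name__ == "__main__":
    # Two tiny shards standing in for the data held by two SPUs.
    shards = [
        [{"region": "west", "amount": 10}, {"region": "east", "amount": 5}],
        [{"region": "west", "amount": 7}],
    ]
    print(run_query(shards, lambda r: r["region"] == "west", "amount"))
    print(rack_raw_capacity_gb(), "GB raw per rack")
```

The point of the arrangement is that only the small per-node partials cross the link to the host, not the raw rows, which is where the bandwidth saving would come from.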
Not exactly a commodity solution. But unlike a lot of other vendors peddling unique HPC solutions, Netezza has managed to attract some big name customers including Amazon, AOL, The American Red Cross, CNET Networks, Nationwide Financial Services, Sandia National Laboratories, and the US Army Corps of Engineers. All told, the company has collected 58 customers.
On Wednesday, Netezza announced five new applications for their platform:
The new apps are the result of the Netezza Developer Network (NDN), a program the company launched in September 2007. The idea was to attract developers to write analytic applications for the Netezza platform.
Since many data warehousing applications are evolving from an online transaction processing (OLTP) model to an online analytical processing (OLAP) one, Netezza may have hit the market at the right time. The company is not alone, however, and has to deal with larger vendors like IBM, HP, Oracle, and SAS, as well as a posse of smaller firms like Teradata and Greenplum. So far so good, though. Netezza went public in July 2007 and for the last two quarters has reported profits and growing revenue. In an industry that has mostly punished vendors that dared to offer non-commodity solutions, Netezza may be a refreshing exception.
Posted by Michael Feldman - September 02, 2008 @ 9:00 PM, Pacific Daylight Time
Michael Feldman is the editor of HPCwire.