HPCwire


Startup Launches Highly Parallel Storage System


Atrato, Inc., a new storage vendor, emerged from stealth mode this week to unveil the company's first product: the Velocity1000 (V1000) storage system. The new product offers 11,000 input/output operations per second (IOPS) and 50 terabytes of raw disk capacity in a 3U rack. The company relaunched itself back in February, when it changed its name from Sherwood Information Partners to Atrato and announced $18 million in funding. It also revealed some of its big-name backers, including Jesse Aweida, founder, former president and CEO of StorageTek; Tom Porter, formerly CTO of Seagate; Gary Gentry, SVP at Maxtor and Seagate; and Dick Blaschke, an IBM and EMC veteran.

The V1000 product is a unique storage appliance aimed at the high performance computing, digital entertainment and web sectors, where I/O performance and cost are big driving factors. The offering is designed to address a growing problem in high-end storage: the imbalance between storage capacity and I/O performance. Capacity is doubling every 18-24 months. But access speeds -- data transfer rate, seek time and rotational latency -- are only increasing around five percent per year. With capacity growing so much faster than access speeds, application responsiveness is suffering. This is especially true for applications that have a random data access pattern.
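
To put rough numbers on that imbalance, the following Python sketch compounds the two growth rates quoted above. The ten-year horizon and the function names are illustrative assumptions, not figures from Atrato.

```python
def capacity_growth(years, doubling_months):
    """Growth factor for capacity that doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


def speed_growth(years, annual_rate=0.05):
    """Growth factor for access speed improving `annual_rate` per year."""
    return (1 + annual_rate) ** years


def capacity_to_speed_gap(years, doubling_months, annual_rate=0.05):
    """How many times faster capacity has grown than access speed."""
    return capacity_growth(years, doubling_months) / speed_growth(years, annual_rate)


YEARS = 10  # assumed horizon, chosen only for illustration
for months in (18, 24):
    gap = capacity_to_speed_gap(YEARS, months)
    print(f"capacity doubling every {months} months: gap after {YEARS} years is about {gap:.0f}x")
```

Under these assumptions, capacity outgrows access speed by roughly 20x to 60x over a decade, which is why each stored byte becomes steadily slower to reach.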

Atrato's goal was to create a high performance, highly dense, but energy-efficient storage system. It uses its patented Self-maintaining Array of Identical Disks (SAID) technology to construct a highly dense, sealed enclosure that is guaranteed to be maintenance-free for at least three years (more about how this is done in just a moment). The company also claims it can achieve all this with much lower energy use than conventional storage. "At a given performance level, we use 80 percent less power than commercially available systems today, whether it's NetApp, LSI or DataDirect Networks," says Dan McCormick, Atrato co-founder and CEO. The company says a V1000 setup can deliver 17.3 IOPS/Watt versus a typical industry figure of 4 IOPS/Watt.
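
The two IOPS/Watt figures can be checked against the 80 percent claim with a little arithmetic. The sketch below assumes both efficiency figures hold at the same workload. The 11,000 IOPS target is borrowed from the V1000 spec purely as an example, and the helper name is hypothetical.

```python
V1000_IOPS_PER_WATT = 17.3    # figure quoted by Atrato
TYPICAL_IOPS_PER_WATT = 4.0   # "typical industry figure" quoted by Atrato


def watts_needed(target_iops, iops_per_watt):
    """Power required to sustain `target_iops` at a given efficiency."""
    return target_iops / iops_per_watt


target = 11_000  # assumed workload; the relative saving does not depend on it
v1000_watts = watts_needed(target, V1000_IOPS_PER_WATT)
typical_watts = watts_needed(target, TYPICAL_IOPS_PER_WATT)
saving = 1 - v1000_watts / typical_watts

print(f"V1000: {v1000_watts:.0f} W, typical: {typical_watts:.0f} W, saving: {saving:.0%}")
```

The saving works out to about 77 percent, which is in the same range as McCormick's "80 percent less power" statement.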

The majority of the power savings come from the building blocks of the disk enclosure. Instead of using 3.5 inch enterprise-class SATA or SAS disks, Atrato engineers decided to use lots of mobile-class 2.5 inch SATA disks in their drive enclosure. Mobile SATA disks are built for power constrained platforms like laptop computers, but tend to be lower capacity -- 100 GB to 320 GB. The smaller size of the disks compared to their enterprise counterparts actually contributes to their energy efficiency, since the moving parts don't have to travel as fast or as far.

The overall approach is to use mass parallelization of these relatively small disk drives to construct a more efficient system. This is analogous to the manycore approach for processors, where lots of simpler, less powerful cores are used to build a high performance computer. In the case of the V1000, the more granular storage model improves random access performance and energy efficiency at the same time. Efficiency is also increased by the system software, which manages data placement on the disks in order to optimize the seek operations.

The Atrato engineers had to overcome a number of drawbacks of mobile-class drives to make the system reliable. In general, these devices exhibit rotational vibration instabilities. When they get too close to each other, drive performance can drop by 70 percent or more. Heat and signal integrity can also become a problem when they're packed closely together. The engineers were able to design proprietary drive packaging that circumvents these problems. According to McCormick, with this special packaging, they've been able to derive enterprise-class performance from mobile-class hardware.

The company guarantees three years of maintenance-free operation for its enclosure, with no disk replacements required. By contrast, in a conventional enterprise setup, when a drive fails, a light blinks on the front panel and then a call is made for somebody to come out and perform the drive replacement (hopefully the worker pulls the right one and doesn't bring the system down in the process). In most cases, when the offending drive is sent back to the factory, no problem is detected. "We take that same process and move it inside the box," says McCormick.

Within the SAID enclosure is a virtual spare -- extra capacity that is ready in case of a drive failure. In fact, at any given time, the system is replicating data from 15 to 20 of the most suspect drives. So when a failure occurs, the drive is taken off-line and put in the "drive hospital." Diagnostics are used to determine what's wrong. In many cases, the error can be isolated and the drive can be put back online. During that time, the user is unaware anything has happened, since there has been no interruption of service or performance hit.
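
The flow described above can be sketched in Python. This is only an illustration of the idea, not Atrato's software: the `Drive` class, the `error_count` health signal, and the `diagnose` callback are all hypothetical, and the count of 15 mirrored drives is taken from the low end of the range quoted above.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    drive_id: int
    error_count: int = 0   # hypothetical health signal
    state: str = "online"  # "online", "hospital", or "retired"


def most_suspect(drives, n):
    """Pick the n online drives with the worst health signal."""
    online = [d for d in drives if d.state == "online"]
    return sorted(online, key=lambda d: d.error_count, reverse=True)[:n]


def handle_failure(drive, mirrored_ids, diagnose):
    """Move a failed drive to the hospital, then repair or retire it.

    Returns True if the drive's data was already replicated to spare
    capacity, so no interruption of service is needed.
    """
    already_replicated = drive.drive_id in mirrored_ids
    drive.state = "hospital"
    if diagnose(drive):          # error isolated: put the drive back online
        drive.error_count = 0
        drive.state = "online"
    else:                        # beyond repair: stays off-line for good
        drive.state = "retired"
    return already_replicated


# Example: 100 drives with made-up error counts, 15 of them pre-replicated.
drives = [Drive(i, error_count=i % 7) for i in range(100)]
suspects = most_suspect(drives, 15)
mirrored = {d.drive_id for d in suspects}

failed = suspects[0]
covered = handle_failure(failed, mirrored, diagnose=lambda d: False)
print(failed.drive_id, failed.state, covered)
```

The point of the design is that replication happens before the failure, so the repair-or-retire decision can be made at leisure inside the sealed enclosure.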

By anticipating drive failure, McCormick says they've been able to eliminate any single point of failure. Even if the hardware is beyond redemption, the drive just remains off-line for the life of the product and the software works around it. The system employs a variety of RAID technologies (RAID 5, 6, 10 and 50) as well as its predictive rebuild technology to support this level of reliability. According to testing done by Atrato engineers, they've been able to empirically model a three to five year time frame for sustaining the product's performance and reliability.

As in any redundant storage system, a certain amount of capacity has to be sacrificed. The system allows users to trade off capacity against reliability. At the high end, McCormick says as much as 80 to 90 percent of the raw storage capacity is available to the user. More conservative users can drive the usable capacity down to 50 percent or even lower if they choose to maintain the highest levels of reliability. For many customers, this would be a reasonable tradeoff, since storage capacity is cheap and getting cheaper, while drive maintenance costs are exactly the opposite.
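
Standard RAID arithmetic gives a feel for where those percentages come from. The sketch below uses textbook usable-capacity formulas. The group size of 10 drives is an assumption, and the calculation ignores whatever capacity the SAID enclosure reserves for its virtual spare, which the company has not detailed.

```python
def usable_fraction(level, group_size):
    """Textbook usable-capacity fraction for one RAID group."""
    if level in ("RAID 5", "RAID 50"):   # one parity drive per group
        return (group_size - 1) / group_size
    if level == "RAID 6":                # two parity drives per group
        return (group_size - 2) / group_size
    if level == "RAID 10":               # every block is mirrored
        return 0.5
    raise ValueError(f"unsupported RAID level: {level}")


RAW_TB = 50       # raw capacity of the V1000 as announced
GROUP_SIZE = 10   # assumed drives per RAID group

for level in ("RAID 5", "RAID 6", "RAID 10", "RAID 50"):
    fraction = usable_fraction(level, GROUP_SIZE)
    print(f"{level}: {fraction:.0%} usable, about {RAW_TB * fraction:.0f} TB of {RAW_TB} TB raw")
```

With ten-drive groups, RAID 5 and RAID 6 land at 90 and 80 percent usable capacity, while mirroring with RAID 10 gives 50 percent, matching the range McCormick describes.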

Atrato's initial customers are likely to be users that have strict performance and power requirements and/or require maintenance-free operation. The first customer announced this week is SRC Computers. They have integrated the V1000 into a system for a government sector customer who needed high levels of random access performance. That SRC system achieves 20,000 IOPS with 14 terabytes of usable capacity.

Atrato is not announcing its pricing at this point, but McCormick says that a 20 terabyte (raw capacity) system starts somewhere in the $150K range. There are certainly less expensive storage systems out there on a price/gigabyte basis (based on high-capacity 3.5 inch SATA), but on a dollar/IOPS basis, the highly parallelized Atrato architecture gives the V1000 the edge.

In fact, the company isn't going head to head against mainstream enterprise storage solutions. Systems with really big storage tend to be used by applications that don't need extreme levels of I/O performance or are accessing data sequentially on the disk. Atrato's niche is where near-instantaneous data transactions are required. McCormick sees his competition as the emerging technologies of flash disk and RAM-based external storage. At this point, he thinks those technologies are not quite ready for prime time because of a combination of price, performance and reliability issues. But, he says, when solid state drives make sense, they'll be happy to bring them into their product line.

