HPCwire


Panasas Invents 'Tiered Parity'


In 1988 Garth Gibson at the University of California, Berkeley, co-authored a paper titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)," which outlined the basic principles of using arrays of small, cheap disks to increase data reliability and I/O performance. RAID went on to become a widely adopted storage technology throughout the industry, while Gibson co-founded Panasas Inc., a storage cluster vendor for high performance computing applications.

This week, Gibson and company claim that they have implemented the most significant extension to disk array data reliability since the original RAID paradigm was developed. Their new architecture is called "tiered parity." In this model, Panasas has built "vertical parity" and "network parity" on top of their existing RAID 5 "horizontal parity" implementation.

The RAID 5 approach, as it was outlined in the original paper, consists of striping data and parity across multiple disks. It enables error recovery for single disk failures and increases performance via parallel reads and writes. This technology is widely used in storage systems today. Panasas' own implementation of RAID 5, called "ObjectRAID," is based on storage objects rather than blocks. The added intelligence is designed to reduce reconstruction times when a disk failure occurs.
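
The horizontal parity idea can be sketched in a few lines. This is a minimal illustration, not Panasas' ObjectRAID code: it assumes the parity unit is the bytewise XOR of the data units in a stripe, and the helper names (`xor_blocks`, `rebuild_missing`) are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripe, missing_index):
    """Recompute one lost unit of a stripe from all the surviving units."""
    survivors = [unit for i, unit in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)

# One stripe: three data units plus their parity unit, each on a different disk.
data_units = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data_units + [xor_blocks(data_units)]

# Pretend the disk holding unit 1 failed and rebuild it from the rest.
recovered = rebuild_missing(stripe, missing_index=1)
```

In this sketch the rebuild only works if every surviving unit can be read in full.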

But no RAID 5 technology can handle a media error, also known as an unrecoverable read error (URE), if it occurs during reconstruction of a failed disk. When this occurs, the RAID data cannot be rebuilt from disk; a backup (usually on tape) has to be used to recover the entire array. Ten years ago, this wasn't a serious problem. With 50 GB SATA disk drives, a media error was very unlikely to occur while reading a single disk, since the rate of failure is about one error every 10^14 bits (12.5 terabytes), a rate that has remained constant for over a decade. And when a media error did happen to occur during reconstruction, a 50 GB disk took only a few hours to recover from tape.

Times have changed. Disks have become much bigger and denser. Capacities of 500 to 750 GB are common today, and one terabyte disks will soon be the norm. That means when a disk goes south, the odds of hitting a media error during recovery are much greater, and recovery from tape can take days or weeks.

Imagine a RAID array of seven 1 TB disks. When one disk fails, the chance of hitting a URE while recovering the data from the six remaining disks is now about 50/50. When 2 TB disks hit the market in 2009, the disk failure plus media error scenario becomes almost a sure bet. Recovering the storage array from backup tape could take a month. For high end computing applications that use tens or hundreds of terabytes of data, this would be a disaster.
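
The arithmetic behind these odds can be sketched as follows. This is a back-of-the-envelope model, not a vendor formula: it assumes decimal terabytes, independent bit errors at the quoted rate of one per 10^14 bits, and a full read of every surviving disk. The function name `ure_odds` is made up for the example.

```python
import math

URE_RATE = 1e-14  # unrecoverable read errors per bit read (rate quoted in the article)

def ure_odds(surviving_disks, disk_tb, rate=URE_RATE):
    """Return (expected error count, probability of at least one error)
    when every surviving disk is read in full during a rebuild."""
    bits_read = surviving_disks * disk_tb * 1e12 * 8  # decimal TB -> bits
    expected = bits_read * rate
    # Poisson approximation for many independent, very rare events.
    p_at_least_one = 1.0 - math.exp(-expected)
    return expected, p_at_least_one

scenarios = [
    ("one 50 GB disk", 1, 0.05),
    ("six 1 TB disks", 6, 1.0),
    ("six 2 TB disks", 6, 2.0),
]
for label, disks, size_tb in scenarios:
    expected, p = ure_odds(disks, size_tb)
    print(f"{label}: expected errors {expected:.3f}, P(at least one) {p:.0%}")
```

Under these assumptions, reading six 1 TB disks gives roughly 0.48 expected errors, which is in the neighborhood of the 50/50 figure; the modeled probability of at least one error comes out somewhat below one half, and it rises steeply as disk capacity doubles.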

"I think what people are becoming aware of is that the data integrity provided by RAID 5 is basically no longer sufficient," says Robin Harris, senior analyst at Data Mobility Group. "RAID 5 will only protect across a single disk failure, so it's going away as a [standalone] data protection strategy."

To address this problem, Panasas invented vertical parity. Essentially, they've added RAID within each disk, generating a parity sector from a group of data sectors on that disk. If a media error makes one of those sectors unreadable, the local parity sector can be used to recompute the missing data. According to Panasas, vertical parity gets the error rate down to between one in 10^18 and one in 10^19, which is 1000 to 10,000 times better than the URE rate. The extra parity information uses 10 percent of the disk capacity, but Panasas claims there is no performance hit. So scalability is built in.
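
A rough sketch of the within-disk idea follows. The article does not describe Panasas' on-disk layout, so the details here are assumptions: sectors are grouped nine at a time with one XOR parity sector per group (chosen only because it yields the quoted 10 percent overhead), sectors are shortened to 8 bytes for readability, and a media error is modeled as a sector that reads back as `None`.

```python
SECTOR_SIZE = 8   # bytes; real sectors are far larger, shortened for the example
GROUP_SIZE = 9    # data sectors per parity sector (assumed: 1 sector in 10 is parity)

def xor_sectors(sectors):
    """Bytewise XOR of a list of equal-length sectors."""
    out = bytearray(SECTOR_SIZE)
    for sector in sectors:
        for i, byte in enumerate(sector):
            out[i] ^= byte
    return bytes(out)

def write_group(data_sectors):
    """Lay out one group on a single disk: the data sectors followed by their parity."""
    assert len(data_sectors) == GROUP_SIZE
    return list(data_sectors) + [xor_sectors(data_sectors)]

def read_sector(group, index):
    """Return a sector, repairing it from the rest of its group after a media error."""
    sector = group[index]
    if sector is not None:
        return sector
    others = [s for i, s in enumerate(group) if i != index]
    if any(s is None for s in others):
        raise IOError("more than one bad sector in the group")
    return xor_sectors(others)

data = [bytes([n]) * SECTOR_SIZE for n in range(GROUP_SIZE)]
group = write_group(data)
group[4] = None                  # simulate an unrecoverable read error on sector 4
repaired = read_sector(group, 4)
```

In this sketch a single bad sector is repaired locally, without touching any other disk; two bad sectors in the same group cannot be repaired at this level.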

A word here should be said about RAID 6 technology (also known as double parity), which some vendors use for an additional level of data protection. This scheme was designed to guard against a double disk failure, which it does. Sort of. The problem is that RAID 6 doesn't protect against subsequent media errors after the second disk goes down, which, as discussed above, is becoming increasingly likely. Here, it has the same problem as RAID 5. However, RAID 6 can be used to recover from the single disk failure plus media error scenario. But the performance hit for dual parity compared to single parity is significant. So it's a mixed bag and doesn't directly address the media error problem.

On top of its horizontal and vertical parity schemes, Panasas has added a further layer of network parity protection. At this level, parity checking is done on the client side, to make sure the data delivered by the storage system wasn't corrupted on its way to the user. Because of increasing I/O bandwidth and the number of hardware and software components between the external data and the application, there are increasing opportunities for good data to go bad. Firmware, server hardware, server software, network components and transmission media can all potentially mangle valid data unbeknownst to the application. With network parity, the client receives an error notification when bad data is detected.
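
One way to picture the client-side check is sketched below. The article does not say how Panasas computes or transports network parity, so this assumes the simplest form: the client receives the data units of a stripe together with the parity unit and recomputes the XOR itself. The function and exception names are hypothetical.

```python
class DataCorruptionError(Exception):
    """Raised on the client when received data does not match its parity."""

def xor_units(units):
    """Bytewise XOR of a list of equal-length byte strings."""
    out = bytearray(len(units[0]))
    for unit in units:
        for i, byte in enumerate(unit):
            out[i] ^= byte
    return bytes(out)

def verify_on_client(data_units, parity_unit):
    """Recompute parity over what actually arrived and compare with the parity sent."""
    if xor_units(data_units) != parity_unit:
        raise DataCorruptionError("parity mismatch: data was altered before reaching the client")
    return b"".join(data_units)

# Storage side: data and parity as written.
stored = [b"blk0", b"blk1", b"blk2"]
parity = xor_units(stored)

# Clean transfer: the check passes and the payload is handed to the application.
payload = verify_on_client(stored, parity)

# A single flipped bit somewhere along the path is detected.
damaged = [stored[0], bytes([stored[1][0] ^ 0x01]) + stored[1][1:], stored[2]]
try:
    verify_on_client(damaged, parity)
    detected = False
except DataCorruptionError:
    detected = True
```

A check of this kind detects that something changed in transit; on its own it does not identify which unit is bad.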

The tiered parity technology will be included in the next version of Panasas' ActiveScale operating environment, version 3.2. A beta is due out next month, with general availability by the end of the year. The additional parity levels can be turned off if the user believes they're not needed for a particular environment. According to Panasas, the tiered parity technology doesn't exact a performance hit on top of the existing RAID 5 implementation, but, as stated above, the vertical scheme does eat an additional 10 percent of the storage -- that's in addition to the 10 percent used by the RAID 5 implementation.

Although the overall concepts of the three-tiered architecture are fairly general, Panasas is attempting to protect its new invention. "We actually have a patent pending on this tiered parity concept, particularly the vertical parity," says Larry Jones, VP of Marketing at Panasas. "Could someone copy it? Who knows? But we are trying to protect this specific idea."


