HPCwire

The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing

Panasas Invents 'Tiered Parity'


In 1988 Garth Gibson, then at the University of California, Berkeley, co-authored a paper titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)," which outlined the basic principles of using arrays of small, cheap disks to increase data reliability and I/O performance. RAID went on to become a widely adopted storage technology throughout the industry, while Gibson co-founded Panasas Inc., a storage cluster vendor for high performance computing applications.

This week, Gibson and company claim that they have implemented the most significant extension to disk array data reliability since the original RAID paradigm was developed. Their new architecture is called "tiered parity." In this model, Panasas has built "vertical parity" and "network parity" on top of their existing RAID 5 "horizontal parity" implementation.

The RAID 5 approach, as it was outlined in the original paper, consists of striping data and parity across multiple disks. It enables error recovery for single disk failures and increases performance via parallel reads and writes. This technology is widely used in storage systems today. Panasas' own implementation of RAID 5, called "ObjectRAID," is based on storage objects rather than blocks. The added intelligence is designed to reduce reconstruction times when a disk failure occurs.
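As a rough illustration of the horizontal parity idea, the Python sketch below computes one parity unit as the byte-wise XOR of the data units in a stripe, then rebuilds a lost unit from the survivors. It is a generic RAID 5-style example, not Panasas' ObjectRAID; the function names and the three-data-unit stripe are assumptions made for the example.

```python
from functools import reduce


def xor_units(units):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))


def rebuild_missing(stripe, lost_index):
    """Recompute the unit at lost_index from all the other units in the stripe."""
    survivors = [u for i, u in enumerate(stripe) if i != lost_index]
    return xor_units(survivors)


# One stripe: three data units plus one parity unit, each on a different disk.
data_units = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data_units + [xor_units(data_units)]

# Simulate losing the disk that held the second data unit.
recovered = rebuild_missing(stripe, lost_index=1)
print(recovered == data_units[1])  # prints True
```

Because XOR is its own inverse, the same routine rebuilds either a data unit or the parity unit, but only when exactly one unit in the stripe is missing.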

But no RAID 5 technology can handle a media error, also known as an unrecoverable read error (URE), if it occurs during reconstruction of a failed disk. When this occurs, the RAID data cannot be rebuilt from disk; a backup (usually on tape) has to be used to recover the entire array. Ten years ago, this wasn't a serious problem. With 50 GB SATA disk drives, a media error was very unlikely to occur while reading a single disk, since the rate of failure is about one error every 10^14 bits (12.5 terabytes), a rate that has remained constant for over a decade. And when a media error did happen to occur during reconstruction, a 50 GB disk took only a few hours to recover from tape.

Times have changed. Disks have become much bigger and denser. Capacities of 500 to 750 GB are common today, and one terabyte disks will soon be the norm. That means when a disk goes south, the odds of hitting a media error during recovery are much greater, and recovery from tape can take days or weeks.

Imagine a RAID array of seven 1 TB disks. When one disk fails, the chances of hitting a URE while recovering the data from the six remaining disks are now about 50/50. When two-terabyte disks hit the market in 2009, the disk failure plus media error scenario becomes almost a sure bet. Recovering the storage array from backup tape could take a month. For high-end computing applications that use tens or hundreds of terabytes of data, this would be a disaster.
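The arithmetic behind these odds can be sketched in a few lines of Python. The model is an assumption, not something the vendors publish: every bit read is treated as failing independently with probability 1 in 10^14, and capacities use decimal units (1 TB = 10^12 bytes). The name `p_media_error` is made up for the example.

```python
import math

URE_PER_BIT = 1e-14  # one unrecoverable read error per 10^14 bits, as quoted above


def p_media_error(bytes_read, ure_per_bit=URE_PER_BIT):
    """Probability of at least one URE while reading bytes_read bytes."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed stably for very small p
    return -math.expm1(bits * math.log1p(-ure_per_bit))


GB = 10**9
TB = 10**12

for label, bytes_read in [
    ("one 50 GB disk", 50 * GB),
    ("six 1 TB disks", 6 * TB),
    ("six 2 TB disks", 6 * 2 * TB),
]:
    print(f"{label}: {p_media_error(bytes_read):.1%}")
```

Under this simple model the three cases come out at roughly 0.4 percent, 38 percent and 62 percent. That is somewhat lower than the round figures quoted above, which may rest on different assumptions, but the trend is the same: the risk grows quickly with the amount of data that has to be read during a rebuild.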

"I think what people are becoming aware of is that the data integrity provided by RAID 5 is basically no longer sufficient," says Robin Harris, senior analyst at Data Mobility Group. "RAID 5 will only protect across a single disk failure, so it's going away as a [standalone] data protection strategy."

To address this problem, Panasas invented vertical parity. Essentially, they've added RAID within each disk, by generating a parity sector from the other sectors. The local parity sector can be used to recompute the missing data in case of a media error. According to Panasas, vertical parity gets the error rate down to between one in 10^18 and one in 10^19, which is 10,000 to 100,000 times better than the one-in-10^14 URE rate. The extra parity information uses 10 percent of the disk capacity, but Panasas claims there is no performance hit. So scalability is built in.
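The article gives no layout details for vertical parity, so the following Python sketch is only one plausible reading of it. It assumes, hypothetically, 512-byte sectors grouped so that nine data sectors share one XOR parity sector on the same disk (10 percent of each group), and it models a media error as a sector that reads back as `None`. All names and constants are invented for the example.

```python
import os

SECTOR_SIZE = 512       # assumed sector size in bytes
DATA_PER_GROUP = 9      # assumed: 9 data sectors share 1 parity sector (10% of the group)


def xor_sectors(sectors):
    """XOR equal-length sectors together byte by byte."""
    out = bytearray(SECTOR_SIZE)
    for sector in sectors:
        for i, byte in enumerate(sector):
            out[i] ^= byte
    return bytes(out)


def write_group(data_sectors):
    """Return the on-disk group: the data sectors followed by their parity sector."""
    assert len(data_sectors) == DATA_PER_GROUP
    return list(data_sectors) + [xor_sectors(data_sectors)]


def read_group(group):
    """Return the data sectors, repairing at most one unreadable (None) sector."""
    bad = [i for i, s in enumerate(group) if s is None]
    if len(bad) > 1:
        raise IOError("more than one unreadable sector; vertical parity cannot repair")
    if bad:
        group = list(group)
        group[bad[0]] = xor_sectors([s for s in group if s is not None])
    return group[:DATA_PER_GROUP]


data = [os.urandom(SECTOR_SIZE) for _ in range(DATA_PER_GROUP)]
on_disk = write_group(data)
on_disk[4] = None  # simulate a media error on one sector
print(read_group(on_disk) == data)  # prints True
```

In this sketch the repair happens inside a single disk, so a media error hit during a RAID 5 rebuild would not by itself force a restore from tape.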

A word here should be said about RAID 6 technology (also known as double parity), which some vendors use for an additional level of data protection. This scheme was designed to guard against a double disk failure, which it does. Sort of. The problem is that RAID 6 doesn't protect against subsequent media errors after the second disk goes down, which, as discussed above, are becoming increasingly likely. Here, it has the same problem as RAID 5. However, RAID 6 can be used to recover from the single disk failure plus media error scenario. But the performance hit for dual parity compared to single parity is significant. So it's a mixed bag and doesn't directly address the media error problem.

On top of its horizontal and vertical parity schemes, Panasas has added an additional layer of network parity protection. At this level, parity checking is done on the client side, to make sure the data delivered by the storage system wasn't corrupted on its way to the user. Because of increasing I/O bandwidth and the number of hardware and software components between the external data and the application, there are increasing opportunities for good data to go bad. Firmware, server hardware, server software, network components and transmission media can all potentially mangle valid data unbeknownst to the application. With network parity, the client receives an error notification when bad data is detected.
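A minimal Python sketch of what such a client-side check might look like follows. It assumes the client receives the data units of a stripe together with their parity unit and simply recomputes the XOR; the wire format, `verify_stripe` and `CorruptDataError` are all hypothetical and are not taken from Panasas' software.

```python
from functools import reduce


class CorruptDataError(Exception):
    """Raised when the recomputed parity does not match the parity received."""


def xor_units(units):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))


def verify_stripe(data_units, parity_unit):
    """Client-side check: recompute parity over the received data and compare."""
    if xor_units(data_units) != parity_unit:
        raise CorruptDataError("parity mismatch: data was altered after parity was computed")
    return b"".join(data_units)


# Hypothetical stripe as it left the storage nodes.
sent = [b"node", b"data", b"here"]
parity = xor_units(sent)

print(verify_stripe(sent, parity))  # prints b'nodedatahere'

# Flip one bit in transit and check that the client notices.
received = [sent[0], bytes([sent[1][0] ^ 0x01]) + sent[1][1:], sent[2]]
try:
    verify_stripe(received, parity)
except CorruptDataError as err:
    print("client notified:", err)
```

The point of doing the check at the client is that it covers every component between the disks and the application, rather than just the disks themselves.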

The tiered parity technology will be included in the next version of Panasas' ActiveScale operating environment, version 3.2. The beta will be out next month, and the release will be generally available by the end of the year. The additional parity levels can be turned off if the user believes they're not needed for a particular environment. According to Panasas, the tiered parity technology doesn't exact a performance hit on top of the existing RAID 5 implementation, but, as stated above, the vertical scheme does eat an additional 10 percent of the storage -- that's in addition to the 10 percent used by the RAID 5 implementation.

Although the overall concepts of the three-tiered architecture are fairly general, Panasas is attempting to protect its new invention. "We actually have a patent pending on this tiered parity concept, particularly the vertical parity," says Larry Jones, VP of Marketing at Panasas. "Could someone copy it? Who knows? But we are trying to protect this specific idea."

