Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them
April 21, 2010
HPC storage vendor Panasas has launched PAS HC, a high capacity HPC storage solution for active archives and massive primary storage. The offering is aimed at organizations whose active data is expanding into the petabyte realm, outgrowing their traditional NAS or direct attached storage setups.
PAS HC is just the latest offering in Panasas' NAS storage line. PAS, by the way, is the company's new shorthand for all the Panasas ActiveStor lines, although, in this case, the architecture of PAS HC is quite different from that of the existing PAS 7, 8, and 9 series products. The latter are blade-based servers that use SATA drives, while the HC is implemented as a rackmount Nehalem-based storage system filled with Fibre Channel JBODs -- a first for Panasas.
The glue that ties all the PAS products together is Panasas' home-grown parallel file system, PanFS. The idea is to abstract file access from the underlying hardware, and, perhaps more importantly, from the function of the different storage pools. With PanFS, a single global namespace can be created that spans multiple Panasas storage silos. So, for example, you can have very high performance storage for scratch space, robust NFS storage for home directories, and low-cost, scalable storage for archives, all under one file system.
"It's a single mount point," explains Larry Jones, vice president of marketing at Panasas. "All you have to do to move between a scratch space where you're running a particular experiment and the archives where you have last year's experiments is to change directories."
In the past, less active data (like the raw data for a simulation) often got dumped to tape after it was initially processed. But workflows are often circular rather than linear, so for many applications, the raw data need to be continually accessible. A lot of technical computing projects have gotten so large that it's not feasible to restore from tape every time a user wants to re-crunch the numbers. For petascale datasets, a restore could take days or even weeks. Dumping data to cheap disk-based RAID devices is no solution either. The capacity of such systems is too small to act as a unified pool, so users end up creating multiple archives as disks fill up.
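To put the tape-restore delay in perspective, here is a rough back-of-the-envelope sketch in Python. The drive count and per-drive streaming rate are illustrative assumptions, not figures from Panasas or from any particular tape library, and the model ignores mount, seek, and robot time, so real restores would likely be slower.

```python
def restore_days(dataset_tb, drives, mb_per_sec_per_drive):
    """Estimate the days needed to stream a dataset back from tape.

    Uses decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes) and
    assumes every drive streams at its full rate the whole time.
    """
    total_bytes = dataset_tb * 10**12
    bytes_per_sec = drives * mb_per_sec_per_drive * 10**6
    return total_bytes / bytes_per_sec / 86_400  # 86,400 seconds per day


if __name__ == "__main__":
    # Hypothetical setup: a 1 PB dataset, 10 drives at 120 MB/sec each.
    days = restore_days(dataset_tb=1000, drives=10, mb_per_sec_per_drive=120)
    print(f"Estimated restore time: {days:.1f} days")
```

Under those assumed numbers, a full petabyte takes more than a week to come back, which is in line with the days-to-weeks range cited above.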
For example, an oil and gas company might process 100-300 terabytes of seismic data, and turn it into 20 to 40 TB that is subsequently funneled into seismic interpreters. This process is repeated for multiple seismic datasets. Ideally, what these companies want to do is keep both the raw and intermediate data around so they have the flexibility to rerun the models with different parameters. In workflows like this, the advantage of having all the data under a global namespace becomes apparent.
Panasas says the PAS HC solves the too-slow and too-fragmented dilemmas by marrying PanFS and ActiveStor-level performance with the lower cost and denser capacity of JBODs. A single HC RAID controller can deliver 5 GB/sec while reading and 3 GB/sec while writing. (Those numbers reflect file I/O, not block I/O.) Each 4U JBOD enclosure has room for 60 drives. At 2 TB per drive, that works out to 120 TB per shelf and a maximum of 960 TB per rack.
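The capacity figures can be checked with a few lines of Python. The drive count, drive size, shelf height, and rack maximum come from the paragraph above; the number of shelves per rack is derived here from those figures and is not something Panasas states directly.

```python
# Figures quoted in the article.
DRIVES_PER_SHELF = 60
TB_PER_DRIVE = 2
SHELF_HEIGHT_U = 4
MAX_RACK_TB = 960


def shelf_capacity_tb(drives=DRIVES_PER_SHELF, tb_per_drive=TB_PER_DRIVE):
    """Raw capacity of one JBOD shelf in TB."""
    return drives * tb_per_drive


def implied_shelves_per_rack(max_rack_tb=MAX_RACK_TB):
    """Shelves needed to reach the quoted rack maximum (derived value)."""
    return max_rack_tb // shelf_capacity_tb()


if __name__ == "__main__":
    shelves = implied_shelves_per_rack()
    print(f"Per shelf: {shelf_capacity_tb()} TB")
    print(f"Implied shelves per rack: {shelves}")
    print(f"Rack units used by JBODs: {shelves * SHELF_HEIGHT_U}U")
```

The quoted rack maximum implies eight shelves, or 32U of JBOD enclosures. Presumably the remaining rack space goes to controllers and networking, though the article does not break that down.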
Because of the storage density, Panasas says it can offer an HC setup at about a dollar per GB, or even less when discounted through a channel partner. That's certainly a lot more expensive than tape storage, but for a high-performance disk-based set-up, fronted by a parallel file system, it's relatively cheap.
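As a rough illustration of what "about a dollar per GB" means at these capacities, the sketch below prices the 120 TB shelf and 960 TB rack mentioned earlier. It assumes decimal units (1 TB = 1,000 GB) and a flat $1.00 per GB; actual Panasas pricing is only described as approximate, so treat the output as an order-of-magnitude estimate.

```python
def approx_cost_usd(capacity_tb, usd_per_gb=1.0):
    """Approximate cost of a given capacity at a flat per-GB price.

    Assumes decimal units: 1 TB = 1,000 GB. The default price is the
    article's "about a dollar per GB", not an exact list price.
    """
    return capacity_tb * 1000 * usd_per_gb


if __name__ == "__main__":
    for label, capacity_tb in (("shelf", 120), ("rack", 960)):
        print(f"One {label} ({capacity_tb} TB): ~${approx_cost_usd(capacity_tb):,.0f}")
```

At that rate, a fully populated rack lands just under the million-dollar mark before any channel discount.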
Panasas has already sold a few PAS HC systems. One is destined for an oil and gas firm and another for an unnamed government agency. The only public deployment to date is the system being installed at Los Alamos National Laboratory to support its nuclear security mission.
The PAS HC at Los Alamos will be used in conjunction with the Cray "Cielo" Baker-class super being installed in the second half of 2010. According to Jones, though, 6 PB of HC storage is already in place, with plans to expand to 12 PB. Performance will top out at 160 GB/sec of aggregate throughput. The new system is intended for active archiving, primary storage, and a scratch file area for checkpointing Cielo runs. The Panasas gear joins 2 PB of other ActiveStor storage (mostly PAS 8) at the lab.
The PAS HC release comes on the heels of a good year, revenue-wise, for Panasas. Despite the recession, the company recorded 25 percent year-over-year growth in 2009, with Q4 being the best quarter in the company's history. The first quarter of 2010 started with 46 percent year-over-year growth. With storage demand seemingly insatiable, Panasas is hoping PAS HC keeps that momentum going.