Panasas Offers Petascale NAS
HPC storage vendor Panasas has launched PAS HC, a high-capacity HPC storage solution for active archives and massive primary storage. The offering is aimed at organizations whose active data is expanding into the petabyte realm, outgrowing their traditional NAS or direct-attached storage setups.
PAS HC is just the latest offering in Panasas’ NAS storage line. PAS, by the way, is the company’s new shorthand for all the Panasas ActiveStor lines, although, in this case, the architecture of PAS HC is quite different from that of the existing PAS 7, 8, and 9 series products. The latter are blade-based servers that use SATA drives, while the HC is implemented as a rackmount Nehalem-based storage system filled with Fibre Channel JBODs — a first for Panasas.
The glue that ties all the PAS lines together is Panasas’ home-grown parallel file system, PanFS. The idea is to abstract file access from the underlying hardware, and, perhaps more importantly, from the function of the different storage pools. With PanFS, a single global namespace can be created that spans multiple Panasas storage silos. So, for example, you can have very high performance storage for scratch space, robust NFS storage for home directories, and low-cost, scalable storage for archives, all under one file system.
“It’s a single mount point,” explains Larry Jones, vice president of marketing at Panasas. “All you have to do to move between a scratch space where you’re running a particular experiment and the archives where you have last year’s experiments is to change directories.”
In the past, less active data (like the raw data for a simulation) often got dumped to tape after it was initially processed. But workflows are often circular rather than linear, so for many applications, the raw data need to be continually accessible. A lot of technical computing projects have gotten so large that it’s not feasible to restore from tape every time a user wants to re-crunch the numbers. For petascale-sized datasets, this could take days or even weeks. Dumping data to cheap disk-based RAID devices is no solution either. The capacity of such systems is too small to act as a unified pool, so users end up creating multiple archives as disks fill up.
For example, an oil and gas company might process 100-300 terabytes of seismic data, and turn it into 20 to 40 TB that is subsequently funneled into seismic interpreters. This process is repeated for multiple seismic datasets. Ideally, what these companies want to do is keep both the raw and intermediate data around so they have the flexibility to rerun the models with different parameters. In workflows like this, the advantage of having all the data under a global namespace becomes apparent.
Panasas says the PAS HC solves the too-slow and too-fragmented dilemmas by marrying PanFS and ActiveStor-level performance with the lower cost and denser capacity of JBODs. A single HC RAID controller can deliver 5 GB/sec while reading and 3 GB/sec while writing. (Those numbers reflect file I/O, not block I/O.) Each 4U JBOD enclosure has room for 60 drives. At 2 TB per drive, that works out to 120 TB per shelf and a maximum of 960 TB per rack.
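The density figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the shelves-per-rack count is not stated by Panasas and is simply what those numbers imply.

```python
# Figures quoted in the article.
DRIVES_PER_SHELF = 60    # drives in one 4U JBOD enclosure
TB_PER_DRIVE = 2         # capacity of each drive, in TB
SHELF_HEIGHT_U = 4       # rack units per enclosure
RACK_CAPACITY_TB = 960   # quoted maximum capacity per rack

# Capacity of one shelf: 60 drives x 2 TB.
shelf_capacity_tb = DRIVES_PER_SHELF * TB_PER_DRIVE

# Derived, not quoted: how many shelves the rack maximum implies,
# and how many rack units those shelves occupy.
shelves_per_rack = RACK_CAPACITY_TB // shelf_capacity_tb
rack_units_used = shelves_per_rack * SHELF_HEIGHT_U

print(f"Per shelf: {shelf_capacity_tb} TB")
print(f"Shelves per rack (implied): {shelves_per_rack}")
print(f"Rack units used by JBODs (implied): {rack_units_used}U")
```

By this reckoning, a full rack holds eight JBOD shelves occupying 32U, which would leave the remainder of a standard rack for controllers and networking.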
Because of the storage density, Panasas says it can offer an HC setup at about a dollar per GB, or even less when discounted through a channel partner. That’s certainly a lot more expensive than tape storage, but for a high-performance disk-based setup, fronted by a parallel file system, it’s relatively cheap.
Panasas has already sold a few PAS HC systems. One is destined for an oil and gas firm and another for an unnamed government agency. The only public deployment to date is the system being installed at Los Alamos National Laboratory to support its nuclear security mission.
The PAS HC at Los Alamos will be used in conjunction with the Cray “Cielo” Baker-class super being installed in the second half of 2010. According to Jones, though, 6 PB of HC storage is already in place, with plans to expand to 12 PB. Performance will top out at 160 GB/sec of aggregate throughput. The new system is intended for active archiving and primary storage, as well as a scratch file area for checkpointing Cielo runs. The Panasas gear joins 2 PB of other ActiveStor storage (mostly PAS 8) at the lab.
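To put the Los Alamos numbers in perspective, the rough sketch below estimates how long one full pass over the stored data would take at the quoted peak rate. It assumes decimal units (1 PB = 1,000,000 GB) and that the 160 GB/sec aggregate figure could be sustained for the whole pass, which a real workload is unlikely to achieve, so treat the results as best-case lower bounds. The function name `sweep_hours` is illustrative only.

```python
def sweep_hours(capacity_pb: float, throughput_gb_per_sec: float) -> float:
    """Best-case hours to read a dataset once at a sustained aggregate rate.

    Assumes decimal units: 1 PB = 1,000,000 GB.
    """
    capacity_gb = capacity_pb * 1_000_000
    seconds = capacity_gb / throughput_gb_per_sec
    return seconds / 3600


PEAK_GB_PER_SEC = 160  # aggregate throughput quoted for the Los Alamos system

# 6 PB is the capacity already in place; 12 PB is the planned expansion.
for capacity_pb in (6, 12):
    hours = sweep_hours(capacity_pb, PEAK_GB_PER_SEC)
    print(f"{capacity_pb} PB at {PEAK_GB_PER_SEC} GB/sec: about {hours:.1f} hours")
```

Under those assumptions, the installed 6 PB could be swept in roughly ten hours and the planned 12 PB in about 21 hours.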
The PAS HC release comes on the heels of a good year, revenue-wise, for Panasas. Despite the recession, the company recorded 25 percent year-over-year growth in 2009, with Q4 being the best quarter in the company’s history. The first quarter of 2010 started with 46 percent year-over-year growth. With storage demand seemingly insatiable, Panasas is hoping PAS HC keeps that momentum going.