Like many other storage companies with roots in HPC, Panasas is leveraging its history in some of the most demanding environments to bridge the divide between technical and commercial computing.
According to the company’s Geoffrey Noer, just three years ago, most Panasas customers were in traditional HPC, spread across a large number of users in academia and government. The leap from HPC to the enterprise happened naturally for the company, he says, as hybrid scale-out NAS has taken root in more commercial HPC environments where legacy-based approaches are increasingly overextended and difficult to manage.
Among the new commercial HPC and analytics users Panasas has managed to capture are companies in aerospace, life sciences, and media/entertainment. “These are usually design and simulation workflows,” says Noer, “which by definition is HPC, but it’s for enterprise customers.” These newer users are seeking to overcome critical barriers that a truly scale-out architecture can remove. Now, with today’s release of PanFS 5.5, the company’s updated storage operating system, Panasas can provide a single namespace that lets users in Windows-heavy enterprise shops work across Windows and Linux seamlessly. This Microsoft tie-in is the result of two years of development to get the two to play nicely together within the Panasas storage environment and to ensure continued certification through Microsoft’s Communication Protocol Program. Such development sounds rather expensive, but Panasas says that there are no plans to change pricing to reflect the extra Microsoft hoop-jumping.
According to Noer, the lengthy process through Microsoft’s channels will be useful for both the company’s traditional HPC center users and the enterprise customers it is seeking to reach. “If you look at a large cluster, it’s running Linux for the ultra high performance part, but if you look at what an engineer or researcher is running on a workstation, or they’re working with multiple applications on Windows or Linux, this becomes very important.”
Panasas is being realistic about the performance issues related to Windows for commercial HPC customers, noting that even with the new windows opened by the PanFS 5.5 update, the highest-performance workflows stay in a Linux environment. “The current Windows protocol can’t hit the performance levels of our DirectFlow protocol in Linux, but that’s inherent to the protocol itself,” said Noer. The key is that the interoperability is “enterprise-grade,” which to Panasas means that the handshaking between Active Directory and the storage system to keep track of users and groups has to be seamless and up to Microsoft standards.
These new Windows-driven opportunities open wider with a scale-out NAS approach that does some interesting things in combining SATA drives and SSDs, using each for the purpose it was designed for, via ActiveStor 14, the company’s latest integrated hardware update.
The key to what Panasas is doing on the macro level (with ActiveStor and PanFS in harmony) is taking advantage of an architecture that Noer says was “designed from the ground up for technical computing workloads,” since Panasas was never “hinged on adapting to a legacy architecture.” He cites the widely used NetApp approach in commercial environments as an example of this legacy problem, noting that users are pushed into adding filer heads to boost performance. While this may work, what users end up with are several storage pools that are difficult to manage. “It’s hard for users to get off that architecture and onto one that’s truly scale-out.”
The goal is to give users a platform that’s free from file server lag or hardware RAID slowdowns by instead balancing the I/O over distributed storage elements, a process managed with DirectFlow. This protocol lets users read and write in parallel across all those different elements instead of using older point-to-point protocols that scale simply by adding more clients.
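To make the idea of parallel reads and writes across distributed elements concrete, here is a minimal sketch in Python. It is not DirectFlow and says nothing about how Panasas implements its protocol; the `StorageNode` class, the tiny `STRIPE_SIZE`, and the round-robin placement are all assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe unit; tiny on purpose (illustrative assumption)


class StorageNode:
    """Hypothetical in-memory stand-in for one storage element."""

    def __init__(self):
        self.chunks = {}

    def write(self, index, chunk):
        self.chunks[index] = chunk

    def read(self, index):
        return self.chunks[index]


def split_into_stripes(data, stripe_size=STRIPE_SIZE):
    """Cut a byte string into fixed-size stripe units."""
    return [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]


def parallel_write(nodes, data):
    """Write stripe units round-robin across all nodes at the same time."""
    stripes = split_into_stripes(data)
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [
            pool.submit(nodes[i % len(nodes)].write, i, chunk)
            for i, chunk in enumerate(stripes)
        ]
        for future in futures:
            future.result()  # surface any write error
    return len(stripes)


def parallel_read(nodes, stripe_count):
    """Fetch every stripe unit concurrently and reassemble them in order."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        chunks = pool.map(
            lambda i: nodes[i % len(nodes)].read(i), range(stripe_count)
        )
        return b"".join(chunks)


nodes = [StorageNode() for _ in range(3)]
payload = b"design and simulation data"
count = parallel_write(nodes, payload)
restored = parallel_read(nodes, count)
```

In this toy version no single node sits between the client and the rest of the data, which is the contrast with a point-to-point file server that the article describes.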
The other key to what Panasas is doing is taking metadata requests off the data path used for large reads and writes and handling them directly on the “Director” blades, which manage those requests while the real grunt work is saved for the storage blades that handle the big read/write demands. The goal is to allow users to scale their metadata performance separately to avoid that I/O conflict. This isn’t entirely new, since Lustre and GPFS manage things essentially the same way, but the difference here, at least according to Noer, is the orchestration at the PanFS level.
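The separation of metadata from bulk data can be sketched in a few lines of Python. This is only an illustration of the general pattern, not PanFS itself: the `DirectorBlade`, `StorageBlade`, and `Client` classes, their methods, and the counters are hypothetical names invented for this example.

```python
class DirectorBlade:
    """Hypothetical metadata service: knows where data lives, never stores it."""

    def __init__(self):
        self.layouts = {}
        self.requests = 0

    def create(self, path, blade_ids):
        self.requests += 1
        self.layouts[path] = list(blade_ids)

    def lookup(self, path):
        self.requests += 1
        return self.layouts[path]


class StorageBlade:
    """Hypothetical data service: moves bytes, knows nothing about file names."""

    def __init__(self):
        self.objects = {}
        self.bytes_moved = 0

    def put(self, key, payload):
        self.bytes_moved += len(payload)
        self.objects[key] = payload

    def get(self, key):
        payload = self.objects[key]
        self.bytes_moved += len(payload)
        return payload


class Client:
    def __init__(self, director, blades):
        self.director = director
        self.blades = blades

    def write(self, path, parts):
        blade_ids = [i % len(self.blades) for i in range(len(parts))]
        self.director.create(path, blade_ids)  # metadata path: one small request
        for n, (blade_id, part) in enumerate(zip(blade_ids, parts)):
            self.blades[blade_id].put((path, n), part)  # data path: bulk bytes

    def read(self, path):
        blade_ids = self.director.lookup(path)  # metadata path
        return b"".join(
            self.blades[b].get((path, n)) for n, b in enumerate(blade_ids)
        )


director = DirectorBlade()
blades = [StorageBlade(), StorageBlade()]
client = Client(director, blades)
client.write("/sim/run1", [b"aaaa", b"bbbb", b"cc"])
result = client.read("/sim/run1")
```

Because the director only answers small layout questions and the blades only move bytes, each side could in principle be scaled on its own, which is the point the paragraph above makes about avoiding I/O conflict.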
“When you look at the HPC space, you had software-only file systems that could provide great performance, but not the kind of reliability, high availability and manageability of something fully integrated. Then you also have the top tier storage vendors who don’t have the performance levels needed for HPC, even if they’re able to provide the enterprise-grade features. We’re trying to do all of that in one place,” says Noer.