High performance computing continues to be one of the fastest growing segments of the IT industry. But there’s a general consensus that the commercial sector is still underutilizing HPC technologies. In some cases, users have deployed HPC but lack the most effective software or hardware for their application. In other cases, users haven’t even made the jump from technical computing workstations to HPC servers.
Over the past four years, the Council on Competitiveness has released a series of studies that point to a gap between HPC use and its potential. Taken as a whole, the reports highlight the most common barriers to greater HPC use: cost, especially software cost, and lack of HPC talent. Thanks to more open source software (and the competitive pricing it engenders) and ever-more powerful hardware, overall system costs are coming down. But the human expertise needed to specify, procure, configure, and operate an HPC setup is still a limiting factor for most businesses. This is especially true of smaller firms, where specialized IT talent can’t be justified.
Even though HPC now uses a lot of commodity solutions derived from open source software and off-the-shelf parts, there is a dizzying array of combinations to deal with. To make sure all the pieces fit together, the software stack — operating system, libraries, drivers, compilers, etc. — must be qualified and optimized for each unique hardware configuration. Even for an experienced system integrator, this is no small task.
It has usually been left to HPC vendors to help fill the expertise gap. Most of the larger OEMs, like IBM and HP, which have deep partnerships across the vendor community, double as system integrators. Even most of the tier 2 OEMs offer some kind of integration capability. But that’s assuming the customer has already made it to the system procurement stage. What’s really needed is someone to give businesses a boost into the HPC ecosystem in the first place, but in a vendor-neutral kind of way.
From what I could glean from a recent conversation with Gilad Shainer, director of technical marketing at Mellanox, that’s what their new HPC Advisory Council is all about. Shainer says his company formed the organization to help bring users and HPC technologies together. Specifically, their mission is to qualify and optimize HPC solutions, give early access to new technology, help develop new solutions through collaboration, find ways to push HPC into new markets, and help develop joint marketing programs.
To do this, they’ve brought the whole cast of HPC characters into the mix: OEMs, chip makers, storage vendors, interconnect suppliers, end users, ISVs, and some key technical people. They’ve already lined up big industry names, including HP, Intel, Microsoft and more than 30 other commercial firms, while the end user members come mostly from research institutions such as national labs or universities. Their full membership is listed on their website, although Shainer says three or four additional groups joined up a few days after the announcement of the Advisory Council last week. Of course, Mellanox is a member too, as well as being the umbrella for the whole organization.
Why Mellanox? As the leading provider of InfiniBand switch silicon and multi-protocol adapters, the company’s products end up connecting a lot of the other pieces of the HPC ecosystem. You’ll find Mellanox switch silicon in everything from IBM’s petaflop Roadrunner supercomputer to generic 10-node clusters. Also, according to Shainer, the company is perceived as a vendor-neutral technology provider. “Many people see Mellanox as the Switzerland of HPC,” he claims. While competitor QLogic and Ethernet switch vendors may not exactly see it that way, certainly Mellanox is in a rather unique position with respect to a lot of HPC providers.
The other part of the equation for Mellanox is that an organization like this can help drive adoption of InfiniBand, a non-mainstream technology compared to the ubiquitous Ethernet. A large part of the challenge for HPC newbies is coming up with a software stack that makes InfiniBand interoperable with their application and hardware setup. Since most HPC vendors are now committed to InfiniBand as well, it’s in their interest to help users assemble a consistent software stack that works across a range of applications and hardware configurations. Also, because Mellanox is starting to expand its horizons beyond traditional HPC (into ultra-scale Web applications and enterprise virtualization), anything that makes InfiniBand more palatable to mainstream users will help ease the technology’s entry into the larger enterprise market.
One of the concrete steps the Advisory Council has already taken is the establishment of an HPC cluster center at Mellanox’s U.S. headquarters in Santa Clara, Calif. They’ve gathered equipment and software from council members and constructed a number of AMD- and Intel-based HPC systems for application testing and benchmarking. Users can request time on a cluster, free of charge, and log in remotely to try out their software or even use it for application development. ISVs can also use the systems for software development and testing. Although membership in the Advisory Council is free, members are obligated to support the cluster center. According to Shainer, there are five systems (complete with attached storage and file systems) currently running at the facility, with more on the way.
Another piece of the Advisory Council’s work is the “network of knowledge,” a mailing list that can be accessed by end users to get technical support. For example, a prospective customer may have a particular application to host and want to know the types of systems that would be most appropriate. Or a current user may be experiencing a storage bottleneck with their setup and looking for a solution. The mailing list is sent out to a community of technical experts, who are provided by each participating Advisory Council member (and even some from non-participating members). Obviously, vendors are interested in attracting new customers and keeping the current ones happy, so participation in the mailing list seems to have some built-in motivation.
The second part of the Advisory Council’s work is to engage in outreach. Here they’re specifically targeting technical computing users who could move from workstations up to HPC clusters. As mentioned before, because of the lack of HPC expertise, a lot of these users are unable to make the leap from the desktop to the HPC server on their own. This may even be the case where small businesses are supplying components to larger firms that use HPC systematically. For example, an aerospace company may be designing aircraft with HPC simulation, but the widgets that go into it are being developed on PCs.
The Advisory Council’s approach will be to use member ISVs like Wolfram Research (makers of Mathematica) to reach into their desktop user community and show users a path to HPC server-based systems. This includes providing a recipe for scaling end user applications to parallel platforms and hands-on support. In the past, ISVs may have been motivated to move customers up the food chain on their own, but without support from the larger HPC ecosystem, it was difficult for the software vendors to do this.
Between the OEMs, the ISVs, component makers, and other members, Mellanox seems to have assembled a critical mass of people, although it remains to be seen whether the Advisory Council represents the right formula to catalyze HPC adoption. Multi-vendor partnerships get announced all the time, and most don’t amount to much in the long term. But if Mellanox has managed to pull together an effective advisory group, it will be yet another “interconnect” success story for the company.