June 24, 2013

Lustre Founder Spots Haskell on HPC Horizon

Nicole Hemsoth

As a small 20-person company, Boulder-based Parallel Scientific flies just under the mainstream radar, but for those who have been in the HPC community for a number of years, its CEO and Chief Architect Peter Braam is a recognizable name.

Braam’s major contribution to high performance computing began in 1999 at Carnegie Mellon, where he worked on the file system architecture that became Lustre. Encouraged by that progress, Braam spun the project out as a business, founding Cluster File Systems, which was acquired by Sun in 2007. A year later, Braam let the Sun set and embarked on a new adventure with Parallel Scientific.

The company has a rather interesting business model–instead of focusing on a particular problem area (outside of the general purview of tailoring environments for large-scale HPC deployments), they are letting the breeze carry them. For instance, they have found themselves focusing on parallel Haskell in the last few years, even though Braam says that such development might just be the shell for something much larger or entirely different as they continue.

The saying goes that in Haskell, the function is a first-class citizen–and this status might make it a solid fit for a range of high performance computing environments. Despite what Braam admits is a daunting learning curve, there is an open field of possibilities for Haskell to infiltrate HPC. As it stands, there is an active community around it, with around 5,000 open source libraries and tools available. But the real value for high performance computing, he argues, lies in Haskell’s productivity and correctness–a worth that’s been validated in select industrial use cases.
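For readers new to the language, here is a minimal sketch of what "first-class" means in practice: functions can be passed as arguments, stored in lists, and combined like any other value. The names `applyTwice`, `pipeline`, and `runPipeline` are illustrative only and do not come from Braam or Parallel Scientific.

```haskell
-- A function that takes another function as an argument
-- and applies it twice.
applyTwice :: (a -> a) -> a -> a
applyTwice f = f . f

-- Functions stored in an ordinary list, like any other value.
pipeline :: [Double -> Double]
pipeline = [(* 2), (+ 1), sqrt]

-- Apply a list of functions to a value, left to right.
runPipeline :: [a -> a] -> a -> a
runPipeline fs x = foldl (\acc f -> f acc) x fs

main :: IO ()
main = do
  print (applyTwice (+ 3) 10)      -- 10 + 3 + 3 = 16
  print (runPipeline pipeline 12)  -- sqrt (12 * 2 + 1) = 5.0
```

Because the compiler checks the type of every function in `pipeline`, a stage that returned the wrong kind of value would be rejected before the program ever ran, which is a small-scale version of the correctness argument made above.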

Arguably, Google and Facebook have brought more attention to Haskell in recent years, but there are a number of other notable uses that highlight Braam’s confidence in the functional language. For instance, Chicago-based Allston Trading, a high frequency trading company, uses Haskell in its trading infrastructure. AT&T is using it in its Network Security group to automate internet abuse complaint processing. Bank of America is using it in its backend data transformation and loading system, and Credit Suisse’s Global Modeling and Analytics Group has been using it since 2006 to improve modeler productivity and open access to those models across the organization.

Biotech giant Amgen also uses Haskell for math-heavy models and to “break developers out of their development rut by giving them a new way to think about software.” According to the company’s David Balaban, “Our experience is that using functional programming reduces the critical conceptual distance between thought/algorithms design and code.” But the real value, says Balaban, is the level of correctness they’ve been able to achieve.

As Balaban puts it, “we have been able to develop code quickly and verify–to an applied mathematician’s satisfaction–the correctness of Haskell code straightforwardly; we have yet to achieve this with more traditional mainstream languages.”

And indeed, there are plenty of mainstream languages that would seem to fit the bill for mathematical models without the steep climb. Braam says that while R might seem like the most practical language for users like Amgen and the others noted above, there are opportunities for error in R that Haskell won’t allow. When correctness is key–as it is in all of the above use cases–Haskell’s learning curve is a price worth paying. In fact, he laughingly admits that their business could eventually turn to just building a language that is “safer” than R on the correctness front.

When it comes to the uptake of a language like Haskell, with its learning curve and a feature set that is not geared toward end users, Braam says it’s a matter of time. He points to languages like Python, which took a decade or more to wind a path through large-scale environments, HPC and commercial alike. Still, he notes, “you’d be surprised how many companies have a ‘secret’ Haskell department–a group of people that are highly productive dedicated to solving a serious problem. This is especially true in the financial world–mostly because of the correctness element.”

And true to that point, a look across the Haskell user community suggests that few are using it as an end user tool. Instead, they want to construct robust software infrastructure that lets their own users maintain creativity and take advantage of domain-specific languages, which Braam says are not as difficult as one might think to implement in Haskell. He points to the example of a Square Kilometer Array (SKA) research group, which wants the right set of primitives built so that its software can run on different hybrid platforms as the hardware changes–an area that his company can help with.
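To give a sense of why embedding a domain-specific language in Haskell is considered approachable, the following is a minimal sketch, not anything from the SKA work or Parallel Scientific. It assumes a hypothetical set of three array primitives (`Scale`, `Offset`, `Add`) and shows one program being handed to two different "backends": a reference interpreter and a text renderer standing in for a code generator.

```haskell
-- Hypothetical primitives for a tiny array-processing DSL.
data Expr
  = Input               -- the input vector
  | Scale Double Expr   -- multiply every element by a constant
  | Offset Double Expr  -- add a constant to every element
  | Add Expr Expr       -- element-wise sum of two results
  deriving Show

-- Backend 1: a reference interpreter over plain lists.
eval :: [Double] -> Expr -> [Double]
eval xs Input        = xs
eval xs (Scale k e)  = map (* k) (eval xs e)
eval xs (Offset k e) = map (+ k) (eval xs e)
eval xs (Add a b)    = zipWith (+) (eval xs a) (eval xs b)

-- Backend 2: render the same program as text, standing in
-- for a code generator that targets different hardware.
render :: Expr -> String
render Input        = "x"
render (Scale k e)  = "(" ++ show k ++ " * " ++ render e ++ ")"
render (Offset k e) = "(" ++ render e ++ " + " ++ show k ++ ")"
render (Add a b)    = "(" ++ render a ++ " + " ++ render b ++ ")"

-- One program, written once against the primitives.
program :: Expr
program = Add (Scale 2 Input) (Offset 1 Input)

main :: IO ()
main = do
  print (eval [1, 2, 3] program)  -- [4.0,7.0,10.0]
  putStrLn (render program)       -- ((2.0 * x) + (x + 1.0))
```

The design point is that `program` never mentions how it will be executed. Supporting a new platform means writing another function over `Expr`, while programs written against the primitives stay unchanged.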

The idea for finding a wider market for parallel Haskell spun from work Braam and colleagues did around an XSTACK proposal, which sought programming environments for exascale computers. The team put an emphasis on automatic parallelization via compilers–a trick that Braam said plenty outside of the HPC community (Facebook, for example) have perfected. It just hasn’t caught on in HPC, although he said the same Erlang and Haskell tricks that have worked at that scale can automate parallelism across a multitude of HPC systems.

His team is also hard at work perfecting their “Awesome Haskell FPGA Compiler,” which lets developers express hardware solutions in a high-level domain-specific language in much the same way their work with SKA is allowing. The environment would allow software simulation and testing in an interactive setting to wick away the long development times that are the bane of FPGA design. The resulting solution can then be kicked to the overall environment, ready to run in some unique data-intensive areas, like SKA.

Braam is a Haskell believer in the same way he believed in his pioneering work on Lustre. “People said you’re wasting your time,” he reminisced. While the use cases may be small, as big data and more complex models drive further into both research and enterprise settings, the appeal of a functional language that emphasizes correctness and productivity will reveal itself.