HPC Startup Takes a Shine to Lustre

By Michael Feldman

July 29, 2010

Lustre, the much-beloved open-source file system technology used by many of the top supercomputers in the world, has a new friend. Actually a whole new company. Whamcloud, a venture-funded startup based in upscale Danville, California, came out of hiding on Wednesday and announced its intentions to help carry the Lustre torch forward on Linux.

Right now Lustre could use a champion. The technology has been passed around a lot since it was originally developed in 1999 by Peter Braam at Carnegie Mellon University. Braam later founded Cluster File Systems (CFS), which released Lustre 1.0 in 2003. Sun Microsystems acquired the technology, along with the CFS engineers, in 2007. Of course, by then, Sun was a sinking ship, leading to Oracle’s acquisition of the company in 2010, with Lustre in tow.

That’s when the HPC community started getting nervous. Oracle was never an HPC organization, and from all outward signs (or lack thereof), is not likely to become one. The company has apparently maintained a Lustre team, however, and plans to continue hosting the software for the open source Lustre community. But paid support for Lustre 2.0 will be limited to Oracle systems only. Worse yet, it looks like ZFS (an advanced 128-bit file system developed by Sun) will not be ported to Linux, leaving Lustre to rely on the OS’s less-capable extended (ext) file system technology.
 
Enter Whamcloud. The company intends to step into the void left by Oracle and advance the Lustre technology for high performance computing, giving some hope that the file system technology has a viable future in supercomputing — and perhaps elsewhere. “High performance computing is suffering a little bit right now,” says Whamcloud CEO Brent Gorda. “There are always performance bottlenecks everywhere, but the file system is a critical one that is the Achilles Heel in many cases.”

I got a chance to talk with the new CEO about the company’s plans and his expectations for the business. Gorda, who up until a couple of weeks ago was Deputy for Advanced Technology Projects at the Lawrence Livermore National Laboratory (LLNL), has managed to attract a couple of other well-known Lustre true-believers to the Whamcloud venture. Eric Barton, a lead engineer in the Lustre group at Oracle, is now Whamcloud’s CTO; and Robert Read, who led the Lustre 2.0 project at Oracle, has signed on as the principal engineer.

According to Gorda, Whamcloud’s near-term plans are to take the lead in developing the Lustre code base for the Linux platform. His experience at LLNL, an early adopter and supporter of Lustre, should come in handy in this regard. The big machines at many Department of Energy (DOE) and supercomputing centers enthusiastically employ the open-source file system today. Currently, Lustre is used in 15 of the top 30 supercomputers in the world, and about half of all the top 500 systems. Because of the file system’s popularity at the DOE and NSF centers, Gorda believes they will be able to do contract Lustre work for the government labs, which are committed to using the technology on their big supercomputers, at least for the foreseeable future.

Gorda believes the software they intend to develop can live peaceably with the rest of the Lustre code that Oracle is developing for its commercial needs. He says they have no intention of forking the Lustre code base and do not want to get into a wrestling match with Oracle (and would discourage anyone else from doing so as well). “We will absolutely cooperate with Oracle and will do the development in such a way that it is beneficial to them and what they want to use Lustre for,” says Gorda. “But we want to make sure that any such development that we do will be in support of high performance computing.”

One immediate problem that Gorda thinks the HPC-Lustre community needs to focus on is the replacement of ZFS (which will come to Lustre, but on Solaris and not Linux). The HPC community had been rallying around ZFS since it represented the next-generation file system technology, offering advanced features like end-to-end data integrity and software RAID. That capability is not available on Linux’s ext technology, even on the latest ext3 and ext4 file systems.

Further out, the Lustre technology will need to segue into exascale computing. Whamcloud won’t be able to do that alone, however. Scaling file system and I/O technology to exascale will take a concerted effort by the whole community. Gorda concedes that parallel file system technology for that level of computing may not even be recognizable as Lustre in 10 years. But he is adamant that the community will want an open source solution, and Lustre is the best starting point available.

The other aspect of Whamcloud is implied in its name. Gorda believes Lustre (and parallel file system technology, in general) has significant application to cloud computing. From his perspective, the cloud is another kind of high-end computing platform that has a strong resemblance to high performance computing, especially in its need for a scalable file system. Gorda admits the company’s strategy is not completely fleshed out yet in regard to this area (he’s only been the CEO for a week), but they have already had some discussions with a few cloud providers to get the ball rolling.

In the meantime, Whamcloud intends to add more staff and build a credible team for the kind of work the company has in its sights. So far, the startup has collected $10 million in venture capital to get the business off the ground, and probably wouldn’t mind attracting some additional funding. “We’re very adamant that the community needs to keep using this technology,” says Gorda, “as well as whatever comes after it.”
