HPC Startup Takes a Shine to Lustre

By Michael Feldman

July 29, 2010

Lustre, the much-beloved open-source file system technology used by many of the top supercomputers in the world, has a new friend. Actually a whole new company. Whamcloud, a venture-funded startup based in upscale Danville, California, came out of hiding on Wednesday and announced its intentions to help carry the Lustre torch forward on Linux.

Right now Lustre could use a champion. The technology has been passed around a lot since it was originally developed in 1999 by Peter Braam at Carnegie Mellon University. Braam later founded Cluster File Systems (CFS), which released Lustre 1.0 in 2003. Sun Microsystems acquired the technology, along with the CFS engineers, in 2007. Of course, by then, Sun was a sinking ship, leading to Oracle’s acquisition of the company in 2010, with Lustre in tow.

That’s when the HPC community started getting nervous. Oracle was never an HPC organization, and from all outward signs (or lack thereof), is not likely to become one. The company has apparently maintained a Lustre team, however, and plans to continue hosting the software for the open source Lustre community. But paid support for Lustre 2.0 will be limited to Oracle systems only. Worse yet, it looks like ZFS (an advanced 128-bit file system developed by Sun) will not be ported to Linux, leaving Lustre to rely on the OS’s less-capable extended (ext) file system technology.
 
Enter Whamcloud. The company intends to step into the void left by Oracle and advance the Lustre technology for high performance computing, giving some hope that the file system technology has a viable future in supercomputing — and perhaps elsewhere. “High performance computing is suffering a little bit right now,” says Whamcloud CEO Brent Gorda. “There are always performance bottlenecks everywhere, but the file system is a critical one that is the Achilles Heel in many cases.”

I got a chance to talk with the new CEO about the company’s plans and his expectations for the business. Gorda, who up until a couple of weeks ago was Deputy for Advanced Technology Projects at the Lawrence Livermore National Laboratory (LLNL), has managed to attract a couple of other well-known Lustre true-believers to the Whamcloud venture. Eric Barton, a lead engineer in the Lustre group at Oracle, is now Whamcloud’s CTO; and Robert Read, who led the Lustre 2.0 project at Oracle, has signed on as the principal engineer.

According to Gorda, Whamcloud’s near-term plans are to take the lead in developing the Lustre code base for the Linux platform. His experience at LLNL, an early adopter and supporter of Lustre, should come in handy in this regard. The big machines at many Department of Energy (DOE) and supercomputing centers enthusiastically employ the open-source file system today. Currently, Lustre is used in 15 of the top 30 supercomputers in the world, and about half of all the top 500 systems. Because of the file system’s popularity at the DOE and NSF centers, Gorda believes they will be able to do contract Lustre work for the government labs, which are committed to using the technology on their big supercomputers, at least for the foreseeable future.

Gorda believes the software they intend to develop can live peaceably with the rest of the Lustre code that Oracle is developing for its commercial needs. He says they have no intention of forking the Lustre code base and do not want to get into a wrestling match with Oracle (and would discourage anyone else from doing so). “We will absolutely cooperate with Oracle and will do the development in such a way that it is beneficial to them and what they want to use Lustre for,” says Gorda. “But we want to make sure that any such development that we do will be in support of high performance computing.”

One immediate problem that Gorda thinks the HPC-Lustre community needs to focus on is the replacement of ZFS (which will come to Lustre, but on Solaris and not Linux). The HPC community had been rallying around ZFS since it represented the next-generation file system technology, offering advanced features like end-to-end data integrity and software RAID. Those capabilities are not available in Linux’s ext technology, even in the latest ext3 and ext4 file systems.

Further out, the Lustre technology will need to segue into exascale computing. Whamcloud won’t be able to do that alone, however. Scaling file system and I/O technology to exascale will take a concerted effort by the whole community. Gorda concedes that parallel file system technology for that level of computing may not even be recognizable as Lustre in 10 years. But he is adamant that the community will want an open source solution, and Lustre is the best starting point available.

The other aspect to Whamcloud is implied in its name. Gorda believes Lustre (and parallel file system technology, in general) has significant application to cloud computing. From his perspective, the cloud is another kind of high-end computing platform that has a strong resemblance to high performance computing, especially in its needs for a scalable file system. Gorda admits the company’s strategy is not completely fleshed out yet in regard to this area (he’s only been the CEO for a week), but they have already had some discussions with a few cloud providers to get the ball rolling.

In the meantime, Whamcloud intends to add more staff and build a credible team for the kind of work the company has in its sights. So far, the startup has collected $10 million in venture capital to get the business off the ground, and probably wouldn’t mind attracting some additional funding. “We’re very adamant that the community needs to keep using this technology,” says Gorda, “as well as whatever comes after it.”

HPCwire