Lustre Gets Backing of Non-Profit Corporation

By Michael Feldman

October 21, 2010

Some of the most prominent organizations in the HPC community have joined together to bootstrap a non-profit corporation devoted to scalable file system technologies. On Tuesday, Cray, DataDirect Networks, Lawrence Livermore National Laboratory (LLNL) and Oak Ridge National Laboratory (ORNL) announced the incorporation of Open Scalable File Systems, Inc. (OpenSFS). The newly hatched group has cast itself as the focal point for development of Lustre and other open source file system technologies aimed at high performance computing.

According to OpenSFS CEO Norman Morse, the organization’s mission is to bring together the stakeholders for high-end scalable file systems and provide a formal structure for moving the associated software forward. Today that effort will focus on Lustre, the open source parallel file system that grew up in HPC. The Lustre source repository is currently in the hands of Oracle, which inherited the technology when it acquired Sun Microsystems (which itself had acquired Lustre a year before it got swallowed up). Since Oracle will focus post-Lustre 2.0 development on OpenSolaris and its own database products, Linux-based Lustre for HPC has been left to a disparate group of vendors, research labs, and academic institutions that have a common need to see the technology move forward.

OpenSFS’ role will be to gather requirements from HPC stakeholders, prioritize them, and then fund the efforts to implement them. “We’ll develop feature sets that are important for the entire community, within the context of OpenSFS, and then over time those feature sets will make their way back into the canonical Lustre release,” explains Galen Shipman, group leader of technology integration at Oak Ridge National Laboratory and OpenSFS board member.

That model is pretty much the same as before, prior to Oracle’s control of the Lustre code. The rationale is to fold all software fixes and enhancements back into the official Lustre source repository, in order to avoid the prospect of multiple (and incompatible) implementations roaming around the ecosystem. “We absolutely refuse to fork the system,” declares Morse. “We intend for Oracle to be the canonical definition of Lustre.”

The initial focus for OpenSFS will be to support and stabilize the current Linux-based Lustre storage systems in production at HPC installations around the world. This is especially critical for the array of US Department of Energy labs, which have very large Lustre storage systems deployed, and even larger ones on the drawing board. The longer-term goal for OpenSFS is to morph Lustre and related parallel file system technologies into something that supports the transition to exascale systems several years down the road.

Requirements for new features will come out of technical working groups organized by OpenSFS, and those enhancements deemed most important will be brought forward as RFPs to the community. As a non-profit entity, OpenSFS won’t be doing the development itself, but vendors who have aggregated Lustre expertise — Whamcloud, Terascala, Xyratex, SGI, Cray, DataDirect Networks, and others — would be likely to bid on these contracts.

Funding for this work will be derived from OpenSFS membership dues, which, depending on your organization’s commitment to this effort, can be quite expensive. There are three different levels: the promoter level costs $500K per year, which buys you a seat on the OpenSFS board; the contributor/adopter level runs $50K and lets you manage a working group; finally, for $5K per year you can become a support member, which allows you to participate in a working group. As you might imagine, the further you go up the membership food chain, the more influence you have over which work gets funded.

Since Lustre development and testing require large-scale computing and storage, support for this OpenSFS-initiated development will be provided by national labs, such as Lawrence Livermore and Oak Ridge, which already have resources in place for this type of work. At LLNL, the Hyperion system is available on the lab’s unclassified network as a test bed for scaling different types of Linux cluster technologies. For the past year, Sun Microsystems (and then Oracle) used the machine for its Lustre 2.0 development. Likewise, Oak Ridge has its own test bed of storage systems from various vendors for developer access. Much of the SMP scalability work for Lustre was developed and tested at ORNL. Other research labs, both in the US and elsewhere, may end up donating their own HPC resources for Lustre development, especially if they’re looking to drive specific file system development for their own programs.

Morse says members are already lining up to join the alliance. According to him, more than 20 organizations (vendors, universities, and government labs) are ready to sign on, although he wouldn’t say at what membership levels. As soon as certain legalities of OpenSFS incorporation are finalized, the organization will begin bringing them aboard. Morse expects to attract in the neighborhood of 50 to 60 organizations.

To help that process along, next month OpenSFS is going to host an introductory meeting about the organization in conjunction with the Supercomputing Conference (SC10) in New Orleans. Although OpenSFS was too late to reserve a session at SC10 proper, the meeting will take place in parallel with the conference festivities. The meeting is tentatively scheduled for Tuesday, November 16 at the Ritz-Carlton. Registration information will soon be available on the OpenSFS website.
