SCIENCE & ENGINEERING NEWS
Champaign, IL – Massive amounts of data, beamed to Earth by satellites and other sensors that are part of NASA’s Global Change Research Program, will be stored and managed with software developed by the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.
A version of NCSA’s Hierarchical Data Format (HDF) – a package of software, libraries, and tools for analyzing, visualizing, and converting scientific data – is already used for data gathered by NASA’s Terra satellite, one of the Earth Observing System (EOS) satellites launched to collect data for the Global Change Research Program. Terra, which was launched in December 1999 and officially began transmitting data last April, uses HDF4. Another EOS satellite, called Aqua, is expected to launch in July 2001 and will also use HDF4. Aura, the third EOS satellite, is expected to launch in 2002 and will rely on a newer version of HDF called HDF5.
“NASA’s commitment to HDF for the EOS sensors dates back to 1993, but the fact that they plan to use HDF5 for the newest EOS satellite is a real boost for this major upgrade of HDF,” said Mike Folk, technical program manager of the HDF group at NCSA. “NASA’s commitment to HDF, and especially their willingness to adopt HDF5, is likely to spur interest from other scientific organizations. It is also our hope that this commitment will spur software vendors to support HDF in their visualization and analysis tools.”
Each EOS satellite will deliver about a terabyte of data per day from an assortment of instruments. The satellites are part of a comprehensive NASA program to study the Earth as an environmental system, an effort that could lead to improved weather forecasts, tools for managing agriculture and forests, and a better understanding of global warming. About 1,000 scientists are direct users of EOS data, and secondary users, ranging from other scientists to schoolchildren, are estimated at 30,000.
HDF is an important component in this massive sharing of scientific data because it allows users to move and share data regardless of the computing platform they use. HDF can handle visual as well as numerical data and lets scientists include descriptive information about the data (metadata) in their datasets. Within HDF files, data can be organized in ways that suit the needs of those who access them. For example, temperature data could be organized so that a scientist can examine it over time or at different levels in the atmosphere.
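As a rough illustration of that idea, the sketch below uses plain Python rather than the HDF library or its file format. It stores hypothetical temperature readings in a grid indexed by time and atmospheric level, with descriptive attributes attached. The class, its method names, and all values are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A toy stand-in for a self-describing dataset (not the HDF API)."""
    name: str
    dims: tuple      # dimension names, e.g. ("time", "level")
    values: list     # nested list indexed as values[time][level]
    attrs: dict = field(default_factory=dict)  # descriptive metadata

    def over_time(self, level_index):
        """Return the readings at one level across all time steps."""
        return [row[level_index] for row in self.values]

    def over_levels(self, time_index):
        """Return the vertical profile at one time step."""
        return list(self.values[time_index])


# Hypothetical values: 3 time steps x 4 pressure levels, in kelvin.
temperature = Dataset(
    name="temperature",
    dims=("time", "level"),
    values=[
        [288.0, 281.5, 268.0, 255.5],
        [289.0, 282.0, 268.5, 255.0],
        [287.5, 281.0, 267.5, 254.5],
    ],
    attrs={"units": "K", "levels_hPa": [1000, 850, 700, 500]},
)

surface_series = temperature.over_time(0)   # one level, all times
profile = temperature.over_levels(1)        # one time, all levels
print(surface_series)
print(profile)
```

The same stored values serve both questions, and the attributes travel with the data so that a later user can tell what the numbers mean.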
“The flexibility of HDF and the ability to look at the data as it relates to other variables makes the data more useful and usable over long periods of time,” said Bruce R. Barkstrom, a researcher at NASA’s Langley Research Center and a principal investigator for the part of the Global Change Research Program called Clouds and the Earth’s Radiant Energy System (CERES). Barkstrom added that HDF and the upgrade to HDF5 “will give us some very substantial benefits as we move downstream with this program.”
NASA first chose NCSA’s HDF as the standard file format for EOS data in 1993 and provided funding to create an EOS implementation of the software, called HDF-EOS. It is the standard format for all EOS peer-reviewed science data products. NCSA is currently working with NASA to implement HDF-EOS 5.0, which will be the standard used to define data from the Aura satellite. Although datasets from the Terra and Aqua satellites may become compatible with HDF5 over time, those first two satellites are already committed to the HDF-EOS implementation that uses HDF4.
“Our goal is to make the new HDF-EOS API as backward compatible as practical,” said Richard Ullman, information architect for the EOS Earth Science Data and Information System (ESDIS). He added that a beta release of HDF-EOS 5 is already available for the Aura science data producers to use in defining their products, and that a stable and more fully tested version of the software is expected before next spring.
NCSA teamed with three Department of Energy (DOE) national laboratories – Lawrence Livermore, Los Alamos, and Sandia – to develop HDF5. The DOE’s Accelerated Strategic Computing Initiative (ASCI) has also adopted HDF5. According to Folk, HDF5 offers several improvements over previous versions: it can handle files of unlimited size; the user can access subsets of data within datasets more efficiently; and the software offers parallel I/O for parallel computing environments, which can greatly speed up the data transfer and storage process.
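The benefit of subset access can be sketched without HDF5 itself. The plain-Python example below assumes a made-up flat file of fixed-size rows, which is not the HDF5 format and does not use the HDF5 API. It shows the general principle only: when the layout is known, one row can be fetched with a seek and a small read instead of loading the whole file.

```python
import os
import struct
import tempfile

ROWS, COLS = 1000, 8
ROW_FORMAT = f"<{COLS}d"                  # one row: 8 little-endian float64 values
ROW_BYTES = struct.calcsize(ROW_FORMAT)   # size of one row on disk


def write_grid(path):
    """Write a ROWS x COLS grid where cell (r, c) holds r * COLS + c."""
    with open(path, "wb") as f:
        for r in range(ROWS):
            row = [float(r * COLS + c) for c in range(COLS)]
            f.write(struct.pack(ROW_FORMAT, *row))


def read_row(path, row):
    """Read a single row by seeking to its byte offset."""
    with open(path, "rb") as f:
        f.seek(row * ROW_BYTES)
        return struct.unpack(ROW_FORMAT, f.read(ROW_BYTES))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "grid.bin")  # hypothetical file name
    write_grid(path)
    file_size = os.path.getsize(path)
    row_500 = read_row(path, 500)         # touches ROW_BYTES bytes, not file_size

print(ROW_BYTES, file_size)
print(row_500)
```

Here a single row is a small fraction of the file, so the read cost depends on the size of the subset rather than the size of the dataset.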
For more information on HDF, see http://hdf.ncsa.uiuc.edu . For more on the NASA EOS satellites, see http://eos.nasa.gov . For information about HDF-EOS, see http://hdfeos.gsfc.nasa.gov .
The National Center for Supercomputing Applications is a leading-edge site for the National Computational Science Alliance. NCSA is a leader in the development and deployment of cutting-edge high-performance computing, networking, and information technologies. The National Science Foundation, the state of Illinois, the University of Illinois, industrial partners, and other federal agencies fund NCSA.
The National Computational Science Alliance is a partnership to prototype an advanced computational infrastructure for the 21st century and includes more than 50 academic, government and industry research partners from across the United States. The Alliance is one of two partnerships funded by the National Science Foundation’s Partnerships for Advanced Computational Infrastructure (PACI) program, and receives cost-sharing at partner institutions. NSF also supports the National Partnership for Advanced Computational Infrastructure (NPACI), led by the San Diego Supercomputer Center.