As part of its resource contributions to the NSF-funded TeraGrid, Indiana University is now making tape-based storage available to users of the TeraGrid via Indiana University's Massive Data Storage System (MDSS), which uses the High Performance Storage System (HPSS) software. Researchers with TeraGrid allocations may store up to one terabyte of data within IU's HPSS system, and data will remain available for one year past the end of the user's allocation.
HPSS is software that manages large volumes of data on disk and in robotic tape libraries, aggregating the capacities of many physical storage devices into a single, virtually infinite file system. Development of HPSS began in 1992 as a collaboration between IBM Global Services and five Department of Energy laboratories to address the growing challenges of capacity, I/O and functionality in massive storage systems. The HPSS architecture enables superb scalability of transfer rates and data capacity, meeting the requirements of national and international academic institutions, government agencies, and other organizations that need to store the largest sets of data currently being collected in a single namespace.
The HPSS system at IU, with a total capacity of more than 2.2 petabytes, is the first and only HPSS installation to implement distributed data movers. Indiana University's installation of HPSS is very unusual in that IU maintains geographically separate data silos in Bloomington and Indianapolis, IN. Users who store data in IU's HPSS system have the option of keeping two copies — one in Bloomington, one in Indianapolis — ensuring that data are stored reliably even in the event of the destruction of one of the machine rooms.
Users at every TeraGrid site can now store and retrieve files in IU's HPSS system via the Hierarchical Storage Interface (HSI). IU's initial usage policy for this resource is to provide TeraGrid users, by default, with up to one terabyte of storage, which represents 500 GB of data safely stored in multiple locations. The availability of IU's HPSS system will be of particular value to the “TeraGrid Wide” strategy, the TeraGrid project's goal of making the TeraGrid widely valuable to a large portion of the nation's research community. Advanced users are expected to find great value in the massive storage available in IU's HPSS system, but perhaps more importantly, the availability of one terabyte of storage will be of value to many researchers throughout the nation who do not have access to a sophisticated archival data storage system at their home institution.
Researchers do not have to compute on the TeraGrid to take advantage of the data storage capabilities that Indiana University provides via the TeraGrid. They may request an allocation through the Developmental Allocation Committee, receive a small allocation of computing time, and simply use their TeraGrid credentials to access the IU HPSS system.
For information on how to get an account on the TeraGrid, go to http://kb.iu.edu/data/anql.html. For detailed instructions on how to access Indiana University's HPSS system via the TeraGrid, see http://kb.iu.edu/data/arux.html. For additional information about HPSS, visit the home page of the HPSS Collaboration at http://www.hpss-collaboration.org. For information about the TeraGrid, see http://www.teragrid.org. For more about high performance computing at IU, visit http://uits.iu.edu/scripts/ose.cgi?amee.help. For more information about IU's contributions to the TeraGrid, see http://iu.teragrid.org/index.html.
About the TeraGrid: IU is one of eight resource partners contributing to the TeraGrid, along with the National Center for Supercomputing Applications, Oak Ridge National Laboratory, Pittsburgh Supercomputing Center, Purdue University, San Diego Supercomputer Center, Texas Advanced Computing Center and Argonne National Laboratory. The TeraGrid combines computational, storage, network and visualization resources from these partner sites to create a tremendous integrated resource to support scientific research.
The TeraGrid was launched by the National Science Foundation in 2001. Indiana University was added as a resource partner in 2003. Through its participation in the TeraGrid, IU plans to continue what it has long done for local researchers: focusing on data-centric science and support for the life sciences.
Scott McCaulay is Indiana University's TeraGrid Site Lead. Thomas Hacker is associate director for research and academic computing at Indiana University. Andrew Arenson is manager of the Distributed Storage Services Group at Indiana University.