The Open Grid Forum (OGF) recently held its Second Workshop on Reliability and Robustness in Grid Computing Systems at OGF19 in Chapel Hill, NC, on January 31, 2007. The workshop, organized through the eScience OGF function, brought together researchers and engineers actively working on Grid computing systems, with the goal of promoting better understanding of reliability issues and requirements. The workshop focused on strategies and techniques for promoting Grid system reliability.
An important area of interest for the workshop was reliable, fault-tolerant Grid system architectures. Presentations were given on a commercial Grid system architecture in which a fail-over strategy is used to ensure service availability and reliability, a Grid system design for monitoring and dynamic reconfiguration in response to component failure, and a fault-tolerant management architecture for Web and OGSA services with scalable overhead costs.
Another area of focus was the impact of interactions among Grid services on Grid reliability. Work was reported on how interconnections between software components emerge to form clusters around key hub components, and on the potential impact of this phenomenon on Grid COTS reliability. Recommendations were also presented on strategies for enhancing the reliability of OGSA implementations in the face of complex service interactions. Copies of presentations are available at http://gridreliability.nist.gov/.
Workshop participants made suggestions on how to promote Grid system reliability, such as developing guidelines to accompany specifications in order to facilitate reliable implementations, and taking steps to ensure that specifications do not inadvertently lead to unreliable implementations. The results of the workshop will be incorporated into an OGF informational document scheduled for publication at the end of this year. The document is intended to serve as a resource for improving standard OGF and Web Service specifications and for enhancing the reliability of industrial Grid implementations. The OGF Reliability and Robustness Research Group is actively soliciting participants to contribute to this effort.
Please contact Christopher Dabrowski ([email protected]) for further information.