It appears the adage that “The more things change, the more things stay the same” is not far off the mark. Distributed computing will henceforth be considered the foundation from which the “newer” technologies of grids, fabrics, and clouds have naturally evolved. This premise will serve as the basis for present and future discussions in this sphere.
Distributed systems can be defined as “a collection of independent computers that appears as a single coherent system” [1]. Advancements in computing and information technology have led to better software, hardware and networks that enable users to do significantly more than in the past. These advances naturally prompt users to push the boundaries for expanded capabilities. A natural outcome of this progress is a broadening of the scope and types of applications that can be addressed, as well as a reduction of the boundaries and limitations on those applications. For example, multi-core programming and multi-processor systems mean that the “distributed” nature of distributed systems is no longer limited to individual, independent computers; distribution can now be observed within a single computing system. Multi-core architectures, through the use of software multithreading, help facilitate better performance, especially in terms of speed [2]. Although the appropriate software is a critical factor in the speedup, it is the proper design of the applications in a threaded environment that is the key to faster processing. We will return to the merits of focusing attention on the applications and their respective designs later.
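To make the point about application design concrete, the following is a minimal Python sketch, not drawn from the cited references. It shows a computation deliberately designed as independent chunks so that it can be handed to a pool of threads. The function names (`partition`, `sum_of_squares`, `parallel_sum_of_squares`) and the choice of four workers are illustrative assumptions. Note also that in standard CPython the global interpreter lock prevents threads from speeding up a CPU-bound loop like this one, so the sketch illustrates the partitioning design rather than a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, parts):
    """Split data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """Independent unit of work: no shared state, so no locking is needed."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Hypothetical helper: map the chunks onto a thread pool, then combine."""
    chunks = partition(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


data = list(range(1000))
serial = sum_of_squares(data)
threaded = parallel_sum_of_squares(data, workers=4)
```

The design choice that matters here is that each chunk can be processed without reference to any other. That independence, rather than the threading library itself, is what allows the work to be spread across cores.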
It is important to note some of the characteristics of distributed systems, including:
• Differences between the various computers and the ways in which they communicate are hidden from users;
• The internal organization of the distributed system is hidden from users;
• Users and applications can interact with a distributed system in a consistent and uniform way, regardless of where and when interactions take place;
• Solutions should be easy to expand or scale [1].
The characteristics noted above are, not too surprisingly, the focus of many of the “new” technologies that have drawn our attention and discussion, e.g., with:
• Grids, the focus is on sharing resources from multiple administrative domains for a common goal;
• Fabrics, the focus is on the integration or connection of nodes of resources to facilitate consolidated computing; and, last but certainly not least,
• Clouds, the focus is on shared resources that are available on an “as-needed” or “on demand” basis.
The author uses the parallels in characteristics noted above to reinforce the view that the overlap is unavoidable between the “newer” concepts and distributed computing because of the inherent nature and promise of distributed computing. However, in spite of the similarities and parallels, there are indeed some very important and interesting distinctions with the “feel” of cloud computing.
The journeys travelled by grids and clouds have taken very different routes. The industry sector lagged behind in the development and implementation of grids and was constantly searching for the “right” business niche and potential ROI to validate any significant investment in the technology. SLAs, accountability and responsibility were always major points of contention in the cooperative sharing that was promoted by grid technologies. For example, defining the entity, person(s), or organization(s) responsible for data (i.e., the caretakers of data within the respective shared environment) was an ongoing issue that always had to be negotiated. No one, for obvious reasons, was eager to be held responsible for data being corrupted, inaccessible, or unsecured, because the resulting IP and legal issues could be incredibly expensive for companies and collaborators to resolve to the mutual agreement of all. Accommodating grids within an overall mainstream, comprehensive strategy has also been a continuous challenge for IT administrators and their user communities.
Of special note is the fact that the industry sector not only got in on the ground floor with the development and implementation of clouds but, more than any other sector, is noted for actually leading the movement. The prospective business niches were inherent for clouds from the beginning and did not have to be developed along the way, as was the case for grids. A major barrier for grids that is notably lower for clouds is the cost structure of the sharing between participants, which was always a challenge in the inherent sharing facilitated by grids. In addition, clouds have the added dimension of variability in their possible structure: they may exist in private, public or hybrid configurations, an option much more limited for grid environments.
We have focused our attention on the distinctions, especially between clouds and grids, but would also like to consider some of the similarities that may help us better understand and utilize the technologies to address our applications and challenges more efficiently. The manner in which data is secured to better address issues of privacy, integrity and access remains a keen interest and concern. Nowhere is this concern more dominant than within the intelligence community, where real-time, secure, and accurate exchanges can be matters of national and global security. It is important to note, as alluded to earlier, that our focus will be on scientific applications, wherein large volumes of data are a natural part of the problems to be addressed. It is also the author’s view that scientific applications are important because they truly test the bounds and limitations of our present resources and require innovative and creative solutions. We would hope that those solutions could be extended to benefit other applications as well. In addition, we stated previously that the design of the application should be viewed as a significant factor as we learn more about ways to improve performance and solutions. The applications are also important because they drive not only the resources needed to produce the results and information, but also the resolution in the outcomes and, ultimately, the progress we hope to achieve.
It is imperative that we reiterate our focus on the data generated by the applications and our intent to navigate through the many different areas and classes of scientific applications that the user community brings forward. Together, we hope to identify and share some of the challenges, progress, approaches, and lessons learned in addressing a wide range of applications. The goal is an interactive dialogue and exchange that facilitates improvement in, and a better understanding of, HPC applications “In the Cloud”.
[1] Tanenbaum, A. S. and van Steen, M., Distributed Systems: Principles and Paradigms, p. 2, Prentice Hall, 2002
[2] Akhter, S. and Roberts, J., Multi-core Programming: Increasing Performance through Software Multithreading
---
HPCintheCloud contributor John Hurley is the Principal Investigator and Director for the National Nuclear Security Administration and DOE-sponsored Center for Disaster Recovery.