
September 9, 2010

The Evolution of Oracle Grid Engine

Wolfgang Gentzsch

Two articles speculating about the future of Oracle Grid Engine (aka Sun Grid Engine) at Oracle appeared last week: Douglas Eadline’s The State of Oracle/Sun Grid Engine in Linux Magazine, and Nicole Hemsoth’s Oracle Placing GridEngine on New Track here in HPC in the Cloud. These articles encouraged me to contribute my own thoughts to the discussion, with a particular focus on Grid Engine’s contribution to Clouds.

Over the past several years, the Sun Grid Engine team and the Grid Engine open source community have continuously added features to the already powerful distributed resource management system that Sun acquired in 2000 together with the German/US-based company Gridware. Recently, the software came into Oracle’s hands, and many have speculated that Oracle will drop the HPC-related bits and pieces that came with the acquisition of Sun. Some might doubt that Oracle management will ever be able to recognize the value of HPC technology for mainstream IT, and especially the value of resource managers like Grid Engine for any kind of distributed resource environment.

More and more of these features went far beyond the narrow focus of HPC: policy and priority management; scalability to hundreds of thousands of jobs, large and small; user authentication and access control; resource assignment across persistent services; management of software build, test, and verification; data management; alignment of resource usage with business policies; an accounting and reporting module ideally suited to the Cloud’s pay-per-use model; and many more. These features are very useful for minimizing cost and maximizing the business value of an organization’s computing resources and software assets, not only in HPC.

Then, about three years ago, discussions and developments started in the Grid Engine open source community about resource elasticity in clusters, culminating in the Service Domain Manager, which provides any cluster size on the fly according to an application’s requirements. Through the Grid Engine Service Domain Manager software, an administrator can configure service-level objectives to govern service levels in the managed clusters.

In addition, the Service Domain Manager software is able to remove unused resources from managed clusters and place them in a spare pool of resources. Resources in this spare pool optionally can have power management applied to reduce a cluster’s overall power consumption during off-peak periods.

Should no free resources be available locally, the Service Domain Manager software also has the ability to provision resources from a compute cloud provider, such as Amazon’s Elastic Compute Cloud (EC2), to add to an overloaded cluster (a feature known as Cloud Bursting). This Cloud Connectivity allocates nodes on the Cloud (EC2) on demand, providing full elasticity: compute resources allocated through it can go from zero to whatever is needed and covered by the user’s budget, fully policy controlled, with no user intervention required. Communication is secured with OpenVPN, which is part of the EC2 AMI and of the OGE instance running on the user’s laptop or desktop. Beyond that, Grid Engine now offers deep integration with other technologies commonly used in the cloud, such as Apache Hadoop, a powerful tool designed for deep analysis and transformation of very large data sets, or the UniCloud environment from UnivaUD. In fact, because of Grid Engine’s standard APIs, enhancement with almost any other management tool seems possible.
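To make the elasticity idea concrete, here is a minimal Python sketch of the decision logic described above: grow from the local spare pool first, burst to the cloud only within budget, and release cloud nodes before parking local nodes when load drops. It is an illustration only and not the Service Domain Manager’s actual interface. All names (`ClusterState`, `Policy`, `plan_scaling`) and the simple "pending jobs per node" service-level objective are assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterState:
    pending_jobs: int     # jobs waiting or running in the cluster
    active_nodes: int     # local nodes currently serving the cluster
    spare_nodes: int      # idle local nodes parked in the spare pool
    cloud_nodes: int = 0  # nodes currently rented from a cloud provider


@dataclass
class Policy:
    jobs_per_node: int      # hypothetical objective: at most this many jobs per node
    cloud_node_cost: float  # assumed cost of one cloud node per planning interval (> 0)
    budget: float           # budget the user is willing to spend per interval


@dataclass
class ScalingPlan:
    from_spare: int = 0     # local nodes to move from the spare pool into the cluster
    from_cloud: int = 0     # nodes to provision from the cloud provider
    release_cloud: int = 0  # cloud nodes to give back
    to_spare: int = 0       # local nodes to park in the spare pool


def plan_scaling(state: ClusterState, policy: Policy) -> ScalingPlan:
    """Decide how to grow or shrink the cluster for the next interval."""
    needed = math.ceil(state.pending_jobs / policy.jobs_per_node)
    current = state.active_nodes + state.cloud_nodes

    if needed > current:
        deficit = needed - current
        # Prefer free local resources before paying for cloud nodes.
        from_spare = min(deficit, state.spare_nodes)
        # Burst to the cloud only as far as the budget allows.
        affordable = int(policy.budget // policy.cloud_node_cost)
        from_cloud = min(deficit - from_spare, affordable)
        return ScalingPlan(from_spare=from_spare, from_cloud=from_cloud)

    surplus = current - needed
    # Give back paid cloud nodes first, then park unused local nodes.
    release_cloud = min(surplus, state.cloud_nodes)
    to_spare = min(surplus - release_cloud, state.active_nodes)
    return ScalingPlan(release_cloud=release_cloud, to_spare=to_spare)
```

With an overloaded cluster (100 jobs, 4 active nodes, 2 spare nodes, a limit of 10 jobs per node, and a budget covering 2 cloud nodes), the plan takes both spare nodes and bursts 2 nodes to the cloud; the remaining shortfall simply stays queued, which mirrors the "covered by the user’s budget" constraint in the text.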

With all this in mind, I suggest calling OGE the ‘Oracle Resource Engine’, where ‘Resource’ includes hardware (whether in-house, in Grids, or in Clouds) as well as workloads, applications, data, and users, both individual and in real or virtual organizations. This goes far beyond HPC, and therefore I suggest that this Oracle Resource Engine will survive another 10 years, as it survived the previous 10 years even under (or because of) Sun Microsystems.
