Cluster Resources All About Empowerment

By Derrick Harris, Editor

March 6, 2006

GRIDtoday spoke with Cluster Resources CTO David Jackson about the unique capabilities of the company's Moab family of solutions, which includes cluster, Grid and utility computing suites. Said Jackson: “We do what we do well, which is empower [companies] to deliver their skills seamlessly, efficiently and reliably.”



GRIDtoday: First, I'd like to ask how it's going at Cluster Resources. Is everything running smoothly and going according to plan?

DAVID JACKSON: Thank you for this opportunity; it is an honor to be here. Cluster Resources continues to experience rapid growth, and we look forward to the opportunities that continue to come our way. Over the years, we've enjoyed working with industry visionaries and many of the world's largest HPC organizations, helping them realize their objectives. In the process, we gained a lot of expertise, which has helped propel us into a leadership position in this rapidly evolving industry. Now, many of the technologies we pioneered years ago are moving into the mainstream and, from a business perspective, this transition has been excellent for us.

Gt: Can you tell me about the Moab Grid Suite? What unique benefits does it offer over other grid management products?

JACKSON: Moab Grid Suite is designed to bring together resources from diverse HPC cluster environments. It is currently used across grids that span single machine rooms and others that span nations. Moab's approach to managing these resources helps overcome some long-standing hurdles to Grid adoption by providing simplicity, sovereignty, efficiency and flexibility.

Moab provides an integrated cluster and grid management solution within a single tool, eliminating an entire layer of the standard grid software stack. With Moab, if you know how to manage a cluster, then you are ready to manage a grid. In fact, with some customers it has taken less than a minute to expand a working cluster into a full-featured grid. Moab's resource transparency allows users to take advantage of the new Grid resources with next to no change in the end-user experience. For them, the grid is seamlessly connected to the local cluster: they submit the same jobs and run the same commands, and under the covers, Moab manages, translates and migrates workload and data as needed to utilize both local and remote resources.
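To make that transparency concrete, here is a minimal, purely illustrative Python sketch, with invented class names and a simplistic placement rule rather than Moab's actual interface, of a job either staying on the local cluster or being translated and sent to a remote one:

    from dataclasses import dataclass

    @dataclass
    class Job:
        name: str
        nodes_needed: int

    @dataclass
    class Cluster:
        name: str
        idle_nodes: int
        job_format: str  # e.g. "pbs" or "slurm"; hypothetical labels

    def translate(job: Job, target_format: str) -> str:
        # Stand-in for translating a job between resource-manager dialects.
        return f"[{target_format}] run {job.name} on {job.nodes_needed} nodes"

    def place_job(job: Job, local: Cluster, remotes: list[Cluster]) -> str:
        # Prefer local resources; overflow transparently to a remote cluster
        # with enough idle nodes. The user submits the same job either way.
        if local.idle_nodes >= job.nodes_needed:
            local.idle_nodes -= job.nodes_needed
            return f"{job.name} -> {local.name}: {translate(job, local.job_format)}"
        for remote in remotes:
            if remote.idle_nodes >= job.nodes_needed:
                remote.idle_nodes -= job.nodes_needed
                return f"{job.name} -> {remote.name}: {translate(job, remote.job_format)}"
        return f"{job.name} queued locally on {local.name}"

    if __name__ == "__main__":
        local = Cluster("cluster-a", idle_nodes=4, job_format="pbs")
        remotes = [Cluster("cluster-b", idle_nodes=64, job_format="slurm")]
        print(place_job(Job("sim-small", 2), local, remotes))
        print(place_job(Job("sim-large", 32), local, remotes))

The point of the sketch is only that placement and translation happen below the submission interface, so the user's workflow does not change when remote resources come into play.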

Another hurdle to Grid adoption has always been the protection of cluster-level sovereignty. People are hesitant to lose control over their resources in spite of the benefits grids offer. With Moab, each participant is able to fully control its involvement in the grid, managing both job and information flow. Participants can specify ownership and QoS policies and control exactly when, where and how resources will be made available to external requestors.
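As an illustration of the sovereignty idea (the policy fields and values below are invented for this example and are not Moab configuration syntax), a site-local rule for accepting external requests might look like this:

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class SharingPolicy:
        # Hypothetical per-site policy: share only off-peak hours, cap the
        # fraction of nodes external work may use, and accept only trusted peers.
        share_start_hour: int = 18
        share_end_hour: int = 6
        max_external_fraction: float = 0.25
        trusted_partners: tuple = ("partner-grid-a",)

    def accept_external_request(policy: SharingPolicy, requestor: str,
                                nodes_requested: int, total_nodes: int,
                                now: datetime) -> bool:
        if requestor not in policy.trusted_partners:
            return False
        # The off-peak window wraps around midnight.
        off_peak = now.hour >= policy.share_start_hour or now.hour < policy.share_end_hour
        within_cap = nodes_requested <= policy.max_external_fraction * total_nodes
        return off_peak and within_cap

    if __name__ == "__main__":
        policy = SharingPolicy()
        print(accept_external_request(policy, "partner-grid-a", 20, 100,
                                      datetime(2006, 3, 6, 22, 0)))  # True
        print(accept_external_request(policy, "unknown-grid", 20, 100,
                                      datetime(2006, 3, 6, 22, 0)))  # False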

As you probably already know, Moab is widely recognized for its industry-leading levels of optimization, resulting in outstanding cluster performance in terms of both utilization and targeted response time. We have extended these same technologies to Grid, allowing very effective Grid solutions, even in environments with complicated political constraints, heterogeneous resources and legacy infrastructure.

A further major hurdle to Grid adoption is managing widely diverse resources. Moab is unique in that it already runs on virtually every OS and architecture and with most major commercial and open batch systems, including TORQUE, LSF, PBSPro, LoadLeveler, SLURM and others. It can operate with or without Globus, and supports multiple security paradigms as well as multiple job and data migration protocols. When a customer approaches us, we do not mandate a replacement of their existing infrastructure, but rather help them use Moab's flexibility to orchestrate their existing environment.
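The resource-manager independence described here can be pictured as an adapter layer: one interface the scheduler talks to, with a back end per batch system. The sketch below is a generic illustration of that pattern, with invented class names and canned return values, not Cluster Resources' implementation:

    from abc import ABC, abstractmethod

    class ResourceManagerAdapter(ABC):
        """Common interface the scheduler uses, regardless of the back end."""

        @abstractmethod
        def submit(self, script: str) -> str: ...

        @abstractmethod
        def idle_nodes(self) -> int: ...

    class TorqueAdapter(ResourceManagerAdapter):
        def submit(self, script: str) -> str:
            # A real adapter would invoke the batch system's submit command;
            # here we just return a placeholder job id.
            return "torque-job-1"

        def idle_nodes(self) -> int:
            return 12

    class SlurmAdapter(ResourceManagerAdapter):
        def submit(self, script: str) -> str:
            return "slurm-job-1"

        def idle_nodes(self) -> int:
            return 48

    def submit_anywhere(adapters: list[ResourceManagerAdapter], script: str) -> str:
        # Pick the back end with the most idle nodes; the caller never needs
        # to know which batch system actually runs the work.
        best = max(adapters, key=lambda a: a.idle_nodes())
        return best.submit(script)

    if __name__ == "__main__":
        print(submit_anywhere([TorqueAdapter(), SlurmAdapter()], "run_model.sh"))

The appeal of this kind of design is that supporting another batch system means adding one adapter rather than changing the scheduler.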

These concepts brought together offer a flexible solution that requires surprisingly little training, is very intuitive for the end user, and can effectively deliver on each of the major benefits of Grid computing.

Gt: How many customers do you have for the Grid suite? In what industries are most of the customers involved?

JACKSON: Use of Moab Grid technology is widespread and continues to grow rapidly, but giving exact values is difficult because our products intentionally blur the line between clusters and grids. Moab offers a full spectrum of Grid technologies, providing multi-cluster scheduling, enterprise-level monitoring and management, information services, Grid portals, job translation, centralized identity and allocation management, job staging, data staging, credential mapping, etc. Consequently, many sites are using Moab's Grid tools and technologies as a natural extension of their clusters and, without even knowing it, have enabled a grid across their systems. I think this is the way it should be. In the beginning of Grid, there were many sites afraid to take the “big leap” into Grid because they feared breaking what they had; they feared the unknown. With Moab, there really isn't a leap. You flip a bit and you are sharing jobs; flip another bit and you are coordinating Grid accounting. It's just a natural extension of the familiar cluster.

In fact, as part of this blurring of lines, our Cluster Suite includes the ability to connect up a local-area grid. Only when you begin to need more complex data staging and credential mapping is the Moab Grid Suite even required.

Regarding industries, I recently looked at a report showing our customer breakdown and it was all over the place. We are in financial, oil and gas, research, manufacturing, academic and everything in between. Because of our roots of interoperating with all major batch systems, we've had to develop a superset capability. We have found that this has opened many doors for us, and our customers are drawn by cost-effectiveness, simplicity, scalability and flexibility, not by industry.

Gt: Cluster Resources' Moab Utility/Hosting Suite offers an interesting approach to utility computing by letting users host their own resources, much like several of the large IT vendors (e.g., Sun Grid, IBM Deep Computing On Demand, etc.). Has there been a lot of interest in this service thus far?

JACKSON: We are very excited about utility computing, as we see this being the next natural step in the evolution of grids. The technology adoption time frame is long, but interest continues to grow and the benefits we've provided to clients have been both significant and pervasive. For example, one Fortune 500 customer increased the amount of services they were able to provide by 300 percent in the first year, and a different Fortune 500 customer was able to increase their customer base by over 50 times with Moab effectively exposing their services to customers via utility computing.

In a nutshell, what we offer with the Moab Utility/Hosting Suite is the ability to intelligently provision, customize, allocate and tightly integrate remote resources. This technology applies to both batch and non-batch environments, and to many, many usage scenarios. Imagine a cluster where a user submits jobs and eventually the cluster fills up and responsiveness slows. Suddenly, the cluster gets bigger, all the jobs run to completion, and then the cluster shrinks back down again. Imagine a cluster where you submit a job requesting a compute architecture that does not exist. Moments later, that resource exists and your job runs. Imagine losing 16 nodes due to a hard drive failure and, by the time you get back from lunch, Moab has notified you of the failure, created a reservation over the failed nodes, sent a replacement request off to your hardware provider, and replaced every failed node with an equivalent hosted computing node. Your boss says, “Nice job, perfect uptime again this month!”
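A rough sketch of that elastic behavior, using an invented policy and invented names rather than any real Moab or vendor API, might look like a periodic reconciliation step that compares demand against healthy capacity and decides how many hosted nodes to request or release:

    def reconcile(queued_jobs: int, healthy_nodes: int, failed_nodes: int,
                  hosted_nodes: int, jobs_per_node: int = 2) -> dict:
        """Decide how many hosted nodes to request or release.

        Purely illustrative policy: cover failed nodes one-for-one, add enough
        hosted capacity to absorb the queue backlog, and shrink when idle.
        """
        needed_for_backlog = max(0, (queued_jobs + jobs_per_node - 1) // jobs_per_node
                                 - healthy_nodes)
        target_hosted = failed_nodes + needed_for_backlog
        if target_hosted > hosted_nodes:
            return {"action": "request", "nodes": target_hosted - hosted_nodes}
        if target_hosted < hosted_nodes:
            return {"action": "release", "nodes": hosted_nodes - target_hosted}
        return {"action": "hold", "nodes": 0}

    if __name__ == "__main__":
        # Queue fills up: grow the cluster with hosted nodes.
        print(reconcile(queued_jobs=40, healthy_nodes=10, failed_nodes=0, hosted_nodes=0))
        # 16 nodes lose their drives: replace them with hosted equivalents.
        print(reconcile(queued_jobs=0, healthy_nodes=48, failed_nodes=16, hosted_nodes=0))
        # Backlog clears: shrink back down.
        print(reconcile(queued_jobs=0, healthy_nodes=64, failed_nodes=0, hosted_nodes=16))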

Imagine setting up a business relationship with a utility computing hosting center that absolutely guarantees resource availability on fixed days and times, or guarantees a fixed number of cycles per week, or guarantees a one-hour response time for unplanned resource consumption. Imagine being able to host not just compute resources, but a full customized service on demand. Offer data mining of a massive data set, offer regression testing services across a wide array of architectures and environments, offer not just software, but the full environment required to use that software.
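As a loose illustration of what such guarantee terms might look like in software (the field names and thresholds below are invented for this example, not anything from the Moab product), a hosting relationship could be modeled and checked like this:

    from dataclasses import dataclass

    @dataclass
    class UtilitySLA:
        # Hypothetical contract terms of the kind described above.
        guaranteed_cycles_per_week: float      # node-hours per week
        max_unplanned_response_hours: float    # e.g. 1.0 for a one-hour response

    def sla_report(sla: UtilitySLA, delivered_cycles: float,
                   worst_response_hours: float) -> list[str]:
        """Compare delivered service against the contracted guarantees."""
        findings = []
        if delivered_cycles < sla.guaranteed_cycles_per_week:
            shortfall = sla.guaranteed_cycles_per_week - delivered_cycles
            findings.append(f"cycle shortfall: {shortfall:.0f} node-hours")
        if worst_response_hours > sla.max_unplanned_response_hours:
            findings.append(f"slow response: {worst_response_hours:.1f}h "
                            f"> {sla.max_unplanned_response_hours:.1f}h")
        return findings or ["all guarantees met"]

    if __name__ == "__main__":
        sla = UtilitySLA(guaranteed_cycles_per_week=5000, max_unplanned_response_hours=1.0)
        print(sla_report(sla, delivered_cycles=5200, worst_response_hours=0.5))
        print(sla_report(sla, delivered_cycles=4100, worst_response_hours=2.0))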

Moab can provide this right now and, when you think about it, it seems quite natural that this is the way things should have been done all along. How do you say no to this type of solution? Organizations can use Moab to tap into IT vendor resources or can set up their own hosting solution for internal and external customers.

It is important to understand that utility computing is not just about making raw compute resources available on demand. It is about making them custom, secure, guaranteed, tightly integrated and seamless. Our patented software allows IT vendors to ship a product that enables fully automated or “touch of a button” connectivity to the hosting center or service, with dynamic security, service-level guarantees and automated billing.

Gt: What led to this approach versus the company trying to sell its resources to users?

JACKSON: We are an enablement company. We create technology and software that allows other organizations to really capitalize on their offerings. Google made a smart move when it chose not to make content. Google is exceptional at what it does, but it does not compete with “subject experts.” Remember that utility computing is more than delivering raw cycles; it is about delivering a full compute environment ready to accomplish a specific task. An organization that works with oil and gas companies will already have relationships with them; it will know what network, storage and compute solutions work best, and will know what security constraints must be satisfied. Our software allows such an organization to automatically customize and deliver this environment in minutes, on demand and on the fly. This company probably knows more about its customer than we will ever know, and it makes sense that it offers this service.

We worked with Amazon to enable their recently announced online Internet data mining service. We didn't know much about mining the entire Internet, and we did not need to. Amazon knew their data, their services and their customers. We helped them set up a system where a user presses a button and, on the fly, a new cluster is built from scratch with secure network, compute and storage facilities. The source data is automatically pre-processed, the compute nodes are customized, the needed applications are automatically started, and an entire data mining environment is created in minutes. With our system, Amazon was able to take their expertise and scale it, allowing them to focus on what they do best and deliver the benefits to a far larger customer base.
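As a purely hypothetical outline of that kind of on-demand build, here is a small Python sketch that runs invented provisioning steps in order and carries context between them; none of the step names come from the actual Amazon or Moab systems:

    def provision_network(ctx):
        ctx["log"].append("secure network segment allocated")
        return ctx

    def provision_compute_and_storage(ctx):
        ctx["cluster"] = [f"node{i:03d}" for i in range(ctx["nodes"])]
        ctx["log"].append(f"{ctx['nodes']} compute nodes and storage attached")
        return ctx

    def preprocess_data(ctx):
        ctx["log"].append(f"dataset '{ctx['dataset']}' pre-processed")
        return ctx

    def customize_nodes(ctx):
        ctx["log"].append("OS image, libraries and credentials applied")
        return ctx

    def start_applications(ctx):
        ctx["log"].append("data mining applications started")
        return ctx

    def build_environment(dataset: str, nodes: int) -> dict:
        """Run the build steps in order, mirroring the sequence described above."""
        ctx = {"dataset": dataset, "nodes": nodes, "log": []}
        for step in (provision_network, provision_compute_and_storage,
                     preprocess_data, customize_nodes, start_applications):
            ctx = step(ctx)
        return ctx

    if __name__ == "__main__":
        env = build_environment("web-crawl-sample", nodes=8)
        print("\n".join(env["log"]))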

Another space we are currently working in is providing large security-focused government organizations with instant access to vast quantities of additional HPC resources in the event of a national disaster. The Moab Utility/Hosting Suite is being used by these government organizations to instantly overflow national emergency workload onto participating government, academic and corporate sites. At first this sounds like a grid but, in reality, each of these sites is a separate environment prepared only for its local workload until Moab adapts it, making the many changes needed to create a cohesive environment able to respond to the national disaster.

Tier 1 and tier 2 hardware vendors already have a relationship with their customers. It would make sense for them to provide cluster overflow and emergency failover-based utility computing services. They know the customers and the technology being shipped. Our job is to empower these vendors to provide this service more effectively and efficiently than they could ever do on their own.

“Boutique” utility computing allows any software or IT service company to deliver complete custom “solutions” to its customers using insight and relationships we can never hope to have. We do what we do well, which is empower them to deliver their skills seamlessly, efficiently and reliably.

I think this is a pure win-win situation. We win, the customers win and the vendors win.

Gt: Do you know whether more users of the Utility/Hosting Suite are using the solution for internal or external purposes?

JACKSON: It's a mix. Right now, we see more “soft” utility computing for internal purposes and more “hard” utility computing for external purposes. Soft utility computing is being used to enable condominium clusters, dynamically reconfigurable grids, automated failure recovery and other services. Hard utility computing is driving the big allocations of raw resources with provisioning of fully customized service environments.

Gt: There is sometimes confusion about cluster computing vs. Grid computing vs. utility computing. Seeing as how your company sells solutions for all three, can you do your best to clarify these terms?

JACKSON: This is an industry with fluid terms, so any definition we give will be subject to debate. And, again, we are a workload and resource management company, so our focus is based on what tasks are required to fully optimize these systems. With these caveats, we see cluster computing focusing on maximizing the delivered science of one or more clusters under a single administrative domain. Grid computing focuses on bringing together resources that are under administrative domains with diverse mission objectives but with a common goal of extracting maximal performance across all systems. Proper Grid computing allows each organization complete independence and creates a consistent, easy-to-use global view for management and, optionally, end users. Grids and Grid relationships are generally worked out ahead of time and are generally static.

Utility computing is the next frontier. It takes everything that is good about clusters and grids and adds the ability to, first, dynamically establish relationships and, second, build complete compute environments. These relationships are completely flexible, but encompass new service guarantees, charging and workload management protocols. The compute environments can be built on the fly and are holistic, incorporating network, storage, compute and software resources together with supporting services. The key to utility computing is perfect transparency and tight integration. When the customer needs it, the cluster just gets bigger or changes to become what is needed for the workload. When a node or a network goes down, it gets replaced. Yes, there are a lot of things going on behind the scenes to create this magic but, to the end user, it's all magic.

Building on the shoulders of Grid, utility computing enables the next generation of high-performance computing and data centers: the true on-demand vision.

Gt: Can you tell me a little about your background in HPC? It looks like you've done quite a bit before coming to Cluster Resources.

JACKSON: I've had the good fortune of working with many leaders in the Grid effort, as my career has taken me to IBM, NCSA, SDSC, LLNL, MHPCC, PNNL and a few other locations before starting with Cluster Resources. Those early days also involved consulting and volunteer work directly helping over 1,000 sites manage their clusters. These experiences were invaluable and helped shape not only our cluster, Grid and utility computing products, but our whole approach to delivering them. I found that organizations that were highly competent and highly agile were also a joy to work with, and that we could jointly enable new technologies to overcome any obstacle in amazingly short amounts of time. Other organizations did not seem to get this paradigm and, though very big, were unable to detect the pulse of the industry.

We have tried very hard to keep that agility alive at Cluster Resources, with dozens of joint research projects throughout the world, solid relationships with many of the industry visionaries and a support team that generally resolves all issues in under two hours. Through this combination, we have found amazing customer loyalty. In fact, over the years, we have not lost a single customer!

Gt: I see you're a founding member of the GGF scheduling working group. How active are you in the GGF right now?

JACKSON: I was fortunate to be involved with the GGF and its precursor organizations back in the very early days. In fact, so early that we could fit all the sites around one small table! It was definitely enjoyable talking about those grand ideas and world-changing technologies, and I have a lot of good memories from those days. Over the years, we've continued to be involved with GGF in many different ways, working on protocols, directions and standards, though we are less involved in the formal meetings.

Gt: Finally, I'm wondering if you could give our readers a little insight into your life outside of the office. What are your personal hobbies and interests? What are your plans for when your working days are done?

JACKSON: In terms of hobbies, I am an avid hiker with a particular love of high mountains and narrow slot canyons.

There is no question about what I'm doing when my working days are done: I'm farming! I spent most of my growing-up years on a farm in Idaho and absolutely loved it. There's just something about fresh air and hard work that makes the soul feel good. We learned to work very hard and to do it right, and that experience is something I very much want to share with my kids.
