Walter Stewart on SGI’s Role in Ever-Evolving World of Grid

By Derrick Harris, Editor

March 14, 2005

A lot has changed since 1997, when SGI put on the first public Grid demonstration at the Supercomputing show. Walter Stewart, SGI's business development manager for Grid, spoke with GRIDtoday editor Derrick Harris about just what has changed since then, as well as about the company's tactics in the battle against increasingly large data spikes.

GRIDtoday: How do the new products mentioned in SGI's current release [the Altix 1350, Altix Hybrid Cluster, InfiniteStorage Total Performance 9700 and Silicon Graphics Prism] advance SGI's Grid strategy?

WALTER STEWART: We have, for some considerable time, ensured that any new product we bring out is Grid-enabled. We have been looking to bring out products that offer unique functionality to Grids, and we believe that these four products advance the kinds of functionality SGI is able to make available to people who are operating Grids. Particularly in the case of the 1350, we are bringing in functionality at a much more attractive price point than has been available before.

I think that speaks to SGI's overall Grid position. We're out there to be a toolmaker for the Grid and to make sure that the kind of power SGI brings to stand-alone compute facilities is available to hugely distributed users.

Gt: Do you know of any projects off-hand that are using or planning on using any of these new solutions?

STEWART: Because we've only just released them, I'm not aware of any that are looking at them for Grid at the moment. Certainly, some products from the Altix family have already been installed in major Grid installations around the world. We've had circumstances where we've had customers who are not interested in large, shared-memory machines, but who are interested in what might be described as “robust node clusters” for their Grids, and that's precisely what the Altix 1350 addresses. We are certainly very much involved with the Altix family in a number of major Grid installations around the world.

Gt: Which leads to my next question. There are certainly some well-established projects currently powered by SGI solutions, including the TeraGyroid, COSMOS and SARA projects. Could you talk a little about how SGI products are being used in these, and other, Grid projects?

STEWART: One interesting one, in your own country, is that at the beginning of the year we installed a 1,000-plus processor machine at NCSA, which will be one of the resources on the TeraGrid. This is, as far as I'm aware, the first shared-memory resource that has been available to TeraGrid users. So that's one very recent North American example.

I think I'm right in saying that it's next month that we're installing another 1,600-plus processor machine at the Australian Partnership for Advanced Computing, which will become a major resource on the Australian Grid.

COSMOS Grid has been around for some time, and we've been through a couple of generations with COSMOS Grid. We first installed Origin there, and have subsequently installed Altix. This is all because the COSMOS Grid people are in the business of setting up the data environment, including processing and visualization, in order to be ready for the data that will come flooding in from the Planck satellite in 2007. This is an example of bringing real power to Grids with a very strong emphasis on a very close connection among compute, visualization and data management.

Gt: SGI is focused on addressing four primary challenges of Grid, and I want to talk about two in particular. First: Why is security such a big issue in Grid computing, what are some of the major security issues and what is SGI doing to improve Grid security?

STEWART: We've been doing a lot of work with the open source community in transferring a lot of IP from our experience with our own operating system, IRIX. There are some security issues that we're hopeful will be picked up by the open source community. Because a lot of our security work with IRIX was right in the OS, we feel obliged to work with the open source community, and move at their speed, on the introduction of some of those attributes.

One security-related area that isn't talked about much, but that SGI is very preoccupied with, is the whole issue of versioning. It's one thing to talk about the security of data; it's another thing to talk about the integrity of data. With our CXFS storage-area network running over a wide-area network on the Grid, we are eliminating the need to make multiple copies of data. So if you're in San Diego and I'm in Toronto, we could be working on the same data set without having to make a copy for each of us. Therefore, we can be confident that the data I'm working with is the same data you are, because we're using the same copy.
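To make the versioning point concrete, here is a minimal, purely illustrative Python sketch (this is not CXFS's interface; the file names and workflow are invented for illustration). It contrasts a copy-per-site workflow, where replicas can silently diverge, with a single shared copy, where every reader sees the same bytes by construction.

```python
# Illustrative only: a toy demonstration of why one shared copy avoids
# version drift, while per-site copies can silently diverge.
import hashlib
import os
import shutil
import tempfile

def checksum(path):
    """Return the SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

workdir = tempfile.mkdtemp()
shared = os.path.join(workdir, "dataset.bin")  # stands in for the one copy on a shared volume
with open(shared, "wb") as f:
    f.write(b"original observations")

# Copy-based workflow: each site gets its own replica.
toronto = shutil.copy(shared, os.path.join(workdir, "toronto.bin"))
san_diego = shutil.copy(shared, os.path.join(workdir, "san_diego.bin"))
with open(toronto, "ab") as f:
    f.write(b" + local edits")                     # one site edits its replica
print(checksum(toronto) == checksum(san_diego))    # False: integrity silently lost

# Shared-copy workflow: both "sites" open the same path, so there is
# nothing to drift; every reader sees identical bytes by construction.
print(checksum(shared) == checksum(shared))        # True, trivially
```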

We also keep a watchful eye on the work that goes on in standards bodies like the Global Grid Forum (GGF).

Gt: I haven't heard a lot of companies state their dedication to the cause of visualization capabilities on Grid networks. Can you give me a little more detail on why this is so important to SGI?

STEWART: It's important to SGI because we think it's important to the world of Grid and the world of next-generation computing. Let me give you an example. We work with an engineering firm that had been operating in a fairly conventional IT environment: a number of workstations around the company, in a few locations, with files copied from workstation to workstation as different people had to work on them. Those files were in the neighborhood of 200GB, and it was taking about three hours on their network to move them. Then they came to us and said they were going to have a problem, because their next data set was going to be a terabyte in size, which was probably going to take something on the order of 22 hours to move, and that was not going to be acceptable. We said, “Well, we wouldn't worry about it anyway if we were you.”

And they said, “Why is that?”

“Because you can't load it on the workstation anyway. It'll crash it.”

Before we presented the solution to them, they came back and said they had made a mistake. It was, in fact, not going to be 1TB; it was going to be 4TB. I might say this as an aside: this is an increasingly common occurrence among companies and research organizations. The spike in data is that profound.
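As a rough check on the figures quoted above, the following minimal Python sketch assumes the sustained throughput implied by moving 200GB in about three hours and scales it linearly. The firm's roughly 22-hour estimate for 1TB presumably reflects real-world overhead (protocol cost, contention, restarts) beyond this idealized arithmetic.

```python
# Back-of-the-envelope check on the transfer times quoted in the interview.
# Assumes the sustained throughput implied by "200 GB in about 3 hours".

throughput_gb_per_hour = 200 / 3                       # ~66.7 GB/hour
throughput_mbit_per_s = 200 * 8 * 1000 / (3 * 3600)    # ~148 Mbit/s sustained
print(f"implied link speed: ~{throughput_mbit_per_s:.0f} Mbit/s")

for size_gb in (200, 1000, 4000):                      # 200 GB, 1 TB, 4 TB
    hours = size_gb / throughput_gb_per_hour
    print(f"{size_gb:>5} GB -> {hours:5.1f} hours")
# Output:
#   200 GB ->   3.0 hours
#  1000 GB ->  15.0 hours  (the firm estimated ~22, with overhead)
#  4000 GB ->  60.0 hours
```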

Quite clearly in that circumstance, there was no way those workstations could cope with 4TB, nor could the company's network. So we designed a system that lets them keep those remote legacy workstations, and their users, in place: the users send instructions into a SAN (storage area network), the compute server computes the data, the visualization piece of the compute server renders that data, and then we strip out the pixels and stream them in real time back to the remote user on the legacy workstation. They have full interactivity not only with the data set, but with the computation of that data, from a legacy workstation, stressing neither the network nor the workstation's capacity.
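Here is a minimal Python sketch of the pixel-streaming idea Stewart describes: the heavy data and the rendering stay server-side, only small interaction events travel from the client, and only rendered pixels travel back. The render_frame function is a hypothetical stand-in for server-side computation on the large data set, and real systems of that era (such as SGI's OpenGL Vizserver) also compress the pixel stream; none of that detail is modeled here.

```python
# Illustrative pixel-streaming sketch: big data never leaves the server;
# the thin client sends small events and receives only rendered pixels.
import socket
import struct
import threading

WIDTH, HEIGHT = 64, 48               # tiny "display" so the demo stays cheap

def render_frame(view_angle: int) -> bytes:
    """Hypothetical stand-in for server-side rendering of a huge data set:
    returns a grayscale frame whose shading depends on the view angle."""
    return bytes((x * y + view_angle) % 256
                 for y in range(HEIGHT) for x in range(WIDTH))

def server(conn: socket.socket, n_events: int) -> None:
    for _ in range(n_events):
        angle, = struct.unpack("!i", conn.recv(4))   # small event inbound
        frame = render_frame(angle)                  # compute happens here
        conn.sendall(struct.pack("!I", len(frame)) + frame)  # pixels outbound

def client(conn: socket.socket) -> None:
    for angle in (0, 90, 180):                       # three interaction events
        conn.sendall(struct.pack("!i", angle))
        size, = struct.unpack("!I", conn.recv(4))
        frame = b""
        while len(frame) < size:                     # gather the whole frame
            frame += conn.recv(size - len(frame))
        print(f"angle {angle:3d}: received {len(frame)}-byte frame")

srv_sock, cli_sock = socket.socketpair()  # one-process stand-in for a WAN link
t = threading.Thread(target=server, args=(srv_sock, 3))
t.start()
client(cli_sock)
t.join()
```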

In that circumstance, visualization is a critical tool for working with big data. More and more, as people look at these data spikes, even if you are able to get the data moved, which is becoming increasingly impractical, if the data or the results are expressed alphanumerically, it's going to take you too long to read them. Ask a big-data question and you frequently get an even bigger data answer. And if it takes you six months to read the answer …

If you can look at it visually, you often can understand it in a fraction of the time it would take you to internalize the information if it's expressed to you alphanumerically. We believe that kind of infrastructure is critical for all sorts of users, in all sorts of places, working on all sorts of devices, with all sorts of operating systems. It's SGI's role to bring that core power to Grid installations so that people at the various points along the Grid can have access to it.

Gt: Finally, I want to go back, for a few questions, to SGI's groundbreaking demonstration at Supercomputing '97. What kind of effect did it have on the Grid movement? Did it add an element of legitimacy?

STEWART: I think it certainly got Grid going and began a process of people seeing that it is possible to design this very different kind of infrastructure. I think that, in truth, if the community could have kept the momentum of that 1997 activity, we'd be further ahead today. I think we got sidetracked for a number of years, particularly in North America, with the cycle-scavenging model as a single approach to Grid computing. I'm happy to say that single-approach era has very much ended.

While cycle scavenging is still a very legitimate part of Grid, it's no longer seen as the Grid. People are recognizing that Grid users should have access to a variety of different devices and a variety of different kinds of tools to work with.

So, I think that 1997 was critical. I just wish we could have maintained the momentum that [the demo in] 1997 started; we might be further ahead today. Things have changed dramatically in the last year or two, with the focus shifting much more to building the kind of infrastructure that's required to deal with big data.

Gt: That was almost seven-and-a-half years ago, an eternity in information technology, and a whole lot has changed with Grid since then. What do you see as some of the biggest differences?

STEWART: Grids are now deployed in working environments — there are lots of Grids. I would characterize the Grid as having three phases so far. From 1997 until about 2001, you were looking at Grids deployed for research on Grids. Starting around 2001, you increasingly saw Grids deployed in a research environment to serve research goals of multiple disciplines. We moved away from the Grid being the object of the research to the Grid being a tool to enable research. Starting roughly around late 2003-2004, we really began to see a major ramp-up of Grids being installed in enterprise situations and in corporate situations.

Certainly, I might comment with one other hat that I wear, as co-chair of the Plenary Program Committee at GGF. Our Plenary Program at GGF12 in Brussels last September was quite extraordinary [in regard to] the number of companies that turned up at the event that either already had Grids or were seriously looking at installing Grids and came to find out more. There was nothing like that attendance previously. If you go back to GGF in 2002, there was no one there but vendors and researchers. By now, we believe we are seeing strong corporate engagement in the whole issue of Grid.

Gt: My final question is: If you had a crystal ball, what do you think you would see in another seven-and-a-half years, in 2012? Where will Grid be, and what role will SGI have in helping it get there?

STEWART: I very much see Grid in this context: Starting in the middle of the 18th century and continuing well through the 20th century, we built ever more elaborate distribution mechanisms, or infrastructures for distribution, in order to move raw materials, processed materials and finished products. That was the absolute foundation of the industrial economy. Sometime late in the 20th century, we began creating the infrastructure for the knowledge economy. Data is the raw material for the knowledge economy. Grid is the nascent beginning of the infrastructure that is going to allow us to move from data to information to knowledge and, therefore, to value.

I would say that we are going to increasingly see infrastructure built around the principles of Grid computing that enables users in every conceivable location to have access to the tools they need to work with data. If I chose to, I could drive about a mile from my house and buy a lemon picked off a tree in California. We have the infrastructure in place to make it possible for me to have that in the middle of winter in Toronto. We're going to see the kind of infrastructure that will make it possible for me, regardless of where I am, to access the power to deal with the data I need in order to be a knowledge worker.

Gt: Is there anything else you would like to add in regard to this announcement or SGI's Grid strategy in general?

STEWART: I believe that SGI will always be looking forward to your 2012 date. SGI will always be there, designing the tools that are ready to deal with that next spike in the volumes of data people have to work with. We're not going to be down at the commodity level, and we're not going to be there for the problems that are already solved. We are going to be there for the people who are tackling the next data spike.

To read the release cited in this article, please see “New Solutions Extend SGI's Drive to Advance Grid Computing” in this issue of GRIDtoday.
