How It All Went Wrong for SGI in HPC

By Addison Snell

April 10, 2009

Do you know what the saddest part is of Rackable’s attempt to acquire SGI’s assets out of bankruptcy for a paltry $25 million in cash? There would be very little effect on the HPC market as a result. Thanks to a cascading series of problems – bad marketing, operational misfires, bad technology bets, and ordinary bad luck – SGI didn’t have much left to lose.

It’s been a long ride down for SGI, and now that it’s finally over, it comes almost as a relief, like when a beloved relative who has suffered for years with a terminal disease, a mere shadow of their former self, finally passes. It gives us one last chance to light candles, join hands, and remember an icon that once exemplified Silicon Valley greatness.

The industry is full of opinions on what went wrong with SGI, and I’ve got my stories too. Before becoming an HPC industry analyst, I worked at SGI myself for a little over six years in product management and product marketing roles in SGI’s server group.

When I joined Silicon Graphics in August 1997, the company appeared on the cover of BusinessWeek magazine. No, not that cover story. People remember SGI being anointed “the Gee-Whiz Company” by BusinessWeek in 1994, but some forget that the same publication announced the company’s downfall only three years later in “The Sad Saga of Silicon Graphics.” Marketing was to blame, along with reckless spending; that much was common knowledge. On both these points, I’d have to agree.

I didn’t witness the most reckless of spending myself – Silicon Graphics’ final infamous Winterfest party, with its headliners, revelry, and raucous lip-synch contests, was just before my time – but I can tell you what marketing was doing about our position in HPC. We were killing it as fast as ever we could.

Silicon Graphics’ technology leadership in workstations and servers was widely acknowledged, but our executive leadership suffered from a collective inferiority complex when compared to Sun, Silicon Graphics’ twin sister. (Both companies were founded in the same year, with neighboring headquarters, and they competed in several markets.) Sun was flying high with enterprise servers based on SPARC and Solaris, and we wanted to catch up. The training I was given that first summer directed me to assume, as an inalienable truth, that the market didn’t care about performance. We had to beat Sun with enterprise features, with an emphasis on RAS (reliability, availability, serviceability). I was the IRIX product manager at the time. I could probably still whiteboard the “Cellular IRIX” presentation for you.

You have to give CEO Ed McCracken credit; he wanted the company to be responsive to customer needs, and he listened to marketing. Marketing just blew it. We told engineering to back off on performance and give us RAS. We made “Sun Sucks” bandit videos and rewrote the marketing literature. Instead of taking on Alpha in HPC, we took on Solaris in enterprise. We were destined to fail.

Interestingly, one of the main platforms we competed against, the Sun Ultra Enterprise 10000, was based on technology we had sold to Sun. I’m in the minority in that I don’t think we should have kept the technology for ourselves. I just think we should have gotten more money for it. Probably a lot more.

I should also mention Cray at this point. We owned them, and I worked in the server group. I have to say, I really didn’t know those guys back then. Again, it’s not the purchase I had a problem with, but rather the implementation. We never integrated those teams. The eventual planned merger of the roadmaps got delayed more than a few times. As of this moment, it is slated to come to market in SGI’s Ultraviolet product, which might now never see the light of day.

The next chapter is well-documented. McCracken was out, and our new CEO was Rick Belluzzo, who was further convinced that volume was the key to success. He put Silicon Graphics’ resources on souped-up PCs – I was still working on servers, trying to sell our customers IRIX-based systems as Belluzzo told the world we wouldn’t invest in them – and he boldly announced the plan to the world. He abruptly quit weeks later. He was not popular around the water cooler after that.

Then came Bob Bishop, and I have to tell you, things started to get better, at least with strategy and marketing. It was 1999 when he took over. Bishop loved the big systems, and he brought us back to an HPC strategy. We built excitement in our roadmap. And we made it to what was supposed to be a critical turning point: the launch of Origin 3000 at the end of June 2000.

Reckless spending? Yeah, there was some. The annual sales conference, where we did the internal launch, featured magicians Penn and Teller and motivational speaker Tony Robbins (not to be confused with namesake Anthony Robbins, who ran federal sales), among others, in a supercool tent event set up adjacent to our supercool new buildings, a campus known today as the GooglePlex. Still, it could have been worse.

Bad marketing? I have to say, the NUMAflex messaging was a little off. Not entirely my fault, but by the time everyone had their say, “NUMAflex” represented a suite of tangentially related benefits, only one of which was performance. Still, it could have been worse.

And the launch? What a success. We had enough sales already booked to make our next two quarters. That’s when different culprits got us: operations and plain old bad luck.

We’d been shipping O3Ks for only a few weeks – weren’t even at full speed yet – when we hit a part outage. It was the ceramic packaging for the chips. We had a single-source supplier, IBM, and there was no ceramic packaging to be had. We went on stop-ship, with the whole backlog of new orders waiting to be fulfilled. It took months. By the time we started the line again, a lot of those sales were gone.

No matter, we had the second punch coming, our first systems based on Itanium chips. First-generation Itanium chips. Anyone here remember Merced? Intel delayed it, then delayed it again, then pulled the plug on it. There went another batch of orders. We retooled for McKinley, also delayed.

We finally got our first Itanium and Linux-based Altix 3000 systems out at LinuxWorld in January 2003. It was a great product, and if I do say so myself, marketing was really ticking. The press was good, we won awards, and everyone knew what we were doing and what the value proposition was. It was just a tough sale.

By 2002, x86 clusters were the norm. Our standard pitches that explained why you shouldn’t convert your code to MPI were moot. The codes were MPI, and they weren’t coming back. Furthermore, Itanium had been marginalized in the market – SGI picked the wrong chip. Back to the drawing board.

The rest is not as interesting, and it’s more recent, so you remember it. SGI started working on clusters. Bishop was out, replaced by Dennis McKenna, who guided the company through its first bankruptcy, then Bo Ewald. Under Ewald, SGI purchased the remains of Linux Networx and integrated the software into its own cluster stack, but it was too little too late. Customers had been hurt too many times, and they weren’t coming back.

The last five years have been a grind for SGI, and many of us – especially us former employees with purple blood – silently rooted for the company to succeed. “Not dead yet” was the constant diagnosis. Until this month. Now SGI is gone.

What happened? As with any prolonged misery, a lot of things went wrong. If you want my opinion, look at the things we did wrong, and the few we did right, in the six-year span from 1997 to 2003.

But forget about that now. Remember the good times. Remember what you loved about the Gee-Whiz Company. Tell a friend your favorite SGI story, and then listen to someone else’s. Have an SGI-style TGIF beer bash and raise a glass to a Silicon Valley icon.

And then bury it, ’cause it’s gone.
