April 10, 2009
Do you know the saddest part of Rackable’s attempt to acquire SGI’s assets out of bankruptcy for a paltry $25 million in cash? The deal will have very little effect on the HPC market. Thanks to a cascading series of problems -- bad marketing, operational misfires, bad technology bets, and ordinary bad luck -- SGI didn’t have much left to lose.
It’s been a long ride down for SGI, and now that it’s finally over, it comes almost as a relief, like when a beloved relative who has been suffering in pain for years with a terminal disease, a mere shadow of their former self, finally passes. It gives us one last chance to light candles, join hands, and remember an icon that once exemplified Silicon Valley greatness.
The industry is full of opinions on what went wrong with SGI, and I’ve got my stories too. Before becoming an HPC industry analyst, I worked at SGI myself for a little over six years in product management and product marketing roles in SGI’s server group.
When I joined Silicon Graphics in August 1997, the company appeared on the cover of BusinessWeek magazine. No, not that cover story. People remember SGI being anointed “the Gee-Whiz Company” by BusinessWeek in 1994, but some forget that the same publication announced the company’s downfall only three years later in “The Sad Saga of Silicon Graphics.” Marketing was to blame, along with reckless spending; that much was common knowledge. On both these points, I’d have to agree.
I didn’t witness the most reckless of spending myself -- Silicon Graphics’ final infamous Winterfest party, with its headliners, revelry, and raucous lip-synch contests, was just before my time -- but I can tell you what marketing was doing about our position in HPC. We were killing it as fast as ever we could.
Silicon Graphics’ technology leadership in workstations and servers was widely acknowledged, but our executive leadership suffered from a collective inferiority complex when compared to Sun, Silicon Graphics’ twin sister. (Both companies were founded in the same year, with neighboring headquarters, and they competed in several markets.) Sun was flying high with enterprise servers based on SPARC and Solaris, and we wanted to catch up. The training I was given that first summer directed me to assume, as an inalienable truth, that the market didn’t care about performance. We had to beat Sun with enterprise features, with an emphasis on RAS (reliability, availability, serviceability). I was the IRIX product manager at the time. I could probably still whiteboard the “Cellular IRIX” presentation for you.
You have to give CEO Ed McCracken credit; he wanted the company to be responsive to customer needs, and he listened to marketing. Marketing just blew it. We told engineering to back off on performance and give us RAS. We made “Sun Sucks” bandit videos and rewrote the marketing literature. Instead of taking on Alpha in HPC, we took on Solaris in enterprise. We were destined to fail.
Interestingly, one of the main platforms we competed against, the Sun Ultra Enterprise 10000, was based on technology we had sold to Sun. I hold the minority opinion here: I don’t think we should have kept the technology for ourselves. I just think we should have gotten more money for it. Probably a lot more.
I should also mention Cray at this point. We owned them, and I worked in the server group. I have to say, I really didn’t know those guys back then. Again, it’s not the purchase I had a problem with, but rather the implementation. We never integrated those teams. The eventual planned merger of the roadmaps got delayed more than a few times. As of this writing, it is slated to come to market in SGI’s Ultraviolet product, which may now never see the light of day.
The next chapter is well-documented. McCracken was out, and our new CEO was Rick Belluzzo, who was convinced that volume was the key to success. He put Silicon Graphics’ resources into souped-up PCs -- I was still working on servers, trying to sell our customers IRIX-based systems as Belluzzo told the world we wouldn’t invest in them -- and he boldly announced the plan to the world. He abruptly quit weeks later. He was not popular around the water cooler after that.
Then came Bob Bishop, and I have to tell you, things started to get better, at least with strategy and marketing. It was 1999 when he took over. Bishop loved the big systems, and he brought us back to an HPC strategy. We built excitement around our roadmap. And we made it to what was supposed to be a critical turning point: the launch of Origin 3000 at the end of June 2000.
Reckless spending? Yeah, there was some. The annual sales conference, where we did the internal launch, featured magicians Penn and Teller and motivational speaker Tony Robbins (not to be confused with his namesake, the Anthony Robbins who ran SGI’s federal sales), among others, in a supercool tent event set up adjacent to our supercool new buildings, a campus known today as the Googleplex. Still, it could have been worse.
Bad marketing? I have to say, the NUMAflex messaging was a little off. Not entirely my fault, but by the time everyone had their say, "NUMAflex" represented a suite of tangentially related benefits, only one of which was performance. Still, it could have been worse.
And the launch? What a success. We had enough sales already booked to make our next two quarters. That's when different culprits got us: operations and plain old bad luck.
We'd been shipping O3Ks for only a few weeks -- we weren't even at full speed yet -- when we hit a parts outage. It was the ceramic packaging for the chips. We had a single-source supplier, IBM, and there was no ceramic packaging to be had. We went on stop-ship, with the whole backlog of new orders waiting to be fulfilled. It took months. By the time we started the line again, a lot of those sales were gone.
No matter, we had the second punch coming, our first systems based on Itanium chips. First-generation Itanium chips. Anyone here remember Merced? Intel delayed it, then delayed it again, then pulled the plug on it. There went another batch of orders. We retooled for McKinley, also delayed.
We finally got our first Itanium- and Linux-based Altix 3000 systems out at LinuxWorld in January 2003. It was a great product, and if I do say so myself, marketing was really clicking. The press was good, we won awards, and everyone knew what we were doing and what the value proposition was. It was just a tough sale.
By 2002, x86 clusters were the norm. Our standard pitches that explained why you shouldn't convert your code to MPI were moot. The codes were MPI, and they weren't coming back. Furthermore, Itanium had been marginalized in the market -- SGI had picked the wrong chip. Back to the drawing board.
The rest is not as interesting, and it's more recent, so you remember it. SGI started working on clusters. Bishop was out, replaced by Dennis McKenna, who guided the company through its first bankruptcy, then Bo Ewald. Under Ewald, SGI purchased the remains of Linux Networx and integrated the software into its own cluster stack, but it was too little too late. Customers had been hurt too many times, and they weren't coming back.
The last five years have been a grind for SGI, and many of us -- especially those of us former employees with purple blood -- silently rooted for the company to succeed. "Not dead yet" was the constant diagnosis. Until this month. Now SGI is gone.
What happened? As with any prolonged misery, a lot of things went wrong. If you want my opinion, look at the things we did wrong, and the few we did right, in the six-year span from 1997 to 2003.
But forget about that now. Remember the good times. Remember what you loved about the Gee-Whiz Company. Tell a friend your favorite SGI story, and then listen to someone else's. Have an SGI-style TGIF beer bash and raise a glass to a Silicon Valley icon.
And then bury it, 'cause it's gone.
Posted by Addison Snell - April 10, 2009 @ 9:54 AM, Pacific Daylight Time
Addison Snell is the CEO of Intersect360 Research and a veteran of the high performance computing industry. During his tenure, he has established Intersect360 Research as a premier source of market information, analysis and consulting.