HPC Matters is a joint blog consisting of contributors from the Tabor Communications team on their observations and insights into HPC matters.
April 10, 2009
Do you know what the saddest part is of Rackable’s attempt to acquire SGI’s assets out of bankruptcy for a paltry $25 million in cash? There would be very little effect on the HPC market as a result. Thanks to a cascading series of problems -- bad marketing, operational misfires, bad technology bets, and ordinary bad luck -- SGI didn’t have much left to lose.
It’s been a long ride down for SGI, and now that it’s finally over, it comes almost as a relief, like when a beloved relative who has been suffering for years with a terminal disease, a mere shadow of their former self, finally passes. It gives us one last chance to light candles, join hands, and remember an icon that once exemplified Silicon Valley greatness.
The industry is full of opinions on what went wrong with SGI, and I’ve got my stories too. Before becoming an HPC industry analyst, I worked at SGI myself for a little over six years in product management and product marketing roles in SGI’s server group.
When I joined Silicon Graphics in August 1997, the company appeared on the cover of BusinessWeek magazine. No, not that cover story. People remember SGI being anointed “the Gee-Whiz Company” by BusinessWeek in 1994, but some forget that the same publication announced the company’s downfall only three years later in “The Sad Saga of Silicon Graphics.” Marketing was to blame, along with reckless spending; that much was common knowledge. On both these points, I’d have to agree.
I didn’t witness the most reckless of spending myself -- Silicon Graphics’ final infamous Winterfest party, with its headliners, revelry, and raucous lip-synch contests, was just before my time -- but I can tell you what marketing was doing about our position in HPC. We were killing it as fast as ever we could.
Silicon Graphics’ technology leadership in workstations and servers was widely acknowledged, but our executive leadership suffered from a collective inferiority complex when compared to Sun, Silicon Graphics’ twin sister. (Both companies were founded in the same year, with neighboring headquarters, and they competed in several markets.) Sun was flying high with enterprise servers based on SPARC and Solaris, and we wanted to catch up. The training I was given that first summer directed me to assume, as an inalienable truth, that the market didn’t care about performance. We had to beat Sun with enterprise features, with an emphasis on RAS (reliability, availability, serviceability). I was the IRIX product manager at the time. I could probably still whiteboard the “Cellular IRIX” presentation for you.
You have to give CEO Ed McCracken credit; he wanted the company to be responsive to customer needs, and he listened to marketing. Marketing just blew it. We told engineering to back off on performance and give us RAS. We made “Sun Sucks” bandit videos and rewrote the marketing literature. Instead of taking on Alpha in HPC, we took on Solaris in enterprise. We were destined to fail.
Interestingly, one of the main platforms we competed against, the Sun Ultra Enterprise 10000, was based on technology we had sold to Sun. I’m in the minority opinion in that I don’t think we should have kept the technology for ourselves. I just think we should have gotten more money for it. Probably a lot more.
I should also mention Cray at this point. We owned them, and I worked in the server group. I have to say, I really didn’t know those guys back then. Again, it’s not the purchase I had a problem with, but rather the implementation. We never integrated those teams. The eventual planned merger of the roadmaps got delayed more than a few times. As of this moment, it is slated to come to market in SGI’s Ultraviolet product, which may now never see the light of day.
The next chapter is well-documented. McCracken was out, and our new CEO was Rick Belluzzo, who was further convinced that volume was the key to success. He put Silicon Graphics’ resources on souped-up PCs -- I was still working on servers, trying to sell our customers IRIX-based systems as Belluzzo told the world we wouldn’t invest in them -- and he boldly announced the plan to the world. He abruptly quit weeks later. He was not popular around the water cooler after that.
Then came Bob Bishop, and I have to tell you, things started to get better, at least with strategy and marketing. It was 1999 when he took over. Bishop loved the big systems, and he brought us back to an HPC strategy. We built excitement in our roadmap. And we made it to what was supposed to be a critical turning point: the launch of Origin 3000 at the end of June 2000.
Reckless spending? Yeah, there was some. The annual sales conference, where we did the internal launch, featured magicians Penn and Teller and motivational speaker Tony Robbins (not to be confused with namesake Anthony Robbins, who ran federal sales), among others, in a supercool tent event set up adjacent to our supercool new buildings, a campus known today as the GooglePlex. Still, it could have been worse.
Bad marketing? I have to say, the NUMAflex messaging was a little off. Not entirely my fault, but by the time everyone had their say, "NUMAflex" represented a suite of tangentially related benefits, only one of which was performance. Still, it could have been worse.
And the launch? What a success. We had enough sales already booked to make our next two quarters. That's when different culprits got us: operations and plain old bad luck.
We'd been shipping O3Ks for only a few weeks -- weren't even at full speed yet -- when we hit a part outage. It was the ceramic packaging for the chips. We had a single-source supplier, IBM, and there was no ceramic packaging to be had. We went on stop-ship, with the whole backlog of new orders waiting to be fulfilled. It took months. By the time we started the line again, a lot of those sales were gone.
No matter, we had the second punch coming, our first systems based on Itanium chips. First-generation Itanium chips. Anyone here remember Merced? Intel delayed it, then delayed it again, then pulled the plug on it. There went another batch of orders. We retooled for McKinley, also delayed.
We finally got our first Itanium and Linux-based Altix 3000 systems out at LinuxWorld in January 2002. It was a great product, and if I do say so myself, marketing was really ticking. The press was good, we won awards, and everyone knew what we were doing and what the value proposition was. It was just a tough sale.
By 2002, x86 clusters were the norm. Our standard pitches that explained why you shouldn't convert your code to MPI were moot. The codes were MPI, and they weren't coming back. Furthermore, Itanium had been marginalized in the market -- SGI picked the wrong chip. Back to the drawing board.
The rest is not as interesting, and it's more recent, so you remember it. SGI started working on clusters. Bishop was out, replaced by Dennis McKenna, who guided the company through its first bankruptcy, then Bo Ewald. Under Ewald, SGI purchased the remains of Linux Networx and integrated the software into its own cluster stack, but it was too little too late. Customers had been hurt too many times, and they weren't coming back.
The last five years have been a grind for SGI, and many of us -- especially us former employees with purple blood -- silently rooted for the company to succeed. "Not dead yet" was the constant diagnosis. Until this month. Now SGI is gone.
What happened? As with any prolonged misery, a lot of things went wrong. If you want my opinion, look at the things we did wrong, and the few we did right, in the six-year span from 1997 to 2003.
But forget about that now. Remember the good times. Remember what you loved about the Gee-Whiz Company. Tell a friend your favorite SGI story, and then listen to someone else's. Have an SGI-style TGIF beer bash and raise a glass to a Silicon Valley icon.
And then bury it, 'cause it's gone.
Posted by Addison Snell - April 10, 2009 @ 9:54 AM, Pacific Daylight Time
Addison Snell is the CEO of Intersect360 Research and a veteran of the high performance computing industry. During his tenure, he has established Intersect360 Research as a premier source of market information, analysis and consulting.