Handicapping IBM/OpenPOWER’s Odds for Success

By John Russell

January 19, 2016

2016 promises to be pivotal in the IBM/OpenPOWER effort to claim a non-trivial chunk of the Intel-dominated high-end server landscape. Big Blue’s stated goal of 20-to-30 percent market share is huge. Intel currently enjoys 90-plus percent share and has seemed virtually unassailable. In an ironic twist on the old mantra ‘no one ever got fired buying IBM,’ it could be that careers at Big Blue rise or fall based upon progress.

It has been just two years since IBM (NYSE: IBM), NVIDIA (NASDAQ: NVDA), Mellanox (NASDAQ: MLNX), Tyan, and Google (NASDAQ: GOOG) co-founded the OpenPOWER Foundation (Dec. 2013) to build an ecosystem around the IBM Power processor and challenge Intel. At virtually the same time, IBM announced plans to jettison the remainder of its x86 business (servers) by selling it to Lenovo, which had already acquired IBM’s PC business (2005). The $2.1 billion deal closed late in the year. At the time, IBM’s share of the HPC server market was roughly 23 percent. Today, it’s closer to five percent.[i]

IBM is making a staggering bet. Setting risk aside, much has been accomplished. OpenPOWER has grown to more than 170 members in more than 22 countries. A licensable reference architecture processor has been created. Acceleration enabling technologies have been aggressively incorporated. On the order of 25 OpenPOWER solutions are in various stages of being brought to market.

“The timing is right,” says Addison Snell, CEO of Intersect360 Research. “After roughly 20 years of clusters based on the ‘Beowulf’ model, in which standardization and portability were primary goals, the HPC industry is migrating back toward an era of specialization. Even within the envelope of Intel x86 innovation, end users are looking at three primary options, Xeon, Xeon Phi as a co-processor, and Xeon Phi as a standalone microprocessor. And that’s before considering whether FPGAs acquired from Altera or even Intel Atom processors (competing with ARM) are part of the equation. End users are already evaluating a multitude of processing alternatives, which gives OpenPOWER an opportunity.”

For a long time, Intel’s x86 architecture has basically owned the market. It dwarfs everyone else. The entry of IBM and OpenPOWER sets up a potentially grand struggle between two contrasting views of technological progress and business opportunity. Both camps agree the age of accelerated/manycore computing is here, but they differ fundamentally on the path forward.

IBM argues that Intel’s one-size-fits-all approach – consolidating devices and functions into a ‘single’ piece of silicon – actually stifles innovation compared to an ecosystem in which diverse technology partners collaborate, each beavering away on its own unique ideas for delivering the best technology solutions (acceleration, networking, storage, programming, et al.).

Intel’s position is that Moore’s law is hardly dead. In fact, the company says Moore’s law and HPC form a virtuous circle, each powering the other forward (See HPCwire article, Moore’s Law – Not Dead – and Intel’s Use of HPC to Keep it Alive). Moreover, Intel contends the coalescing of functions on silicon is not merely more elegant, but ultimately higher performing and cheaper.

Brad McCredie, vice president of IBM Power Systems Development and until recently president of the OpenPOWER Foundation, says “The appetite for compute and acceleration is going to far outstrip [silicon scaling] before we’re going to say the accelerator is going to go by way of the Southbridge and Northbridge switch chip which all got sucked into the CPU die.” He further suggests that Intel’s manufacturing business model actually requires this on-silicon consolidation and a “closed system” approach to grow profits and rebuff competition.

No doubt the constant anti-Intel drumming emanating from IBM is intended to reinforce the idea that another choice in the market would be good, that Intel’s overwhelming dominance is bad, and that IBM cum partners has sufficient strength and technology acumen to mount such a challenge. Skeptics respond IBM has no other realistic route given Intel’s head start in the high-end server market and dominance in processors. Maybe it doesn’t matter. This is capitalism after all.


Much more interesting and important is how the struggle eventually plays out. Ken King, a 30-year-plus IBM veteran and general manager, OpenPOWER Alliances, and McCredie recently laid out the IBM strategy in a meeting with HPCwire editors. Discussion covered IBM’s embrace of the accelerated computing paradigm, its view of how high-end server market dynamics, particularly technology consumption patterns, are changing, and Big Blue’s strategy for reinventing itself and challenging Intel dominance.

Getting Moore’s Law Off Life Support?
“People say Moore’s law is dead. The facts are it’s declining,” says King. “You are no longer seeing the 2x gains every 18 months so you’re not going to get the value from just the silicon. From our perspective the biggest factor that is going to address that [challenge] is accelerators. We see accelerated computing as the new normal – the ability to effectively integrate CPUs with GPUs and FPGAs to accelerate processing throughout the entire system (networking, storage, etc) and with an emphasis on processing data where it resides versus having to move the data to the compute.”

This diverse and widespread implementation of acceleration technology is what’s critical to improving performance and putting Moore’s law back on that trajectory in a way that’s not just pure silicon, says King, adding “that’s the critical infrastructure for tomorrow’s economy.”

Cognitive computing will be the driver. “We moved from the internet era to the early stages of the cloud era – there’s still a lot to go – but the next era, just starting to formulate, is the cognitive era where industries will be transformed by being able to leverage cognitive computing to change their business model,” he says.

Data – lots of it – is the fuel, says King. Science, industry, government, and virtually every other segment of society are generating a treasure trove of data that cognitive computing can transform into insight and action. Acceleration is the engine, says King, citing two examples in medical applications that use different acceleration technologies:

  • IBM Watson Medical Health. Recently accelerated with GPUs, the IBM Watson cognitive platform has sped up its ‘rank and tree retrieval’ capabilities nearly 2X versus non-accelerated computers. Expectations are high for Watson Medical Health, already used extensively in sifting and interpreting clinical records and genomics research.
  • Edico Genome. DNA sequencing is notoriously tough on general purpose CPUs. Edico’s FPGA-accelerated DRAGEN processor board has been put into use at The Genome Analysis Center (TGAC), where it mapped the ash tree genome 177 times faster per processing core than TGAC’s local HPC systems, requiring only seven minutes instead of three hours on one of the larger datasets (see HPCwire article, TGAC Unleashes DRAGEN to Accelerate Genomics Workflows).

“I can go industry by industry showing how cognitive computing assisted by accelerated computing infrastructure will be transformative. Silicon is not going to do it by itself anymore,” says King.

Importantly, says McCredie, the right approach to acceleration will vary. “Genomics is looking good with FPGAs but it is going to be hard to argue that GPUs aren’t the way to go for deep learning. If you look at machine learning, that [also] has some pretty good power performance opportunities for FPGAs.”

If accelerated computing does end up requiring flexible approaches to satisfy varying cost/performance issues, OpenPOWER has taken steps to assemble the needed technologies. GPU pioneer NVIDIA, of course, is an OpenPOWER founding member, as is high performance interconnect specialist Mellanox. Last November, FPGA supplier Xilinx (NASDAQ: XLNX) joined OpenPOWER and signed a multi-year deal with IBM. In December, FPGA board specialist BittWare joined OpenPOWER.


McCredie snipes, “You could argue Intel has figured this out too and endorsed it by their $16.7B acquisition of Altera, but it’s a different model. They are integrating Altera in a way where it is going to be a one size fits all approach.” That won’t work well moving forward, he argues, “Now, we are going to have to build systems with this or that kind of accelerator best suited (cost/performance) to the use…[but] I will take everything I just said back if there is disruptive technology.”

Snell says, particularly in the traditional HPC market, “The biggest advantage of OpenPOWER is its lead in accelerated computing, thanks to NVIDIA Tesla and CUDA. Another recent Intersect360 Research study showed that 34 of the top 50 HPC applications currently offer some level of GPU support or acceleration.

“The biggest open question is how this will evolve. Can end users continue to leverage their work on NVIDIA GPUs on future generations of Intel-based servers? How would technologies like CAPI and NVLink get incorporated? If Intel does not incorporate these technologies in some optimized fashion, it could push end users onto OpenPOWER to protect their GPU investments.”

HPC Market Undergoes Redefinition
Leaving the sudden emergence of disruptive technology aside and assuming moderate technical comparability between the two camps’ products, IBM’s and OpenPOWER’s remaining hurdle is executing a successful go-to-market strategy: Who is going to build to the OpenPOWER spec – besides IBM – and source IBM Power8 processors? Who is going to buy the systems? To what extent will homegrown components and systems from China become a competitive wildcard?

IBM has certainly tried to think things through here, and has articulated a view of a market that is becoming more nuanced and dynamic. There will be increasing overlap among traditional buyers and sellers, says King, as technology consumption models shift. (In particular, think hyperscale datacenters, ODMs, and even big vertical players such as those in financial services.)

Today, Big Blue breaks the high-end server market into three distinct pieces – traditional HPC, hyperscale datacenter providers, and large enterprise verticals (financial services, for example). A major differentiator among them, emphasizes McCredie, is their varying technology ‘consumption’ models, which in turn influence the sales channel preferences and product configurations sought.

“The consumption model is so heavily tied to the particular set of skills you’ve invested in and developed over time,” says McCredie. “If you look at the skills the ‘hyperscales’ have invested in and developed, they are able to consume and like to consume differently than the classic enterprise whose skills evolved differently and HPC as well; one is programming-rich capable, one is admin-rich capable, and one is actually pretty technology capable. They all consume differently.”

Looking back, says McCredie, “Nobody ever came to us and said you guys don’t have good technology. We hear a lot of things; we don’t ever hear that. But our technology, until we did OpenPOWER, was completely unconsumable by important segments of the market.”

IBM has been aggressively adapting to make Power-based products easier to consume. “It wasn’t like I had to go back and redesign chips in the hyperscale market. We did have to go back and make a new open firmware stack, they weren’t going to take a half a billion lines of firmware, 99 percent of which they didn’t give a hoot about. So we did make a new firmware stack and we did create some new technology but mostly we just shifted how it was consumed,” says McCredie.

King adds quickly, “Google and Rackspace (NYSE: RAX) are eating that up.”

By its nature the OpenPOWER ecosystem should provide the flexibility needed to satisfy varying consumption models. Core technology providers – IBM, NVIDIA, Mellanox, Xilinx, etc. – collaborate closely to push device performance and interoperability. Systems suppliers – OEMs, ODMs, and even a few big users – can build systems according to the requirements of their target markets or their own internal needs.


“We want 20-30 percent market share. That’s a significant statement,” says King. “You’ve got the hyperscalers and we have to get a significant portion of those.”

No doubt, agrees Snell, “The hyperscale is a major wildcard here. Initiatives like Open Compute Project and Scorpio (“Beiji”) have been very inclusive of OpenPOWER and GPU computing, and some individual companies such as Google, Facebook (NASDAQ: FB), Microsoft (NASDAQ: MSFT), and Baidu (NASDAQ: BIDU) purchase enough infrastructure to set a market by themselves. (To get a sense of the market forces at play, note that both OCP and Scorpio have separately, and distinctly, redefined the rack height specified in a “U.”) If the hyperscale market demands a particular configuration, it will get it.”

IBM is interacting directly with hyperscalers, says King. “Some are happy to buy IBM’s LC line, maybe with some tweaks or maybe not. Others we’re going to design a model with them based on industry benchmarking and workload benchmarking and go to an ODM. Some will go even further and design everything and just tell the ODM what to manufacture.”

The point, says King, is that the model is flexible enough to enable that level of customization where required. “To deploy in volume is what’s critical. We’ve got to get penetration to a point where any counterattacks by our competitors don’t negatively impact our ability to be able to get to that level of market share that we are looking for,” he says.

That’s a tall order. One could argue the big hyperscalers have a bit more freedom to do as they will. Big OEMs and ODMs are more deeply entrenched in the x86 ecosystem and risk alienating Intel. Most have made the most tepid of public comments regarding OpenPOWER which can be neatly distilled down as: “Well, we’re always evaluating technology options; however we have a great relationship with Intel.”

Intel is the big dog and worthy of fear. It has been mostly silent on the IBM and OpenPOWER challenge – there’s really no upside for public bashing. That said, Intel has a reputation for never being afraid of a little customer arm-twisting with regard to supply, pricing, and early access to emerging Intel technology.

Waiting for the BIG Deals
To date, IBM has achieved its initial goals with OpenPOWER. It has gained substantial market awareness, built out a robust stable of consortium members, and landed a yoke of high-profile wins with CORAL, says Snell. The next step is actually winning market share. “Intersect360 Research is presently conducting a deep-dive assessment of end user evaluation and impressions of the full panoply of processing alternatives, including POWER, GPU, Xeon, Xeon Phi, and others, and we will additionally gauge market penetration in our 2016 HPC Site Census survey. 20 percent to 30 percent is a lofty goal, and it will take time to see how long it will take to approach it, if IBM can at all,” Snell says.

The wait to see critical customer wins won’t be long, says King. IBM is actively engaged with 10-15 hyperscalers, he says. “It takes a while for a hyperscale, who’s got 98 to 100 percent of their datacenter being x86, to make a strategic change to add another platform in volume in their datacenters. A year ago I would have said we are trying to get the hyperscales interested; now they are all engaged, not just interested, engaged and actually working with us to figure out what are the right workloads to put Power on and when do they start that deployment and what’s their model for deployment or consumption. I can tell you who has an ODM ready, who doesn’t, who’s going to buy directly, so definitely significant progress.”

In the enterprise, King says very big companies are also looking at different consumption models. “Not exactly what the hyperscales are doing but some that are part of the open compute community are starting to look at if there is something similar they would do to the hyperscale community. That could be an interesting OpenPOWER market, besides just buying servers directly from IBM or our partners.”

King and McCredie say there are at least five to seven large enterprises looking at consuming OpenPOWER; several have Power systems inside now, but they are all also starting to stand up their own clouds. “What’s amazing is they are realizing, which is not a big secret in the industry, they are all competing against the big Internet datacenters and hyperscale guys in one way or another,” says King.

In the traditional HPC-consuming world, IBM’s strategy sounds like that of most of its brethren and can be boiled down to this: the Top500 and Linpack shouldn’t drive product development and are poor overall metrics; that said, establishing one’s place in the Top500 is important because the list is still closely watched by important buyers in government, academia, etc.

“We look at the success we had on CORAL and it’s because we did a lot of great work on real workloads not just a Linpack bid. On the other hand the world is right now starting to get competitive and the U.S. lock on the Top500 just isn’t there. You’ve got to go fix that and I think we have to help people fix that.”

One point Snell makes shouldn’t be forgotten: even if IBM is successful achieving its 20-30 percent market share goal by the end of the decade – an immense achievement for sure – “Intel would still have a dominant market share, while having successfully moved up the value chain with the incorporation of more technologies into its Scalable System Framework approach, and Intel could rebuild share from that position of strength.

“In the near term (2016, 2017), OpenPOWER should focus on its assets, particularly its leadership in GPU acceleration and data-centric computing. This battle will be played out in software more than in hardware, and OpenPOWER needs to build as much momentum as it can. IBM will need to see volume market penetration beginning in 2016, coupled with a few more high-profile wins, in order to be on track.”

UPDATED, Jan. 20: IBM released its full-year and latest quarterly results after this article was posted. Big Blue beat consensus analyst forecasts for earnings, but revenue slipped. Here’s an excerpt from IBM’s press release:

“We continue to make significant progress in our transformation to higher value. In 2015, our strategic imperatives of cloud, analytics, mobile, social and security grew 26 percent to $29 billion and now represent 35 percent of our total revenue,” said Ginni Rometty, IBM chairman, president and chief executive officer.  “We strengthened our existing portfolio while investing aggressively in new opportunities like Watson Health, Watson Internet of Things and hybrid cloud.  As we transform to a cognitive solutions and cloud platform company, we are well positioned to continue delivering greater value to our clients and returning capital to our shareholders.”

Fourth-quarter net income from continuing operations was $4.5 billion compared with $5.5 billion in the fourth quarter of 2014, down 19 percent.  Operating (non-GAAP) net income was $4.7 billion compared with $5.8 billion in the fourth quarter of 2014, down 19 percent.  The prior-year gain from the divestiture of the System x business impacted operating net income by 19 points.

Total revenues from continuing operations for the fourth quarter of 2015 of $22.1 billion were down 9 percent (down 2 percent adjusting for currency) from the fourth quarter of 2014. For the full results see: http://www.hpcwire.com/off-the-wire/24279/

Initial reaction in the media was mixed as indicated here:

Forbes.com: IBM Finally Beats Earnings Consensus Again In Q4, But Has It Turned A Corner?
Wall Street Journal.com: IBM Revenue Slides, but Cloud Business Grows
New York Times.com: IBM Reports Declines in Fourth-Quarter Profit and Revenue Despite Gains in New Fields

[i] IDC HPC Update presented at SC15
