Crystal Ball Gazing: IBM’s Vision for the Future of Computing

By John Russell

October 14, 2019

Dario Gil, IBM’s relatively new director of research, painted an intriguing portrait of the future of computing, along with a rough idea of how IBM thinks we’ll get there, at the MIT-IBM Watson AI Lab’s AI Research Week held at MIT last month. Just as Moore’s law, now fading, was always a metric with many ingredients baked into it, Gil’s evolving post-Moore vision is a composite view with multiple components.

“We’re beginning to see an answer to what is happening at the end of Moore’s law. It’s a question that has been [at] the front of the industry for a long, long time,” said Gil in his talk. “And the answer is that we’re going to have this new foundation of bits plus neurons plus qubits coming together, over the next decade [at] different maturity levels – bits [are] enormously mature, the world of neural networks and neural technology, next in maturity, [and] quantum the least mature of those. [It] is important to anticipate what will happen when those three things intersect within a decade.”

Dario Gil, IBM

Not by coincidence, IBM Research has made big bets in all three areas. Its neuromorphic chip (TrueNorth) and ‘analog logic’ research efforts (e.g., phase change memory) are vigorous. Given the size and scope of its IBM Q systems and Q Network, it seems likely that IBM is spending more on quantum computing than any other non-governmental organization. Lastly, of course, IBM hasn’t been shy about touting the Summit and Sierra supercomputers, now ranked one and two in the world (Top500), as the state of the art in heterogeneous computing architectures suited for AI today. In fact, IBM recently donated a two-petaflops system, Satori, to MIT that is based on the Summit design and well-suited for AI and hybrid HPC-AI workloads.

Gil was promoted to director of IBM Research last February and has begun playing a more visible role. For example, he briefed HPCwire last month on IBM’s new quantum computing center. A longtime IBMer (~16 years) with a Ph.D. in electrical engineering and computer science from MIT, Gil became the 12th director of IBM Research in its storied 74-year history. That IBM Research will turn 75 in 2020 is no small feat in itself. It has about 3,000 researchers at 12 labs spread around the world, with 1,500 of those researchers based at IBM’s Watson Research Center in N.Y. IBM likes to point out that its research army has included six Nobel prize winners, and the truth is IBM’s research effort dwarfs those of all but a few of the biggest companies.

In his talk at MIT, though thin on technical details for the future, Gil did a nice job of reprising recent computer technology history and current dynamics. Among other things, he looked at the basic idea of separating information – digital bits – from the things it represents, and at how, for a long time, that separation proved incredibly powerful in enabling computing. He then pivoted, noting that ultimately nature doesn’t seem to work that way and that for many problems, as Richard Feynman famously suggested, quantum computers based on quantum bits (qubits) are required. Qubits, of course, are intimately connected to “their stuff” and behave in probabilistic ways, as nature does. (Making qubits behave nicely has proven devilishly difficult.)

Pushing beyond Moore’s law, argued Gil, will require digital bits, data-driven AI, and qubits working in collaboration. Before jumping into his talk, it’s worth hearing his summary of why the pace of progress, even as experienced in Moore’s law’s heyday, would be a problem today. As you might guess, both flops performance and energy consumption are front and center, along with AI’s dramatically growing appetite for compute:

“What is the core of the issue? If you look at some very state of the art [AI] models, you can see some of the plot in terms of petaflops per day [consumed] for training from examples of recent research work [with AlexNet and AlphaGo Zero] as a function of time. One of the things we are witnessing is the compute requirement for training jobs is doubling every three and a half months. So we were very impressed with Moore’s law, doubling every 18 months, right? This thing is doubling every three and a half months. Obviously, it’s unsustainable. If we keep at that rate for sustained periods of time we will consume every piece of energy the world has just to do this. So that’s not the right answer,” said Gil.
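For perspective, a quick back-of-the-envelope calculation (ours, not Gil’s) converts those two doubling periods into annual growth factors; the gap between the two curves is the heart of his argument. A minimal Python sketch:

# Back-of-the-envelope comparison of doubling periods (illustrative only).
# A quantity that doubles every d months grows by a factor of 2**(12/d) per year.

def annual_growth_factor(doubling_period_months):
    return 2 ** (12.0 / doubling_period_months)

moore_pace = annual_growth_factor(18.0)      # classic Moore's-law cadence
training_demand = annual_growth_factor(3.5)  # training-compute trend Gil cites

print(f"18-month doubling (Moore's law): ~{moore_pace:.2f}x per year")
print(f"3.5-month doubling (AI training demand): ~{training_demand:.1f}x per year")
# Roughly 1.6x vs. ~11x per year -- demand far outruns the historical hardware pace.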

“There’s a dimension of [the solution] that has to do with hardware innovation and there’s another dimension that has to do with algorithmic innovation. So this is the roadmap that we have laid out in terms of the next eight years or so of how we’re going to go from digital AI cores [CPU plus accelerators] like we have today, based on reduced-precision architectures, to mixed analog-digital cores, to, in the future, perhaps, entirely analog cores that very efficiently implement the multiply-accumulate function inherent in these devices as we perform training.

“Even in this scenario, which is, you know, still going to require billions of dollars of investments and a lot of talent, the best we can forecast is about 2.5x improvement per year. That’s well short of doubling computing power every three and a half months, right? We have to deliver this for sure. But the other side of the equation is the work that you all do, and that is: we have got to dramatically improve the algorithmic efficiency of AI on the problems that we solve,” he said.
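The multiply-accumulate function Gil keeps returning to is the elementary arithmetic of neural network training and inference. The sketch below (our illustration, not IBM’s hardware design) shows the pattern today’s reduced-precision digital AI cores exploit: low-precision inputs with higher-precision accumulation.

import numpy as np

# Illustrative reduced-precision multiply-accumulate (MAC): inputs stored in
# float16, products accumulated in float32 -- a common pattern in
# reduced-precision digital AI cores. This is our sketch for illustration only.

rng = np.random.default_rng(0)
activations = rng.standard_normal(4096).astype(np.float16)
weights = rng.standard_normal(4096).astype(np.float16)

acc = np.float32(0.0)
for a, w in zip(activations, weights):
    acc += np.float32(a) * np.float32(w)   # multiply, then accumulate

reference = np.dot(activations.astype(np.float64), weights.astype(np.float64))
print(f"fp16 inputs, fp32 accumulation: {acc:.4f}")
print(f"fp64 reference dot product:     {reference:.4f}")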

Gil noted, for example, that a team of MIT researchers recently developed a technique for training video recognition models that is up to three times faster than current state-of-the-art methods. Their work will be presented at the upcoming International Conference on Computer Vision in South Korea, and a copy of their paper (TSM: Temporal Shift Module for Efficient Video Understanding) is posted on arXiv.org.

Top video recognition models currently use three-dimensional convolutions to encode the passage of time in a sequence of images, which creates bigger, more computationally intensive models. By mingling spatial representations of the past, present, and future, the new MIT model gets a sense of time passing without explicitly representing it, greatly reducing the computational cost. According to the researchers, it normally takes about two days to train such a powerful model on a system with one GPU. They borrowed time on Summit – not a luxury many have – and, using 256 nodes with a total of 1,536 GPUs, were able to train the model in 14 minutes (see the paper’s abstract[i] at the end of the article).
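The core trick is easy to sketch: shift a small fraction of each frame’s feature channels forward or backward in time, so an ordinary 2D convolution sees its neighbors’ information essentially for free. Below is a simplified NumPy illustration of that shift (ours, not the authors’ code; the function and parameter names are hypothetical):

import numpy as np

def temporal_shift(features, shift_fraction=0.125):
    """Shift a fraction of channels along the time axis (simplified TSM idea).

    features: array of shape (time, channels, height, width).
    One slice of channels takes its values from the next frame, another from
    the previous frame; vacated positions are zero-filled. The shift itself
    uses no multiplications and adds no parameters.
    """
    t, c, h, w = features.shape
    fold = int(c * shift_fraction)
    out = np.zeros_like(features)
    out[:-1, :fold] = features[1:, :fold]                  # pull information from the next frame
    out[1:, fold:2 * fold] = features[:-1, fold:2 * fold]  # pull information from the previous frame
    out[:, 2 * fold:] = features[:, 2 * fold:]             # most channels are left untouched
    return out

clip = np.random.rand(8, 64, 56, 56).astype(np.float32)   # 8 frames of feature maps
print(temporal_shift(clip).shape)                          # (8, 64, 56, 56): same size, zero extra parameters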

IBM has posted the video of Gil’s talk; it is fairly short (~30 minutes) and worth watching to get a flavor of IBM’s vision of the future of computing. A portion of Gil’s wide-ranging comments, lightly edited and with apologies for any garbling, and a few of his slides are presented below.

  1. CLASSICAL COMPUTING: HOW DID WE GET HERE?

“We’re all very familiar with the foundational idea of the binary digit and the bit, and this sort of understanding that we can look at information abstractly. Claude Shannon advocated the separation; [this] almost platonic idea of zeros and ones, decoupled from their physical manifestation, was an interesting insight. It’s actually what allowed us, for the first time in history, to look at the world and look at images as different as this, right – a punch card and DNA. [We’ve] come to appreciate that they have something in common: they’re both carriers and expressers of information.

“Now, there was another companion idea that was not theoretical in nature but practical, and that was Moore’s law. This is the re-creation of the original plot (see slide) from Gordon Moore, when he had four data points in the 1960s and the observation that the number of transistors you could fit per unit area was doubling every 18 months. Moore extrapolated that, and amazingly enough, that has happened right over 60 years, and not because it fell off a tree but thanks to the work of scientists and engineers. I always like to cite, just to give an example of the level of global coordination in R&D that is required, that $300 billion a year is what the world spends [to] move from node to node.

Recreation of the original four data points that led Intel co-founder Gordon Moore to postulate Moore’s law

“The result of that is we digitize the world, right? Essentially, bits have become free, and the technology is extraordinarily mature. A byproduct of all of this is that there’s a community of over 25 million software developers around the world that now have access to digital technology, creating and innovating, and that is why software has become like the fabric that binds business and institutions together. So it’s very, very mature technology. We are of course pushing the limits. It turns out you need 12 atoms – magnetic atoms – to store a piece of information. In the end, there is a limit [set by] the physical properties. So we also need to explore alternative ways to represent information in richer and more complex ways.

“We have seen a consequence of what I was talking about with Moore’s law and the fact that devices did not get better after 2003 as we scaled them; there was a set of architectural innovations the community responded with. One was the idea of multi-cores, right, adding more cores in a chip. But also [there] was the idea of accelerators of different forms; we knew that a form of specialization in computing architecture was going to be required to be able to adapt and continue the evolution of computing.

Using Summit and Sierra as an example: “Every once in a while [it’s] useful to stop and look back at the numbers and reflect, right? It is kind of mind-blowing that it’s possible to build these kinds of systems with the reliability we see. Architecturally, what you see here is that you’re bringing this blend between a large number of accelerators and a large number of CPUs. And you must create system architectures with high-bandwidth interconnect, because you must keep the system utilization really, really high. So this is important, and it’s illustrative of what the future is going to be, combining sort of this bit- and neural-based architectures.”

 

  2. AI: ALGORITHM PROGRESS & NEW HARDWARE NEEDED

“There’s been another idea that has been running for well over a century now, which is the intersection of the world of biology and information. Santiago Ramón y Cajal, at the turn of the 1900s, was among the first to understand that we have these structures in our brain called neurons, and the linkage between these neural structures and memory and learning. It wasn’t with a whole lot more than this biological inspiration that, starting in the 1940s and 50s and of course up to today, we saw the emergence of artificial neural networks that took loose inspiration from the brain. What has happened over the last six years, in terms of this intersection between the bit revolution and the consequence of digitizing the world and the associated computing revolution, [is] we now have big enough computers to train some of these deep neural networks at scale.

“We have been able to demonstrate [that] fields that have been with us for a long time, like speech recognition and language processing, have been deeply impacted by this approach. We’ve seen the accuracy of these environments really improve, but we’re still in this narrow AI domain.

“I mean, the term AI [is] a mixed blessing, right? It’s a fascinating scientific and technological endeavor. But it’s a scary term for society. And when we use the word AI, we often are speaking past each other. We mean very different things when we say those words. So one useful thing is to add an adjective in front of it. Where we are really today is that a narrow form of AI has begun to work; that’s a far cry from a general form of AI being present. And we’re seeing dates here, we don’t know when that’s going to happen. You know, my joke on this, when we put things like 2050 (see slide) – when scientists put numbers like that, what we really mean is we have no idea, right?

“So the journey is to take advantage of the capability that we have today and to push the frontier and boundary towards broader forms of AI. We are passionate advocates, within IBM and the collaborations we have, of bringing together the strengths and the great traditions within the field of AI – neuro-symbolic systems. As profound and as important as the advancements we are seeing in deep learning are, we have to combine them with knowledge representation and forms of reasoning, and bring those together so that we can build systems capable of performing more tasks in more domains.

“Importantly, as technology gets more powerful, the dimension of trust becomes more essential to fulfill the potential of these advancements and get society to adopt them. How do we build the trust layer and the whole AI process around explainability and fairness and the security of AI, and the ethics of AI, and the entire engineering lifecycle of models? In this journey of neural-symbolic AI, I think it’s going to have implications at all layers of the stack.

 

  3. SEPARATING IT FROM PHYSICALITY – NOT IN QUANTUM

“In the same way that I was alluding to this intersection of mathematics and information as the world of classical bits, and that biology and information gave us the inspiration for neurons, it is physics and information coming together that is giving us the world of qubits. [T]here were physicists asking questions about the world of information, and it was very interesting. They would ask questions like “Is there a fundamental limit to the energy efficiency of computation?” Or “Is information processing thermodynamically reversible?” The kinds of questions only physicists would have, right?

“Looking at that world and sort of pulling at that thread and this assumption that Shannon gave us of separating information and physics – Shannon says, ‘Don’t worry about that coupling’ – they actually poked at the question of whether that was true or not. We learned that the foundational information block is actually not the bit, but something called the qubit, short for quantum bit, and that we could express some fundamental principles of physics in this representation of information. Specifically for quantum computing, three ideas – the principle of superposition, the principle of entanglement, and the idea of interference – actually have to come together for how we represent and process information with qubits.

“The reason why this matters is we know there are many classes of problems in the world of computing and the world of information that are very hard for classical computers, and that in the end, we’re [in classical computing] bound to things that don’t blow up exponentially in the number of variables. [A] very famous example of a thing that blows up exponentially in the number of variables is simulating nature itself. That was the original idea of Richard Feynman when he advocated the fact that we needed to build a quantum computer, or a machine that behaved like nature, to be able to model nature. But that’s not the only problem in the realm of mathematics. We know other problems that also have that character. Factoring is an example. The traveling salesman problem, optimization problems – there’s a whole host of problems that are intractable with classical computers, and the best we can do is approximate them.
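One way to see the blow-up Gil describes: exactly simulating a general n-qubit system on a classical machine means tracking 2^n complex amplitudes, so memory doubles with every added qubit. A quick illustration (ours, not from the talk):

# Memory needed to store a full state vector of n qubits, assuming one
# complex128 amplitude (16 bytes) per basis state. An illustration of the
# exponential blow-up Gil describes, not a claim about any specific machine.

BYTES_PER_AMPLITUDE = 16

for n in (10, 20, 30, 40, 50):
    n_bytes = (2 ** n) * BYTES_PER_AMPLITUDE
    print(f"{n:2d} qubits -> {n_bytes:>20,d} bytes")

# 30 qubits already need ~16 GiB; 50 qubits need roughly 18 petabytes.
# Each added qubit doubles the requirement.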

“Now, quantum is not going to solve all of them. There is a subset of them that it will be relevant for, but it’s the only technology we know of that alters that equation of something going from intractable to tractable. And what is interesting is we find ourselves in a moment like 1944, [when we built] what is arguably the first digital programmable computer. In a similar fashion now, we have built the first programmable quantum computers. This is just a recent event; it just happened in the last few years. So, in fact, in the last few years, we’ve gone from those kinds of laboratory environments to building the first engineered systems that are designed for reproducible and stable operation. There’s a picture of the IBM Q System One, one that sits in Yorktown.

“What I really love about what is happening right now is you can [using the IBM Q Network] sit in front of any laptop anywhere in the world and write a program now, and it takes those zeros and ones coming in from your computer. In our case we use superconducting technology, converting them to microwave pulses at about five gigahertz; [the signal] travels down the cryostat over superconducting coaxial cables, which operate at 50 millikelvin. Then we’re able to perform the superposition and entanglement and interference operations in a controlled fashion on the qubits, get the microwave signal readout, convert it back to zeros and ones, and present an answer back. It’s a fantastic scientific and engineering tour de force.
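For readers who want to try this from a laptop, the sketch below shows roughly what such a program looks like using IBM’s open-source Qiskit toolkit (an illustrative example of ours; exact API details vary by Qiskit version). It prepares a two-qubit entangled state using exactly the superposition and entanglement operations Gil describes, then measures it back into ordinary bits:

from qiskit import QuantumCircuit

# A two-qubit Bell-state circuit: a Hadamard gate puts qubit 0 into
# superposition, a CNOT entangles it with qubit 1, and the measurements
# read the result back out as ordinary bits.
qc = QuantumCircuit(2, 2)
qc.h(0)                      # superposition
qc.cx(0, 1)                  # entanglement
qc.measure([0, 1], [0, 1])

print(qc.draw())             # text diagram of the circuit

# Running it on a simulator (or real IBM hardware) takes a few more lines,
# e.g. with the Aer simulator package installed:
#   from qiskit_aer import AerSimulator
#   counts = AerSimulator().run(qc, shots=1024).result().get_counts()
# Ideally the counts split between '00' and '11' -- the signature of entanglement.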

“Since we put the first system online, we now have over 150,000 users who are learning how to program these quantum computers and run programs, and there have been over 200 scientific publications generated with these environments. It’s the beginning of – I’m not going to say a new field, the field of quantum computing has been with us for a while – but it’s the beginning of a totally new community, a new paradigm of computation that is coming together. One of the things is we gave access to both a simulator and the actual hardware, and now it has crossed over; right now what people really want is access to the real hardware to be able to solve these problems.

 

  4. TRIUMPHANT THREESOME: WHAT WILL WE DO NEXT?

“So let me bring it to a close and make an argument that finally we’re beginning to see an answer to what is happening at the end of Moore’s law. It’s a question that has been [at] the front of the industry for a long, long time. And the answer is that we’re going to have this new foundation of bits plus neurons plus qubits coming together, over the next decade [at] different maturity levels – bits [are] enormously mature, the world of neural networks and neural technology, next in maturity, [and] quantum the least mature of those. [It] is important to anticipate what will happen when those three things intersect within a decade.”

“I think the implications [this] will have for intelligent, mission-critical applications for the world of business and institutions, and the possibilities to accelerate discovery, are so profound. Imagine the discovery of new materials, which is going to be so important to the future of this world in the context of global warming and so many of the challenges we face. The ability to engineer materials is going to be at the core of that battle, and look at the three scientific communities that are interested in the intersection of computation [and] that task.

“Historically, we’ve been very experimentally driven in this approach to the discovery of materials. You have the classical guys, the HPC community, that has been on that journey for a long time, who say, “We know the equations of physics. We will be able to simulate things with larger and larger systems. And we’re quite good at it.” There have been amazing accomplishments in that community. But now you have the AI community that says, “Hey, excuse me, I’m going to approach it with a totally different methodology, a data-driven approach to that problem; I’m going to be able to revolutionize and make an impact on discovery.” Then you have the quantum community, who says [this is the very reason] why we’re creating quantum computers. All three are right. And imagine what will happen when all three are combined. That is what is ahead for us for the next decade.”

Link to Gil presentation video: https://www.youtube.com/watch?v=2RBbw6uG94w&feature=youtu.be

[i] TSM: Temporal Shift Module for Efficient Video Understanding

Abstract

“The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost. Conventional 2D CNNs are computationally cheap but cannot capture temporal relationships; 3D CNN based methods can achieve good performance but are computationally intensive, making it expensive to deploy. In this paper, we propose a generic and effective Temporal Shift Module (TSM) that enjoys both high efficiency and high performance. Specifically, it can achieve the performance of 3D CNN but maintain 2D CNN’s complexity. TSM shifts part of the channels along the temporal dimension; thus facilitate information exchanged among neighboring frames. It can be inserted into 2D CNNs to achieve temporal modeling at zero computation and zero parameters. We also extended TSM to online setting, which enables real-time low-latency online video recognition and video object detection. TSM is accurate and efficient: it ranks the first place on the Something-Something leaderboard upon publication; on Jetson Nano and Galaxy Note8, it achieves a low latency of 13ms and 35ms for online video recognition. The code is available at: https://github.com/mit-han-lab/temporal-shift-module.”

 
