The Present and Future of AI: A Discussion with HPC Visionary Dr. Eng Lim Goh

By Todd R. Weiss

November 27, 2020

As HPE’s chief technology officer for artificial intelligence, Dr. Eng Lim Goh devotes much of his time to talking and consulting with enterprise customers about how AI can benefit their business operations and products.

As the start of 2021 approaches, HPCwire sister publication EnterpriseAI spoke with Goh in a telephone interview to learn about his impressions and expectations for the still-developing technology as it continues to be used by HPE’s customers.

Goh, who is widely known as one of the leading HPC visionaries today, has a deep professional background in AI and HPC. He was CTO for most of his 27 years at Silicon Graphics before joining HPE in 2016 after SGI was acquired by HPE. He has co-invented blockchain-based swarm learning applications, overseen the deployment of AI for Formula 1 auto racing, and co-designed the systems architecture for simulating a biologically detailed mammalian brain. He has been named twice, in 2005 and 2015, to HPCwire’s “People to Watch” list for his work. A Shell Cambridge University Scholar, he completed his PhD research and dissertation on parallel architectures and computer graphics, and holds a first-class honors degree in mechanical engineering from Birmingham University in the U.K.

This interview (which first appeared on sister website EnterpriseAI) is edited for clarity and brevity.

EnterpriseAI: Is the development of AI today where you thought it would be when it comes to enterprise use of the technology? Or do we still have a way to go before it becomes more important in enterprises?

Dr. Eng Lim Goh: You do see research with companies and industries. Some are deploying AI in a very advanced way now, while others are moving from their proof of concept to production. I think it comes down to a number of factors, including which category they are in – are they coping with making decisions manually, or are they coping with writing rules into computer programs to help them automate some of the decision making? If they are coping, then there is less of an incentive to move to using machine learning and deep neural networks, other than being concerned that competition is doing that and they will out-compete them.

There are some industries that are still making decisions manually or writing rules to automate some of that. There are others where the amount of data to be considered to make an even better decision would be insurmountable with manual decision making and manual analytics. If you asked me a few years back where things would be, I would have been conservative on one hand and also very optimistic on the other hand, depending on companies and industries.

EnterpriseAI: Are we at the beginning of AI’s capabilities for business, or are we reaching the realities of what it can and can’t do? Has its maturity arrived?

Goh: For some users it is maturing, if you are focused on how the machine wants to help you in decision support, or in some cases, to help you take over some decision-making. That decision is very specific in an area, and you have to have enough data for it. I think things are getting very advanced now.

EnterpriseAI: What are AI’s biggest technology needs to help it further solve business problems and help grow the use of AI in enterprises? Are there features and improvements that still must arrive to help deliver AI for industries, manufacturing and more?

Goh: At HPE, we spend a lot of our energy working with customers, deploying their machine learning, artificial intelligence and data analytics solutions. That’s what we focus on, the use cases. Other bigger internet companies focus more on the fundamentals of making AI more advanced. We spend more of our energy in the application of it. From the application point of view, some customer use cases are similar, but it’s interesting that a lot of times, the needs are in best practices.

In the best practices, a lot of times for example, proof of concepts succeed, but then they fail in their deployment into production. A lot of times, proof of concepts fail because of reasons other than the concept being a failure. A discipline, like engineering, over years, over decades, develops into a discipline, like computer engineering or programming. And over the years, these develop into disciplines where there are certain sets of best practices that people follow. In the practice of artificial intelligence, this will also develop. That’s part of the reason why we develop sets of best practices. First, to get from proof of concept to successful deployment, which is where we see a lot of our customers right now. We have one Fortune 500 customer, a large industrial customer, where the CTO/CIO invested in 50 proof of concepts for AI. We were called in to help, to provide guidance as to how to pick from these proof of concepts.

A lot of times they like to test to see if, for a particular use case, does it make sense to apply machine learning in decision support? Then they will invest in a small team, give them funding and get them going. So you see companies doing proof of concepts, like a medium-sized company doing one or two proof of concepts. The key, when I’m brought in to do a workshop with them on transitioning from proof of concept to deployment, is to look at the best practices we’ve gathered over the use cases we’ve done over the years.

One lesson is not to say that the proof of concept is successful until you also prove that you can scale it. You have to address the scale question at the beginning. One example is that if you prove that 100 cameras work for facial recognition within certain performance thresholds, it doesn’t mean the same concept will work for 100,000 cameras. You have to think through whether what you are implementing can actually scale. This is just one of the different best practices that we saw over time.

Another best practice is that the AI, when deployed, must plug into the existing workflow in a seamless way, so the user doesn’t even feel it. Also, you have to be very realistic. We have examples where they promise too much at the beginning, saying that we will deploy on day one. No, you set aside enough time for tuning, because since this is a very new capability for many customers, you need to give them time to interact with it. So don’t promise that you’ll deploy on day one. Once you implement in production, allow a few months to interact with a customer so they can find what their key performance indicators should be.

EnterpriseAI: Are we yet at a point where AI has become a commodity, or are we still seeing enterprise AI technology breakthroughs?

Goh: Both are right. The specific AI where you have good data to feed machine learning models or deep neural network models, the accuracy is quite high, to the point that people after using it for a while, trust it. And it’s quite prevalent, but some people think that it is not prevalent enough to commoditize. AI skills are like programming skills a few decades ago – they were highly sought after because very few people knew what it was, knew how to program. But after a few decades of prevalence, you now have enough people to do programming. So perhaps AI has gone that way.

EnterpriseAI: Where do you see the biggest impacts of AI in business? Are there still many things that we haven’t seen using AI that we haven’t even dreamed up yet?

Goh: Anytime that you’re having someone make a decision, AI can be helpful and can be used as a decision support tool. Then there’s of course the question about whether you let the machine make the decision for you. In some cases, yes, in a very specific way and if the impact of a wrong decision is less significant. Treat AI as a tool like you would think automation was a tool. It’s just another way to automate. If you look back decades ago, machine learning was already being used, it was just not called machine learning. It was a technique used by people in doing statistics, analytics, applying statistics. There definitely is that overlap, where statistics overlap with machine learning, and then machine learning stretches out to deep neural networks where we reach a point where this method can work, where we essentially have enough data out there, and enough compute power out there to consume it. And therefore, to be able to get the neural network to tune itself to a point where you can actually have it make good decisions. Essentially, you are brute-forcing it with data. That’s the overlap. I say we’ve been at it for a long time, right, we’re just looking for new ways to automate.

EnterpriseAI: What interesting enterprise AI projects are you working on right now that you can share with us?

Goh: Two things are in the minds of most people now – COVID-19 vaccines, and back-to-work. These are two areas we have focused on over the last few months.

On the vaccine, we are looking at clinical trials and gene expression data, and applying analytics to it. We realized that analytics, machine learning and deep neural networks can be quite useful in making predictions just based on gene expression data. Not just for clinical trials, but also to look ahead to the well-being of persons, by just looking at one sample. It requires highly skilled analytics, machine learning and deep neural network techniques, to try and make predictions ahead of time, when you get a blood sample and genes are expressed and measured from it.

The other area is back-to-work [after COVID-19 shutdowns around the nation and world]. It’s likely that the workplace is changed now. We call it the new intelligent hybrid workplace. By hybrid we mean a portion is continuing to be remote, while a portion of factory, manufacturing plant or office employees will return to their workplaces. But even on their returns – depending on companies, communities, industries and countries – there’ll be different requirements and needs.

EnterpriseAI: And AI can help with these kinds of things that we are still dealing with under COVID-19?

Goh: Yes, in certain jurisdictions, for example, if someone is ill with the coronavirus in a factory or an office, and you are required to do specialized cleaning in the area around that high-risk person. If you do not have a tool to assist you, there are companies that clean their entire factory because they’re not quite sure where that person has been. An office may have cleaned an entire floor hoping that a person didn’t go to other floors. We built an in-building tracing system with our Aruba technology, using Bluetooth Low Energy, talking to WiFi routers and access points. Immediately when you identify a particular quarter-sized Bluetooth tag that employees carry, immediately a floorplan shows up and it shows hotspots and warm spots as to where to send the cleaning services to. You’re very targeted with your cleaning. The names of the users of those tags are highly restricted for privacy.

EnterpriseAI: Let’s dive into the ethics of AI, which is a growing discussion. Do you have concerns about the ethics and policies of using AI in business?

Goh: Like many things in science and engineering, this is as much a social question as it is a technical one. I get asked this a lot by CEOs in companies. Many times, from boards of directors and CEOs, this is the first question, because it affects employees. It affects the community they serve and it affects their business. It’s as much a societal question as it is a technical one, that’s what I always tell them.

And because of this, you don’t hear people giving you hard-and-fast rules on this issue. There needs to be a constant dialogue. It will vary by community, by industry, to have a dialogue and then converge on consensus. I always tell them, focus on understanding the differences between how a machine makes decisions, and how a human makes decisions. Whenever we make a decision, there is a link immediately to the emotional side, and to the generalization capability. We apply judgment.

EnterpriseAI: What do you see as the evolving relationship between HPC and AI?

Goh: Interestingly, the relationship has been there for some time, it’s just that we didn’t call it AI. Let’s take hurricane prediction, for example. In HPC, this is one of the stalwart applications for high performance computing. You put in your physics and physics simulations on a supercomputer. Next, you measure where the hurricane is forming in the ocean. You then make sure you run your simulation ahead of time faster than the hurricane that is coming at you. That’s one of the major applications of HPC, building your model out of physics, and then running the simulation based on the starting conditions that you’ve measured out in the ocean.

Machine learning and AI are now used to look at the simulation early on and predict likelihood of failure. You are using history. People in weather forecasting, or climate weather forecasting, will already tell you that they’re using this technique of historical data to make predictions. And today we are just formalizing this for the other industries.

EnterpriseAI: What do you think of the emerging AI hardware landscape today, with established chip makers and some 80 startups working on AI chips and platforms for training and inference?

Goh: Through history, it’s been the same thing. In the end, there will probably be tens of these chip companies. They came up with different techniques. We’re back to the Thinking Machines, the vector machines, it’s all RISC processors and so on. There’s a proliferation of ideas of how to do this. And eventually, a few of them will stand out here and there will be a clear demarcation, I believe, between training and inference. Because inference needs to be lower and lower energy, to the point, and that should be the vision, that IoTs should have some inference capability. That means you need to sip energy at a very low level. We’re talking about an IoT tag, a Bluetooth Low Energy tag, with a coin battery that should last two years. Today the tag that sends out and receives the information has very little decision-making, let alone inference-level type decision-making. In the future you want that to be an intelligent tag, too. There will be a clear demarcation between inference and training.

EnterpriseAI: In the future, where do you see AI capabilities being brought into traditional CPUs? Will they remain separate or could we see chips combining?

Goh: I think it could go one way, or it could totally go the other way and everything gets integrated. If you look at historical trends, in the old days, when we built the first high-performance computers, we had a chip for our CPU, and we had another chip on board called FPU, the floating point unit, and a board for graphics. And then over time the FPU got integrated into the CPU, and now every CPU has an FPU in it for floating point calculations. Then there were networking chips that were on the outside. Now we are starting to see networking chips being incorporated into the CPU. But GPUs got so much more powerful in a very specific way.

The big question is, will the CPU go into the GPU, or will the GPU go into the CPU? I think it will be dependent on a chip company’s power and vision. But I believe integration, one way or the other – the CPU to GPU or GPU going into CPU – will be the case.

EnterpriseAI: What else should I be asking you about the future of AI as we look toward 2021?

Goh: I want to emphasize that many CEOs are keen on starting with AI. They are in phase one, where it is important to understand that data is the key to train machines with. And as such, data quality needs to be there. Quantity is important, but quality needs to be there, the trust of it, the data bias.

We focus on the fact that 80% of the time should be spent on the data even before you start on the AI project. Once you put in that effort, your analytics engine can make better use of it. If you are in phase one, that’s what I would recommend. If you are in a proof of concept state, then spend time in the workshop to discuss best practices with those who have implemented AI quite a bit. And if you’re in the advanced stage, if you know what you’re doing, especially if you’re successful, do take note that after a while with a good deployment, the accuracy of the prediction drops, so you have to continually retrain your machines. I think it is the practice that I am more focused on.
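Goh's closing point, that prediction accuracy drops after a while in deployment and machines must be continually retrained, can be illustrated with a small sketch. This is not HPE's method or any tool mentioned in the interview; the class name `AccuracyMonitor`, the window size and the tolerance are hypothetical choices made only for illustration. It assumes the true outcome for each prediction eventually becomes known, so that recent accuracy can be compared with the accuracy measured at deployment.

```python
from collections import deque


class AccuracyMonitor:
    """Hypothetical helper: tracks recent accuracy of a deployed model."""

    def __init__(self, baseline_accuracy, window_size=100, tolerance=0.05):
        # Accuracy measured when the model was first deployed (assumed known).
        self.baseline_accuracy = baseline_accuracy
        # Allowed drop below the baseline before retraining is suggested.
        self.tolerance = tolerance
        # Only the most recent `window_size` outcomes are kept.
        self.outcomes = deque(maxlen=window_size)

    def record(self, predicted, actual):
        """Store whether one prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def rolling_accuracy(self):
        """Fraction of correct predictions in the window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once a full window falls below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance


# Illustrative usage with made-up outcomes.
monitor = AccuracyMonitor(baseline_accuracy=0.9, window_size=10)
for _ in range(10):
    monitor.record(predicted="ok", actual="ok")      # model is still accurate
for _ in range(5):
    monitor.record(predicted="ok", actual="fault")   # conditions have shifted
print(monitor.rolling_accuracy(), monitor.needs_retraining())
```

A fixed window with a simple threshold is the plainest possible choice here. A real deployment would pick the window, the tolerance and the retraining trigger from the key performance indicators agreed with the customer.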


This article first appeared on sister website EnterpriseAI.news.
