Worries about out-of-control AI aren’t new. Many prominent figures have urged caution in unleashing AI. One quote that keeps cropping up says (roughly) that AI will take over eventually, but that if humans are lucky, it will keep us kindly as pets. The saying, in one version or another, has most often been attributed to AI pioneer Marvin Minsky, who denied saying it. Most likely it’s apocryphal, but it has endured. What Minsky did say was rather different, dark in its own way, but also hopeful[i] (jump to the end of the article to see it).
In any case, we’re not there yet. Thankfully. Today, tackling the mechanics of AI and taming human-injected bias are more pressing concerns. That said, an international group[ii] of researchers argued in the Journal of Artificial Intelligence Research (JAIR) last week that even in theory, it won’t be possible to control superintelligent AI.
The group, which includes researchers from the Center for Humans and Machines at the Max-Planck Institute (Berlin), contends in its paper:
“Superintelligence is a hypothetical agent that possesses intelligence far surpassing that of the brightest and most gifted human minds. In light of recent advances in machine intelligence, a number of scientists, philosophers and technologists have revived the discussion about the potentially catastrophic risks entailed by such an entity. In this article, we trace the origins and development of the neo-fear of superintelligence, and some of the major proposals for its containment.
“We argue that total containment is, in principle, impossible, due to fundamental limits inherent to computing itself. Assuming that a superintelligence will contain a program that includes all the programs that can be executed by a universal Turing machine on input potentially as complex as the state of the world, strict containment requires simulations of such a program, something theoretically (and practically) impossible.”
Their paper – Superintelligence Cannot be Contained: Lessons from Computability Theory – is an interesting reprise of past and current thinking on how to effectively control superintelligent AI, either by putting it into a ‘box’ from which it can’t escape or by programming sufficient behavioral safeguards into the system – not unlike Isaac Asimov’s much-discussed Three Laws of Robotics. The authors spend a fair amount of time on work by Oxford philosopher Nick Bostrom.
One example: “Bostrom extensively discusses the weaknesses of the various mechanisms. He relies on scenarios in which, short of rendering the AI useless, well-intentioned control mechanisms can easily backfire. As an illustrative example, a superintelligence given the task of “maximizing happiness in the world,” without deviating from its goal, might find it more efficient to destroy all life on earth and create faster-computerized simulations of happy thoughts. Likewise, a superintelligence controlled via an incentive method may not trust humans to deliver the promised reward, or may worry that the human operator could fail to recognize the achievement of the set goals.”
Using computability as the central guide, and walking through a variety of algorithms and scenarios, the researchers conclude it won’t be possible to completely control superintelligent AI. They note further, “In the extreme case, the most secure method could render the AI useless, which defies the whole idea of building the AI in the first place.”
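The impossibility argument rests on classic computability results in the spirit of Turing’s halting problem. As a rough illustration only – not the paper’s formal construction, and with all names invented for this sketch – the diagonalization trick works like this in Python: hand any would-be total “behavior decider” a program built from the decider itself, and that program can simply do the opposite of whatever the decider predicts.

```python
def defeat(decider):
    """Given any candidate `decider(f) -> bool` that claims to predict
    whether calling f() halts, construct a function it must misjudge.
    (Illustrative diagonalization, not the paper's formal proof.)"""
    def trickster():
        if decider(trickster):
            while True:        # decider said "halts" -> loop forever
                pass
        return "halted"        # decider said "loops forever" -> halt at once
    return trickster

# A decider that answers "never halts" for every program:
pessimist = lambda f: False
t = defeat(pessimist)
print(t())  # -> "halted": t halts immediately, contradicting the decider
```

Whatever rule the decider uses, `trickster` inverts its verdict, so no total, always-correct decider can exist – the same structural obstacle the authors run into when a containment mechanism must predict a superintelligence’s behavior on arbitrary input.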
Their paper is an interesting and reasonably quick read.
Link to EurekAlert (AAAS) article: https://www.eurekalert.org/pub_releases/2021-01/mpif-csw011121.php
Link to paper: https://www.jair.org/index.php/jair/article/view/12202
[i] “Will robots inherit the earth? Yes, but they will be our children. We owe our minds to the deaths and lives of all the creatures that were ever engaged in the struggle called Evolution. Our job is to see that all this work shall not end up in meaningless waste.” – from Marvin Minsky’s article, “Will Robots Inherit the Earth?”, Scientific American, October 1994.
[ii] Authors: Manuel Alfonseca, Autonomous University of Madrid; Manuel Cebrian, Max-Planck Institute, Berlin; Antonio Fernandez Anta, IMDEA Networks Institute, Madrid; Lorenzo Coviello, UCSD (now at Google); Andres Abeliuk, University of Chile, Santiago; Iyad Rahwan, Max-Planck Institute, Berlin.