Goodbye Turing Test, Hello Duck Test

In 1950, Alan Turing wrote a paper for the journal Mind called "Computing Machinery and Intelligence." The first topic he discussed was "The Imitation Game," which became known as the Turing Test. The idea was that an interrogator posed written questions to two parties, at first a man and a woman, then later a man and a machine. If, on the basis of the answers, the interrogator could not distinguish the man from the machine, the conclusion would be that machines can "think." In fact, to escape the ambiguity in the definitions of both "machine" and "think," Turing chose to replace them with an observable outcome: can the responses produced by a machine be distinguished from those produced by a human?

Turing's paper is a tour de force, worth reading even today. He works through every objection he can think of, either to drawing conclusions from such an exercise or to the basic idea that a machine could produce responses indistinguishable from those of a human. One of the most perceptive is referred to as "Lady Lovelace's Objection," apparently made originally against Charles Babbage's claim that his "analytical engine" could duplicate human thinking. According to Turing's source, Lady Lovelace pointed out that Babbage's machine could not produce any original behavior of its own devising, since it could only "do whatever we know how to order it to perform." This is the nineteenth-century version of the claim that computers and artificial intelligences "can only do what they're programmed to do." Turing counters it with two observations: he is often surprised by what computing machines do beyond what he intended them to do, and a machine that learned would be able to originate new behaviors, which is what defeats the objection for today's neural network AIs.

Turing parries a host of other objections to the claim that, in principle, a machine could do what humans do, and two of his most interesting responses come under the headings "The Argument from Consciousness" and "Arguments from Various Disabilities." In the case of the former, he quotes a Professor Jefferson, who said, "Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain—that is, not only write it but know that it had written it." Turing's response is that, because we cannot know for sure what is going on inside anyone else's head, Jefferson's objection appears to require that one needs "to be the machine and to feel oneself thinking." Since becoming the machine is impossible, he opts for his Imitation Game as the best realistic option for deciding the question.

In the case of the "Arguments from Various Disabilities," he says, "These arguments take the form, 'I grant you that you can make machines do all the things you have mentioned but you will never be able to make one to do X.'" He goes on to discuss a number of things claimed to be the province of humans but not machines, from "falling in love" and "enjoying strawberries and cream" to being the "subject of its own thought." While he considers each, he concludes, "The criticisms that we are considering here are often disguised forms of the argument from consciousness." I think he is correct.

Turing's analysis shows us the direction to take in escaping the confines of his Imitation Game, i.e., the Turing Test. At a time when professors are trying to figure out whether their students' papers were written by the students or by ChatGPT, when art exhibition judges can't tell AI-produced art from human-produced art, and when literary journal editors are hunting for methods to distinguish AI-generated stories from human-generated ones, I think it's safe to say that the Turing Test is outmoded, or at least on its last legs, as a useful method of deciding whether machines have reached a human level of thinking. In some sense, the ball is now in the court of the humans, rather than the machines or machine developers. We need to come up with the "X" that humans can do and machines can't.

The "Duck Test"—if it looks like a duck, swims like a duck, and quacks like a duck, it probably is a duck—is often cited as an example of abductive reasoning, i.e., the selection of the simplest and most likely explanation for a set of observations. Most often it is treated as a form of justification for a conclusion, although the originator of the idea of abductive reasoning, Charles Sanders Peirce, saw it as a method for generating hypotheses (Peirce Edition Project, 1998). We depart from the intention of the traditional duck test in that we are not seeking to convince anyone that AIs are human, only that they operate like a human. Partly, that is because our definition of a human includes much more than what goes on in the human mind, while we are concerned only with the mind of an AI. We do not demand that it look like a human, swim like a human, run like a human, have sex like a human, or bleed like a human. Like Turing (or John Searle with his Chinese Room argument), if we made our decision by comparing the responses of AIs and humans, we would need to put them out of our sight, or we would immediately see the difference. But we haven't yet determined that interaction with an interrogator, or observation of behavior, is the way to determine whether X is present. In fact, such comparisons of responses from AIs and humans underlie much of current research, which takes known human capabilities on tasks such as solving mathematical problems, reasoning abstractly, creating images, playing games such as chess or Go, answering questions (or even generating them, as in playing Jeopardy), writing essays, poems, or fiction, and conversing, and shows that with each new edition of the latest AI, there are fewer and fewer of these activities in which AIs and humans produce different responses.
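To make the abductive pattern concrete, here is a minimal, purely illustrative Python sketch: candidate explanations are scored by a simplicity-weighted prior times how likely they make the observations, and the highest-scoring one is chosen. The hypotheses and numbers are invented for illustration; they are not drawn from Peirce or from any real system.

```python
# Toy abduction in the Duck Test spirit: pick the explanation that is both
# simple (higher prior) and makes the observations likely. All numbers are
# illustrative assumptions, not empirical values.
observations = ["looks like a duck", "swims like a duck", "quacks like a duck"]

# hypothesis -> (prior reflecting simplicity, likelihood of each observation)
hypotheses = {
    "it is a duck": (0.6, {"looks like a duck": 0.9,
                           "swims like a duck": 0.9,
                           "quacks like a duck": 0.9}),
    "it is a robotic decoy": (0.3, {"looks like a duck": 0.8,
                                    "swims like a duck": 0.5,
                                    "quacks like a duck": 0.4}),
    "it is a coot": (0.1, {"looks like a duck": 0.4,
                           "swims like a duck": 0.8,
                           "quacks like a duck": 0.1}),
}

def score(prior, likelihoods):
    """Unnormalized plausibility: prior times the probability of the observations."""
    s = prior
    for obs in observations:
        s *= likelihoods[obs]
    return s

best = max(hypotheses, key=lambda h: score(*hypotheses[h]))
print("Most plausible explanation:", best)   # -> "it is a duck"
```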

The challenge is to come up with something that humans do, involving their minds, that AIs can't do. Not just something they don't do, but something no AI yet developed is able to do. A second proviso may be that the AI needs to do it in the same way a human does. This can be tricky. One reason to think it matters is that Vladimir Vapnik, discussing the "duck test" not as a test for an AI but in the context of teaching a machine to correctly identify a duck, pointed out that a machine that uses deep learning to correctly classify ducks after "zillions" of training trials is not showing human-like intelligence, since humans can do just as well with minimal training. According to Vapnik, the demonstration that deep learning can produce 99.9% correct classification tells us nothing about how humans do the same thing, which includes coming up with the predicates that distinguish ducks in the first place. But Vapnik may be wrong. Improvements in learning algorithms, better training procedures, or more computing power could bring the number of AI training trials down to something resembling what it takes for a human to learn a concept. It also seems clear to me (but not to an internationally respected expert on machine learning, so you decide whom you'll listen to) that unsupervised learning could allow an AI to generate the predicates that are most useful for distinguishing one thing from another, including ducks from non-ducks. These are empirical questions.
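The question of how many training trials a machine needs is at least measurable. Below is a minimal sketch, using scikit-learn on a synthetic dataset that stands in for "duck vs. non-duck" examples, of how one might chart a classifier's accuracy against the number of training examples. The dataset, model, and sample sizes are assumptions chosen only to illustrate the empirical question; this is not Vapnik's own setup.

```python
# Minimal sketch of the empirical question: how does accuracy grow with the
# number of training trials? The synthetic features are a stand-in for
# whatever predicates might distinguish ducks from non-ducks.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

for n in (20, 200, 2000, 10000):   # increasing numbers of "training trials"
    clf = LogisticRegression(max_iter=1000).fit(X_train[:n], y_train[:n])
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"{n:>6} training examples -> test accuracy {acc:.3f}")
```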

What is really tricky about asking that an AI do a task "the same way" a human does is that, in many cases, we don't know how humans do it. Unlike Vapnik, I wouldn't say that we have "no idea" how humans do it. Neural networks and learning procedures such as reinforcement learning are modeled on at least abstract representations of the human nervous system and human learning mechanisms, but so far they clearly do not operate at the same level humans do. In addition, cognitive and neurocognitive psychologists have developed detailed theories of how humans learn, make decisions, and choose actions. The burden falls on their shoulders to describe how humans do something that AIs, it seems likely, can't do at all.

I'll stop beating around the bush and say directly that I agree with Turing: it will all come down to the "argument from consciousness." What will distinguish a human from an AI is that a human is conscious, while an AI may or may not be. But how do we establish that a human is conscious using evidence that could also serve to assess an AI's consciousness? In a recent paper with the long title "Consciousness in Artificial Intelligence: Insights from the Science of Consciousness" (Butlin et al., 2023), nineteen computer scientists, neuroscientists, cognitive psychologists, and philosophers examined scientific theories of consciousness and assembled a list of "indicators": properties or processes that, if present in an artificial intelligence, might suggest that it could achieve, or already had achieved, consciousness. After listing fourteen such indicators, they determined that some existing AIs satisfy some of them, but none satisfies all or even a majority. Of course, the question is: if an AI did possess all, or many, or even just a few crucial indicators, how would we determine that it was conscious? The authors admit that they have no behavioral criteria for deciding whether a machine is conscious. They call their approach "theory-heavy," in the sense that they look for whether an AI system meets "functional or architectural conditions drawn from scientific theories," rather than whether it meets behavioral criteria for consciousness.
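In spirit, the "theory-heavy" approach amounts to scoring a system against a rubric of theory-derived properties rather than against behavioral tests. The sketch below shows that shape; the indicator names and the yes/no judgments are hypothetical paraphrases for illustration, not the fourteen items or the verdicts from Butlin et al. (2023).

```python
# Sketch of a "theory-heavy" assessment: count how many theory-derived
# indicator properties a system's architecture satisfies. Indicator names and
# judgments below are hypothetical, not taken from Butlin et al. (2023).
from dataclasses import dataclass
from typing import List

@dataclass
class Indicator:
    name: str
    satisfied: bool   # judged from architecture/function, not behavior

def assess(system: str, indicators: List[Indicator]) -> None:
    met = [i.name for i in indicators if i.satisfied]
    print(f"{system}: {len(met)} of {len(indicators)} indicators satisfied")
    for name in met:
        print("  -", name)

hypothetical_chatbot = [
    Indicator("recurrent processing of inputs", False),
    Indicator("a global workspace that broadcasts to specialist modules", False),
    Indicator("higher-order representations of its own states", False),
    Indicator("goal-directed agency in an environment", False),
    Indicator("learning that updates future behavior", True),
]
assess("hypothetical chatbot", hypothetical_chatbot)
```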

Philip Goff (2023) says that consciousness is a "fundamental datum": we know it exists because it is privately observable, not because it is postulated in order to explain observable behavior. He also says it is, by its nature, necessarily private and not observable by anyone but the person who possesses it. This presents a challenge. In the Imitation Game, an interrogator decides whether they are communicating with a person or a computer, but if our criterion is the possession of consciousness, both the person's and the computer's consciousness will be unobservable to the interrogator. This leaves us two options: either we devise a method for assessing the presence of consciousness that does not require direct, public observation, or we disprove Goff's thesis and show that consciousness is publicly observable. There are avenues for pursuing both of these ends.

It is conceivable that certain relationships among electrical discharges in the brain, either patterns of discharge across topographically arrayed neurons (e.g., Baars and Geld, 2019) or the degree of integration of discharges across the whole or part of the brain (e.g., Tononi, 2008), could be so closely correlated with reported conscious activity that we could use the presence of such relationships as evidence of consciousness. If those same relationships could be identified across the electrical firings of an AI, they could be taken as evidence of conscious thought. With regard to direct observation of consciousness, most philosophical and even some neuroscientific treatments, including the paper by Butlin et al., rely on Thomas Nagel's famous definition of consciousness as there being something "it is like to be" a given creature. In Nagel's 1974 paper it was a bat, but that was a metaphor for a human. Of course, we know what it's like to be a human, not just from our own experience but from the personal accounts of other humans. We listen to conscious thought when we listen to each other speak. The mathematician puts their conscious thoughts on a blackboard as they reason out their equations (today they probably use an iPad). So what if an AI wrote a personal account of "what it's like to be an artificial intelligence" and told us its life story, including all of its experiences? Would it be making its consciousness observable? Remember, the Duck Test is abductive reasoning. It only needs to satisfy us that a conclusion is the simplest and most likely explanation for what we observe. I would say that if an AI produced an autobiography that included thoughts about its own thoughts and that sounded reasonable given its known history, the simplest and most likely explanation would be that the AI was conscious, and that would satisfy the test.
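As a loose illustration of the "degree of integration" idea, the sketch below simulates binary "firing" in a small network and estimates the mutual information between the activity of its two halves. This is a crude stand-in, not Tononi's Φ, and the simulated network is an assumption; the point is only that integration-like quantities can, in principle, be computed from an AI's internal activity just as from neural recordings.

```python
# Crude stand-in for a "degree of integration" measure: mutual information
# between the summed activity of two halves of a simulated network. This is
# NOT Tononi's Phi; it only illustrates the kind of quantity one might
# correlate with reports of conscious activity.
import numpy as np

rng = np.random.default_rng(0)
T, n = 5000, 8                                   # time steps, units
drive = rng.normal(size=(T, 1))                  # shared input couples the halves
activity = (drive + 0.8 * rng.normal(size=(T, n))) > 0   # binary "firing"

left = activity[:, : n // 2].sum(axis=1)         # summary of each half's activity
right = activity[:, n // 2:].sum(axis=1)

# Plug-in estimate of mutual information I(left; right), in bits.
joint, _, _ = np.histogram2d(left, right, bins=(n // 2 + 1, n // 2 + 1))
p = joint / joint.sum()
px, py = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
nz = p > 0
mi_bits = np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz]))
print(f"Estimated integration (mutual information): {mi_bits:.3f} bits")
```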

So, the time has come. Let's jettison the Turing Test and welcome the Duck Test. The next step is to see whether there is any evidence that convinces us that an AI is conscious. I don't mean armchair, thought-experiment evidence, but genuine, controlled research findings indicating that the AI not only talks like a human, solves problems like a human, and reasons like a human, but also produces evidence of consciousness that is indistinguishable from the evidence produced by a human.

References

Baars, B. J., and Geld, N. (2019). On Consciousness: Science and Subjectivity—Updated Works on Global Workspace Theory. New York, NY: The Nautilus Press Publishing Group.

Butlin, P., Long, R., Elmoznino, E., Bengio, Y., Birch, J., Constant, A., … & VanRullen, R. (2023). Consciousness in artificial intelligence: Insights from the science of consciousness. arXiv preprint arXiv:2308.08708.

Goff, P. (2023). Why? The Purpose of the Universe. Oxford: Oxford University Press.

Nagel, T. (1974). What is it like to be a bat? The Philosophical Review, 83, 435–450.

Peirce Edition Project (1998). "A Syllabus of Certain Topics of Logic" (1903). In The Essential Peirce: Selected Philosophical Writings, Vol. 2, 1893–1913.

Searle, J. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3, 417–457.

Tononi, G. (2008). Consciousness as integrated information: A provisional manifesto. Biological Bulletin, 215, 216–242.

Turing, A. M. (1950). Computing machinery and intelligence. Mind, LIX(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433

Vapnik, V. Statistical learning. Interview on the MIT Artificial Intelligence (AI) Podcast. https://www.youtube.com/watch?v=STFcvzoxVw4

 
