
How Will We Know When Computers Can Think for Themselves?

Headlines recently exploded with news that a computer program called Eugene Goostman had become the first to pass the Turing test, a method devised by computing pioneer Alan Turing to objectively prove a computer can think.

The program fooled 33% of 30 judges into thinking it was a 13-year-old Ukrainian boy in a five-minute conversation. How impressive is the result? In a very brief encounter, judges interacted with a program that could be forgiven for not knowing much or speaking very eloquently—in the grand scheme, it’s a fairly low bar.

Chat programs like Eugene Goostman have existed since the 1970s. Though they have advanced over the years, none yet represents the revolutionary step in AI implied by the Turing test. So, if the Eugene Goostman program isn’t exemplary of a radical leap forward, what would constitute such a leap, and how will we know when it happens?

To explore that question, it’s worth looking at what the Turing test actually is and what it’s meant to measure.

In a 1950 paper, “Computing Machinery and Intelligence,” Alan Turing set out to discover how we might answer the question, “Can machines think?” Turing believed the answer would devolve into a semantic debate over the definitions of the words “machine” and “think.” He suggested what he hoped was a more objective test to replace the question.


A Teletype Model 33 ASR teleprinter, with punched tape reader and punch, usable as a computer terminal.

Turing called it the imitation game. The test involved three participants: an interrogator (of either sex), a man, and a woman. The interrogator would try to discover which subject was the man and which the woman by asking questions. The man would try to fool the interrogator, and the woman would try to help the interrogator. To avoid revealing themselves by physical traits, the subjects and interrogator would ideally communicate by teletype from separate rooms.

Now, Turing said, substitute the participant trying to fool the interrogator with a computer. And instead of trying to discover which is a man and which a woman—have the interrogator decide which is human and which a computer.
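
The structure Turing describes can be summarized in a short sketch. This is purely illustrative Python; the participant objects and method names (`ask`, `reply`, `identify_machine`) are invented here, since Turing specified only the protocol:

```python
import random

def run_imitation_game(interrogator, human, machine, num_questions=5):
    """One round: the interrogator questions two hidden parties over a
    text-only channel, then guesses which one is the machine."""
    # Randomly assign the hidden labels, playing the role of the
    # separate teletype rooms in Turing's description.
    if random.random() < 0.5:
        parties = {"A": human, "B": machine}
    else:
        parties = {"A": machine, "B": human}

    transcript = []
    for _ in range(num_questions):
        for label, party in parties.items():
            question = interrogator.ask(label, transcript)
            transcript.append((label, question, party.reply(question)))

    # True if the interrogator made the right identification.
    guess = interrogator.identify_machine(transcript)  # "A" or "B"
    return parties[guess] is machine
```

Turing's substitution changes only what `identify_machine` is asked to find: machine versus human instead of man versus woman.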

Turing suggested this test would replace the subjective question, “Can a machine think?” and, later in the paper, suggested how well a computer might play the imitation game at the turn of the 21st century.

“I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10⁹, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning. The original question, ‘Can machines think?’ I believe to be too meaningless to deserve discussion.”

Fooling 30% of the judges after five minutes, then, was Turing’s forecast of computing’s progress by 2000, not a concrete criterion for passing his test.
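
Put as arithmetic (assuming the round figures reported for the Eugene Goostman test), the result just clears the rate Turing forecast:

```python
# Reported result: 33% of 30 judges fooled in five-minute chats,
# i.e. about 10 judges (assuming the round figures reported).
judges = 30
fooled = round(0.33 * judges)    # 10
fooled_rate = fooled / judges    # 0.333...

# Turing's forecast: no more than a 70% chance of a correct
# identification, i.e. judges fooled at least 30% of the time.
turing_bar = 1 - 0.70
print(fooled, fooled_rate > turing_bar)  # 10 True
```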

HAL 9000, the infamously intelligent computer from 2001: A Space Odyssey.

Further, as Ray Kurzweil and Mitchell Kapor note in their Long Now Turing test wager, Turing’s imitation test was “specifically non-specific.”

His paper suggests a general framework for an effective test to objectively measure machine intelligence but leaves the details to evolve as appropriate over later decades.

Sadly, Turing died of cyanide poisoning in 1954 at age 41 (an apparent suicide), two years after being convicted of gross indecency (homosexual acts were then illegal in the UK) and forced to choose between prison and chemical treatment for his “condition.”

Turing’s contributions were monumental beyond his musings on machine intelligence. However, his imitation test has endured, evolved, and over the years, become widely associated with the objective measurement of advanced AI.

There are a number of variations on the Turing test—variables include the total number of judges in the test, the length of interviews, and the desired bar for a pass (or percent of judges fooled). The tests involve a judge who conducts text interviews (usually by instant message or something similar) with a number of human subjects and a computer.

The goal is still to unmask the computer, and the broad aim of the tests is to show machines have attained mental capability indistinguishable from human beings.
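
Those variations reduce to a handful of parameters. As a hypothetical sketch, with every name invented for illustration rather than taken from any real harness, a variant could be configured and scored like this:

```python
from dataclasses import dataclass

@dataclass
class TuringTestConfig:
    num_judges: int = 30        # total judges taking part
    interview_minutes: int = 5  # length of each text interview
    pass_bar: float = 0.30      # fraction of judges that must be fooled

def evaluate(verdicts, config):
    """verdicts: one bool per judge, True if that judge mistook the
    machine for a human. Returns (fooled_fraction, passed)."""
    fooled = sum(verdicts) / len(verdicts)
    return fooled, fooled >= config.pass_bar

# Example: 10 of 30 judges fooled clears a 30% bar but not a 50% one.
print(evaluate([True] * 10 + [False] * 20, TuringTestConfig()))
```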

Now, in some areas, computers have already met and surpassed human ability: Deep Blue in chess or Watson at Jeopardy. Computation on silicon is orders of magnitude faster than computation in the brain. Computers excel at brute force number crunching, simulation, and remembering and accessing huge amounts of data.


IBM’s Watson software beat two human champions at Jeopardy in 2011.

However, computers don’t have the brain’s aptitude for pattern recognition, adaptation, and the traits associated with them like language, learning, and creativity. These are some of the abilities the Turing test sets out to measure.

But to appear human, a program must also slow its responses, fabricate factual and typographical errors, and inject emotional cues (positive and negative) and non-sequiturs. And this is curious. To prove intelligence, why do we require that a machine mimic humans in all our strengths and failings, intelligence and ineptitude alike?
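
Those tactics are simple enough to sketch. A hypothetical “humanizer” wrapper (the function name and parameters are invented for this illustration):

```python
import random
import time

def humanize(text, typo_rate=0.03, chars_per_second=7):
    """Delay a reply to human typing speed and inject occasional
    typos, two of the tricks described above."""
    # Answering instantly would give the machine away.
    time.sleep(len(text) / chars_per_second)

    # Occasionally swap adjacent letters, a common typo pattern.
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() \
                and random.random() < typo_rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

print(humanize("Sorry, I did not catch that. What do you mean?"))
```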

In his paper, Turing mounts a spirited defense against would-be opponents of machine intelligence. But I think the answer to why the Turing test requires that a machine become indistinguishable from a human lies in his response to the “argument from consciousness,” in which he quotes British neurologist and neurosurgeon Geoffrey Jefferson:

“Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain—that is, not only write it but know that it had written it.”

In Jefferson’s view, a machine, through clever artificial means, may contrive to create and report its creation—but it can’t know it’s created because it’s no more than a collection of mechanical parts and instructions written by its programmers.

Turing takes Jefferson’s point and applies it to humans too: “According to the most extreme form of this view the only way by which one could be sure that a machine thinks is to be the machine and to feel oneself thinking…likewise according to this view the only way to know that a man thinks is to be that particular man.”

And this is, I think, at the heart of what the Turing test can show.

We can’t prove “a machine thinks” any more than we can prove the person next to us thinks. But when one is indistinguishable from the other, then we are allowed to question whether a machine can think only as much as we are allowed to question whether a human can think—and beyond that point, the question can be resolved no further.

Each bar, say fooling 30% or 50% of the judges, should be viewed less as a definitive proof of anything and more as an indicator of progress.
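
A small calculation shows why. With only 30 judges, chance variation is large; treating each judge as an independent coin flip (a simplification, but enough to make the point), an exact binomial check gives:

```python
from math import comb

def prob_at_least(k, n, p):
    """Exact probability of fooling at least k of n judges when each
    judge is independently fooled with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Even a program whose true fooling rate is 20%, well under the 30%
# bar, clears "10 of 30 judges" surprisingly often by luck alone.
print(round(prob_at_least(10, 30, 0.20), 3))  # ~0.061
```

Roughly one such test in sixteen would hand a sub-bar program a “pass,” which is why repeated, larger trials matter more than any single headline result.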

The ultimate Turing test, in my view, won’t take place in a controlled environment but out in the real world: a future scenario in which thousands or millions of ordinary people freely interact with a sufficiently advanced program, like Samantha from the movie Her, and spontaneously begin to treat it as a human companion in nearly every sense.

Image Credit: Christopher Brown/Flickr (slate sculpture of Alan Turing by Stephen Kettle); Arnold Reinhold/Wikimedia Commons; Alberto Racatumba/Flickr; IBM; 2Top/Flickr


9 comments

  • Max Friedenberg says:

    I think first it’s important to ask how to prove or disprove if a human is thinking and go from there.

  • ega says:

    “We can’t prove “a machine thinks” any more than we can prove the person next to us thinks”

    If we were not able to think, how could we come up with a word describing the concept?

    If we put machines together, with no prior knowledge, and they invent words for “thinking” and “consciousness” I would assume they have, by introspection, observed it.

  • cabhanlistis says:

    QUOTE
    However, computers don’t have the brain’s aptitude for pattern recognition, adaptation, and the traits associated with them like language, learning, and creativity. These are some of the abilities the Turing test sets out to measure.
    END QUOTE
    -Uh, what? I don’t see how it measures anything. All the Turing test does is determine if a machine fools a human judge into thinking it is human, and only by comparison with another agent. Personally, I’ve never had to go through any of that to prove anything about my mind.

    But to answer the original question, how will we know if a computer is thinking for itself, I would suggest following the results of its work. Determining whether it actually thinks (at least in and of the nature of human intelligence) is either impossible or pointless. What matters is what it accomplishes for itself and others. This chatterbot, Eugene Goostman, can brag about fooling a handful of people into thinking that it’s human. I’ll throw some confetti some day if and when I decide to care about that “feat.”

    But when a computer manages on its own and per its own direction to cure some awful disease or solve a perplexing science mystery, then I will be amazed and celebrate.

    • CAgamefowl says:

      I think the Turing test can (and does) measure all those things, especially language and creativity. The machine would have to speak a language (to communicate with judges) and it needs to be creative enough to fool the judges. I heard of machines using humor to fool the judges. This may be a test of the programmer/designer’s creativity, except in the case where the machine actually fabricates jokes based on its own knowledge/experiences (learning).
      I like that Turing made the test very open and adaptable. Right now the Turing test is only testing rudimentary machines (most of us can assume we are in the infant/embryo phase of AI), but as this technology improves it will increasingly test and measure consciousness and intelligence. It may even put into question our own intelligence, and whether or not we are the rudimentary (electrochemical) machines. Once a machine reaches human intelligence we will have to raise the bar, and hopefully we will be smart enough to realize it, once it has surpassed us. And at that point, how will we know if the machine isn’t testing us? ahhh spooky :)

      • cabhanlistis says:

        Hi, CAgamefowl.

        The Turing test only measures whether an AI can pass itself off as human under a strict condition. There is no language measurement. No linguist or educator sat down with these chatbots and conducted any Hillegas or Harvard-Newton batteries, no grading of English composition, no original narrative contents, and so on. The only thing the test did was either succeed or fail in the goal of convincing judges that it’s human. That’s all. Ditto for creativity. One could pry open the source code and analyze the communication content, but that’s not part of the test and I know of no methodology for producing measurements of that content. Otherwise, I would assure you that this chatbot could never get past day one in a grade school English class.

        “using humor to fool the judges”
        -Including routines to return a pre-written joke is no different than any other part of its instructions. One would also have to throw some original jokes right back at it to even begin to gauge its ability to handle humor.

        “can assume we are in the infant/embryo phase of AI”
        -Since no one can demonstrate the maximum range of AI possibilities, this is not an assumption we can support. In comparison to human intelligence, the most advanced AI is about as good as a 4-year-old child in some areas, an 8-year-old in others, while completely brain-dead in still others, such as open-ended questions. This is the point where researchers are stuck.

        “will increasingly test and measure consciousness”
        -So far as neurologists seem to understand, the only measurement for consciousness is either on or off. There are theoretical stages, but that’s entirely biological. I don’t understand the point of emulating that in an AI outside of a Turing test. Heck, even in a Turing test, there isn’t a point to that.

        The rest of your reply is assumptive and speculative.

        • CAgamefowl says:

          Thanks for the reply, but I think you missed the point of my message. I like the Turing test because it is like a real-life application. Like you said in your original post, the machine will only be significant if it accomplishes something for itself or others, and in this case it’s accomplishing a Turing test. You are right, it doesn’t specifically measure/grade language or creativity with quantitative values, but the machine does require those things to do well on the test. And as of now it probably can’t brag or realize that it fooled a judge, but once it reaches 70% success it will probably be at that level. (Although, I suspect it will only ever reach 50% because once it becomes “conscious”, the judges will be guessing with a 50/50 chance).
          Like I said earlier the Turing test is open ended and could soon be adapted to test robot/machines that look alive too. If this machine looks and acts like a living being, who’s to say it’s not alive? If something is able to fake something nearly perfectly, at what point is it not fake? I fake being an Engineer all day long and I get paid for it :)
          Regarding consciousness, it’s a broad term and vague. I don’t think it’s either on or off. I have seen varying degrees of consciousness. I suspected that when my daughter was 4 she was conscious, but not at 3 years old; same with dogs and other non-human intelligences. I am not conscious when I am asleep. I don’t even know if I am conscious during most of the day, or if I am just following pre-programmed electrochemical signals. The only time I am sure I am conscious is when I think about consciousness.
          Yes some of my message was assumptive and speculative but that was more for humor and to provoke thought. You did some of that yourself, but I won’t call you out on it.

          • cabhanlistis says:

            “You did some of that yourself, but I won’t call you out on it.”
            -Where? I went back over my comments and I don’t see it.

            “it doesn’t specifically measure/grade language or creativity with quantitative values, but the machine does require those things to do well on the test”
            -No, it only needs to pass itself off as having them, even without actually having them. Fooling some guy into buying fool’s gold doesn’t mean that rock has any gold in it. Likewise, programming enough answers (which is a lot) that look like they show creativity and a decent command of language doesn’t mean the machine is creative and has language skills. Run through it long enough and anyone can see that it’s just following a pattern. That’s why the Turing test has been limited to a few minutes for each interview.

            “If this machine looks and acts like a living being, who’s to say it’s not alive?”
            -Biologists, since they’re the experts on living organisms. But if you’re diving into a more philosophical approach, then I doubt anyone will ever produce a reliable answer. Unless a strong AI manages to solve that one for us. In which case your question would be answered.

            “If something is able to fake something nearly perfectly, at what point is it not fake?”
            -From the onset. As convincing as fool’s gold might be, say a nearly perfectly indistinguishable sample, that does not mean that at any point it has any gold content at all.

            “Regarding consciousness, it’s a broad term and vague.”
            -But you stated it “will increasingly test and measure consciousness”. How can you state that for an AI? Are you suggesting that we could use the same tests researchers and doctors use for humans?

  • Phil G says:

    I believe that’s the correct interpretation–the point isn’t that the specific test demonstrates intelligence; the point is to say that we should test behavior, not make philosophical arguments.
