IBM’s ‘Watson’ Takes on Jeopardy! You Can Challenge the Computer to a Trivia Duel

Are you ready to face a computer in Jeopardy?

Back in 1997, IBM made history when its computer Deep Blue defeated Grandmaster Garry Kasparov in a game of chess. Now its new question answering system, Watson, is looking to do the same with Jeopardy! Run on multimillion-dollar supercomputers, Watson solves questions by analyzing their language and finding possible answers in millions of documents stored in its memory. It is not connected to the internet. IBM is looking to pit Watson against former Jeopardy! champions in a broadcast match sometime soon (perhaps this fall). Clive Thompson from The New York Times recently wrote an amazing article on Watson, and the Times has a special page where you can challenge the machine to a match. Good luck, you’ll need it. You can scout your competition by watching Watson in action against real human contestants in a video from IBM below.

Jeopardy! questions prove especially difficult for computers because they contain so much wordplay and twisted phrasing. Watson has advanced skills in natural language processing and is able to parse the relevant portions of a question. IBM wants to create a computer that can understand the way humans talk. That will have big applications in the future, as we try to build virtual assistants (imagine super-smart versions of the Siri iPhone App). As we mentioned before, automation is going to leak into many fields you wouldn’t expect. A computer that can answer Jeopardy! questions could cut through bureaucratic red tape, do research for lawyers, or answer questions for doctors. Watson may make its debut playing humans on Jeopardy! but make no mistake – IBM is building a system with far greater potential.

For now, however, that potential is still limited. Watson works by rapidly searching through its millions of stored documents and finding associations between words and phrases. ‘Shakespeare’ often appears with ‘Hamlet’ and ‘A Midsummer Night’s Dream’ but also with ‘William’ and ‘England’. Likewise, ‘pen’ is linked to ‘writing’ and ‘ink’. All these associations help Watson answer a Rhyme-Time clue like “Shakespeare’s writing instrument” as “What is Will’s quill?”
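
To make the idea of word associations a little more concrete, here is a minimal sketch in Python. It is not IBM's method. It simply counts how often two words share a document in a tiny corpus invented for this example, and the names DOCUMENTS, STOPWORDS, build_associations, and related are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy "documents" invented for illustration; Watson's real store holds millions.
DOCUMENTS = [
    "william shakespeare wrote hamlet in england",
    "shakespeare wrote with a quill pen and ink",
    "a quill is a pen for writing with ink",
    "will is a short form of william",
]

# Small, hand-picked list of filler words to ignore (an assumption of this sketch).
STOPWORDS = {"a", "in", "with", "and", "is", "for", "of"}


def build_associations(documents):
    """Count how often each pair of distinct words appears in the same document."""
    assoc = defaultdict(Counter)
    for doc in documents:
        words = set(doc.split()) - STOPWORDS
        for a, b in combinations(sorted(words), 2):
            assoc[a][b] += 1
            assoc[b][a] += 1
    return assoc


def related(assoc, word, top=3):
    """Return up to `top` words seen most often alongside `word`."""
    return [other for other, _ in assoc[word].most_common(top)]


ASSOC = build_associations(DOCUMENTS)
print(related(ASSOC, "shakespeare"))
print(related(ASSOC, "pen"))
```

Even in this toy corpus, ‘pen’ ends up tied most strongly to ‘quill’ and ‘ink’. Getting from associations like these to a rhyming answer such as “Will’s quill” takes far more machinery than simple counting.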

To determine the right associations, Watson makes evaluations. It finds all the possible connections between relevant words using several different algorithms and then weighs them according to how often they come up in its database. It prefers answers that are found by multiple algorithms, and it double-checks its answers by running them back through its system. This analysis of possibilities, probabilities, and double-checking lets Watson not only know what the answer might be but also evaluate how right it thinks it is. If it’s not confident, it doesn’t buzz in. That sort of decision making is a sign of a great Jeopardy! player and, as you’ll see in the following clip, Watson has some serious trivia chops:
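
The evaluate-then-buzz logic described above can be sketched roughly as follows. This is an illustration under loud assumptions, not DeepQA's actual scoring. Each hypothetical "scorer" is represented as a dictionary of candidate answers with scores between 0 and 1, the combined confidence is a plain average across scorers, and the 0.5 threshold is an arbitrary choice.

```python
def rank_answers(scorer_outputs):
    """Average each candidate's score over all scorers.

    A scorer that never proposed a candidate contributes 0 for it, so answers
    found by several scorers naturally rise to the top.
    """
    totals = {}
    for scores in scorer_outputs:
        for answer, score in scores.items():
            totals[answer] = totals.get(answer, 0.0) + score
    n = len(scorer_outputs)
    ranked = [(answer, total / n) for answer, total in totals.items()]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


def decide(scorer_outputs, threshold=0.5):
    """Return the best answer if its confidence clears the threshold, else None."""
    ranked = rank_answers(scorer_outputs)
    if not ranked:
        return None
    best, confidence = ranked[0]
    return best if confidence >= threshold else None  # None means "don't buzz in"


# Made-up scores from three hypothetical scorers for one clue.
outputs = [
    {"quill": 0.9, "pen": 0.6},
    {"quill": 0.8},
    {"quill": 0.7, "feather": 0.4},
]
print(rank_answers(outputs))
print(decide(outputs))
```

Here ‘quill’ wins because all three scorers proposed it, while ‘pen’ and ‘feather’ are dragged down by the scorers that never suggested them. Raise the threshold high enough and the same function stays silent, which is the "don't buzz in" case.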


When you play against Watson on the New York Times site, you’re actually playing against pre-recorded guesses – so questions don’t change. That means if you lose the first time through you can go back and answer all the questions with the correct answers. While you don’t get a sense of Watson’s speed, you can get an idea of how it evaluates answers by looking at the probabilities (bar graph) it associates with each one. Pretty cool.

Of course, you’ll also notice that Watson makes some big mistakes as well. Tricks in semantics and reasoning still trip up the machine. But that’s okay, it’s a work in progress. The DeepQA project only began in 2007, headed by David Ferrucci, and IBM only announced the Jeopardy! Challenge last year. Thompson does an amazing job describing the history of the project in greater detail in his NYT article. Watson still has several months, perhaps longer, before it must face a set of unknown Jeopardy! all-stars on television.

It will need the time to prepare. IBM categorizes what it takes to win at Jeopardy! into four basic skills: searching through a clue for the relevant portion, finding the correct answer in a vast realm of stored knowledge, evaluating the confidence in your answer, and quickly deciding whether or not to buzz in. Watson is good at all four skills, but humans excel at the latter two. An all-star contestant will buzz in before they are even sure of their answer, trusting in the five-second grace period to figure things out. Winners typically buzz in first for half or more of the questions, and get the answer right 85-95% of the time. Watson isn’t at that level yet.

The human trials you see in the video took place over several days. According to the NYT, Watson carried one day, winning 4 out of 6 games against 7 human opponents. Yet the following day it lost just as many games, once with no points. It still loses to humans on rhyming clues and wordplay, and can get distracted by word associations that appear often but are not relevant to the question. Watson will have to get better if it hopes to beat the likes of Ken Jennings.

Watson is powered by large and expensive servers...but costs and size are shrinking. Eventually everyone will have a system like this in their home.

Whether or not Watson finds success in its eventual Jeopardy! showdown, IBM plans on marketing similar systems to companies in the next few years. In the beginning, the list of those who could afford such a machine will be short, as Watson depends on Blue Gene servers that cost around $1 million each. IBM executives, however, hope that in the next ten to fifteen years price-performance in computing will allow a DeepQA system to become much cheaper, eventually available on machines the size of a laptop. 2025 could be the year that everyone has a Watson in their home.

But we might be experiencing the benefits of question answering systems far sooner. We’ve already discussed how medical AI programs could help doctors in the near future (the Xprize is even aiming to put them on your smartphone). These systems won’t have Watson’s level of language analysis, but they’ll answer questions with quick (and hopefully accurate) results. One day, the techniques created for the DeepQA project will allow us to interact with such systems by talking just as we would with any human. Eventually we may not be able to tell the difference between the two experiences. Eventually we might not even care if there is a difference.

[image credits: New York Times, IBM]

[source: IBM, New York Times]

15 comments

  • Gerben says:

    makes me wonder, what will be the first. the quantum computer (commercially) or a cognitive learning AI, bit like the chicken and the egg thing.

  • The Avenger says:

    This event and the Pikes Peak autonomous car race is what I’m looking forward to the most the coming fall.

  • Dave says:

    It’s pretty silly to think that normal people aren’t going to have access to this until 2025. That’s like thinking that normal people won’t have access to “something that will search the entire internet and bring back the most relevant result in .1 seconds”. All this will be put on the cloud, where de-facto supercomputers of this size already exist.

    They’ll have this for your iPhone by 2015.

    • Lewis says:

      I completely agree, I’m pretty sure this will be cloud-based in a few years. Even if it took 10-20 seconds to get an answer, it would be way more efficient than trying to find an answer by browsing the Internet on a mobile device.

      It would be awesome if an organization started a distributed computing version of this to tap unused processing power. I want to use this now!!

      • Garrison says:

        why would u want that the world is coming to self confident with robots. if god wanted robots to do everything for us he would of made them himself. were making things that are so smart that they’re going to over power us and concer.

    • Aaron Saenz says:

      Yes! I was really conflicted when writing this ending because I wasn’t sure how to take IBM’s 10-15 year outlook. As far as purchasing power and processing vs. size that’s about the right time table for getting a single Blue Gene down to laptop size. (A very expensive laptop probably).
      But what about the cloud? Surely we’ve got teraflops to spare out in the aether…or we will soon enough.
      The counter-argument for that is distributing the processing for such a task doesn’t make a lot of sense when everyone will want to use the same Watson-like application 24 hours a day. We can’t share processing power when we all want to use all of the power all of the time.
      OR maybe we can… I mean, what if we each only want (on average) 10 seconds of Watson-level language analysis per hour. Or some other ratio. Couldn’t we time-share the cloud in that case…
      …Obviously I’m kind of going around in circles here. Which is why I just sort of stuck to IBM’s prediction and left it at that. But Dave could be very right. As I mention in the article human-like language processing is already here in some form (Siri and such) and it’s only going to get better in the years ahead.

  • deleo says:

    I just wonder if IBM would ever choose to commercialize this and make the Watson homepage look like Google. It would be a big opportunity for IBM if they could give the masses access to Watson.

  • Josiah Raimer says:

    This article gives the light in which we can observe the reality. this is very nice one and gives in depth information.

  • PB in CA says:

    Watson is playing a completely different game than the human players. Watson should only get microphone and camera input, and have to use a robot limb to hit the button. Situational awareness is the true mark of AI. Watson has almost none to offer. It’d be interesting to see how many crutches have been programmed into Watson just to make it through the game show protocol, i.e. does Watson decide when a new question is starting, or is that signaled to him through a coded symbol? If it’s a coded symbol that the programmers attached meaning to, then chances are good Watson doesn’t understand, but just responds mechanistically to the advent of a new question. I bet Watson doesn’t really understand the meaning of language (spoken or written), but just can search for associations between symbols he doesn’t understand the meaning of.
