‘Jeopardy!’ IBM Challenge Part 1 Recap: Robot Overlords – 1 / Pathetic Humans – 0

This week, ‘Jeopardy!’ is airing the “IBM Challenge,” which pits a new supercomputer named Watson against the show’s two greatest (human) champions. The first round concluded last night. The result…? The meatbags got spanked. Badly.

The IBM Challenge is a two-game tournament stretched out over three days, mostly so that the show can stuff in a bunch of filler promos in which IBM folks gush over their accomplishment. The entire episode on Monday only made it through the first “Jeopardy” round of play. Tuesday’s episode wrapped up that game’s “Double Jeopardy” and “Final Jeopardy” rounds. The pace will have to pick up tonight if they expect to get a whole game in.

Watson is playing against Ken Jennings (who holds the show’s record for most games won) and Brad Rutter (who has won the most money overall). The IBM spots explain that Watson is not connected to the internet, and must use a mechanical servo to press a buzzer just like the human contestants. Its (his?) brains are located in a server room off stage. On camera, it’s represented by a video screen at the center podium displaying an animated avatar that shows when it’s “thinking.” In a neat touch, for every question that Watson answers, we also get to see the top three answers it considered, along with its degree of confidence in each.
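
That confidence readout hints at how the system decides whether to ring in at all. Purely as an illustration – this is an assumption about the general idea, not IBM’s published logic – the gist can be sketched in a few lines of Python:

    # Illustrative sketch only, NOT Watson's actual code. Assumption: the
    # system ranks candidate answers by confidence and buzzes in only when
    # the top candidate clears some threshold.

    BUZZ_THRESHOLD = 0.50  # hypothetical cutoff

    def decide_to_buzz(candidates):
        """candidates: list of (answer, confidence) pairs from a QA pipeline."""
        ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
        top_three = ranked[:3]  # what the broadcast shows on screen
        best_answer, confidence = ranked[0]
        return confidence >= BUZZ_THRESHOLD, best_answer, top_three

    buzz, answer, display = decide_to_buzz(
        [("Chicago", 0.86), ("Toronto", 0.14), ("Omaha", 0.05)]
    )
    print(buzz, answer)  # True Chicago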

Watson was not flawless. It screwed up several questions, mainly ones involving puns and word puzzles. However, it did surprisingly well with many other potentially tricky questions, and pretty much rocked right through any question of straightforward fact. It was also generally faster on the buzzer than the humans.

Watson’s voice is like a softer, friendlier version of HAL from ‘2001: A Space Odyssey’. Its pronunciation is fairly typical for a speech synthesizer. It repeatedly mangled the name of one category with the jokey title “Etude Brute” (a pun on “Et tu, Brute?”).

The computer’s strategy for choosing which clues to answer clearly shows that it’s been programmed to hunt for Daily Doubles. It invariably started new categories towards the middle or bottom of the board and hop-scotched all over, rather than “running” categories from top to bottom as many regular contestants are apt to do. In fact, Watson hit the “Jeopardy” round’s Daily Double on its very first at-bat, and found both of them in the “Double Jeopardy” round as well. Its decisions about how much money to risk in Daily Doubles and in Final Jeopardy were bizarre and, well… inhuman. The computer doesn’t seem to have much tolerance for risk. It could have bet a lot more money and still remained safely in the lead, but it wagered conservatively most of the time instead.
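
For what it’s worth, the size of a “safe” bet is simple arithmetic. The sketch below is standard Final Jeopardy wagering math, not anything from IBM, and the scores are hypothetical since Watson’s exact pre-Final total isn’t given here:

    def max_safe_wager(leader: int, nearest_rival: int) -> int:
        """Largest Final Jeopardy wager the leader can risk and still win
        even with a wrong answer, assuming the nearest rival bets it all
        and answers correctly (finishing with 2 * nearest_rival)."""
        return max(leader - 2 * nearest_rival - 1, 0)

    # Hypothetical scores: a leader at $36,000 against a rival at $5,200
    # could risk over $25,000 and still be guaranteed the win:
    print(max_safe_wager(36_000, 5_200))  # -> 25599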

By the first game’s end, Watson had slaughtered both human contestants with a winning total of $35,734, compared to $10,400 for Rutter and $4,800 for Jennings – and those numbers only sound as high as they do because both men risked everything they had and doubled their totals in Final Jeopardy. Even so, Watson was uncatchable by that point. Ironically, Watson got the Final Jeopardy question wrong, but had only risked a small amount of money.

The humans have one more chance to foil the inevitable robot apocalypse with tonight’s game. Since this is a tournament, all scores will be cumulative across both games. Let’s hope for the best. The future of our species may depend on it!

12 comments

  1. Shayne

    I think I may be the only person not associated with IBM who is rooting for Watson. I for one welcome our robotic overlords and offer my services plugging puny humans into generators.

    • Shayne

      Also, I found it hilarious how he added all the question marks to his Final Jeopardy answer, just short of an “LOL IDK!!!” Even better was when he bet almost nothing; I swear you could sense him laughing at the others, who thought they’d just caught a break.

    • Shayne

      That would be the case if this were Trivial Pursuit, but the language barrier is the main issue here. For a computer to understand the wording and sort through the extraneous details to determine what is actually being asked is pretty amazing.

        • EM

          I imagine most people reading this would have little trouble processing and correctly responding to this question:
          (3 – 1) × 3 = ?
          They might have a little more trouble if essentially the same question was asked in “story problem” format:
          Mary, Paul, and Sue help Joe pick a dozen or so apples from his apple tree. Joe feels very grateful and gives them each three apples as a reward. Sam the tax collector sees this and reminds Mary, Paul, and Sue that they each owe Sam’s organization one of the apples they received from Joe. They each hand Sam one apple, upon which he thanks them and leaves. Since it’s time for them to leave as well, Mary, Paul, and Sue decide to collect their remaining apples in a basket which Sue brought along. How many apples do they put in Sue’s basket?

          Fortunately, Jeopardy! contestants do not deal with clues quite so wordy as my latter example. In any case, computers traditionally need questions asked in a format which far more resembles my first example than my second example. Humans might prefer the first version too, but generally it is possible for them to work out the second version (perhaps converting it to a format like the first version along the way); traditionally a computer would not be able to make sense of the second version at all.
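
          To put that contrast in concrete terms: the first version maps directly onto something a machine can evaluate, while the second is an opaque string unless the machine genuinely understands language. A quick illustrative sketch (my code, not anything a real question-answering system runs):

              # The formal version: trivially computable.
              apples_received = 3   # apples Joe gives each picker
              apples_taxed = 1      # apples Sam collects from each
              pickers = 3           # Mary, Paul, and Sue
              print((apples_received - apples_taxed) * pickers)  # -> 6

              # The story version: to a traditional program this is just a
              # string. Nothing marks which numbers matter ("a dozen or so"
              # is a red herring) or what operation connects them.
              story = ("Mary, Paul, and Sue help Joe pick a dozen or so "
                       "apples... How many apples do they put in Sue's basket?")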

          • Ahh, I see what you mean! It’s not so much the knowledge, but the ability to recognize normal human speech and derive the question from it.

            Reminds me a bit of The Hitchhiker’s Guide to the Galaxy in that regard. Deep Thought was fully capable of coming up with an answer even though it didn’t know the question.

          • Josh Zyber
            Author

            The language itself isn’t the only obstacle, though it’s a big one. The computer must also learn how to make associations between things that don’t seem to be directly related. The human brain makes logical leaps and connections all the time, but that’s been a stumbling block for artificial intelligence.

    • Adam

      One big problem for the AI is the puns and wordplay inherent in ‘Jeopardy!’ The clues aren’t simple fact questions, or even just fact questions padded with extraneous words; answering them takes a different type of intelligence than traditional AI.

  2. EM

    Our mechanical masters are already holding the reins, some without any sophisticated intelligence to speak of (much like traditional human overlords). Really, all our masters require is a human willingness for subservience to them, which is plentiful.

  3. Josh Zyber
    Author

    The Final Jeopardy question that Watson got wrong was in the category “U.S. Cities”:

    “Its largest airport was named for a World War II hero; its second largest, for a World War II battle.”

    The correct answer (which both humans got right) is Chicago. Watson guessed Toronto (with a low 30% confidence level). That’s not even a U.S. city.

    Here’s an article that explains how this seemingly simple, fact-based question could have tripped up the computer:

    http://asmarterplanet.com/blog/2011/02/watson-on-jeopardy-day-two-the-confusion-over-an-airport-clue.html