Authors: Stephen Baker
Watson didn't hesitate. “Who is Hyde?”
“Hyde, yes,” Trebek said. “Dr. Jekyll and Mr. Hyde.” The crowd applauded. Jennings and Rutter politely joined in. This was the custom in Jeopardy, though such sportsmanship seemed a bit odd when standing next to a machine.
Watson didn't stop there. Beating Jennings and Rutter to the buzz, it answered clues about the Beatles' Jude, the swimmer Michael Phelps, the monster Grendel in Beowulf, the 1908 London Olympics, the boundaries of black holes (event horizons), Lady Madonna, and Maxwell's Silver Hammer. By the time Rutter jumped in on a clue about the Harry Potter books (“What is Voldemort?”), Watson had $5,200, far ahead of Rutter's $1,000. Jennings trailed with only $200.
It was time for a commercial break. Off camera, Trebek shook his head as he walked across the set toward Jennings and Rutter. “I can't help but wonder if Watson was sandbagging yesterday,” he said. Was the computer, like a poker player holding a royal flush, masking its strength? Rutter didn't know, but he noticed that Watson's strategy had changed. “He wasn't jumping around the way he is today,” he said.
“He's a hustler,” Jennings said.
In fact, before the match, technicians had switched Watson to its “championship” mode. This involved two changes. First, this exhibition match was a double game. The player with the highest cumulative score in the two games would win. This changed the players' strategy. Instead of following the safest path to win each game, if only by a single dollar, players had to pile up winnings. In addition to adjusting Watson's betting algorithms for double games, the IBM team directed the machine to hunt for Daily Doubles. The practice rounds, they said, were to test the machinery and the buzzer. The goal in the match was to win. These tweaks hadn't much affected Watson's scoring in this early round. The computer had simply chanced on comfortable clues, from the Beatles to black holes. That could change.
And it did. As the opening game progressed, Watson faltered. In the Final Frontiers category, it buzzed confidently on a Latin term for end, “a place where trains can also originate.” But the machine picked the wrong Latin word: “What is finis?” Jennings got “terminus” on the rebound, and inched closer.
Then Watson fell into a couple of cognitive traps. The $1,000 clue under Olympic Oddities asked about “the anatomical oddity of U.S. gymnast George Eyser, who won a gold medal on the parallel bars in 1904.” Jennings won the buzz and after a pause ventured: “What is . . . he was missing a hand?” That was incorrect. Watson buzzed on the rebound.
“What is leg?” it said.
“Yes,” Trebek said. But before they moved to the next clue, a judge called a halt to the game. Eyser's “leg” wasn't the anatomical oddity. Instead, it was the fact that he was missing a leg. After five minutes of consultation onstage with the judges, Trebek, and IBM's David Shepler (Watson's advocate), the computer's response was ruled incorrect. “It was my boo-boo,” Trebek told the audience. Then he redubbed his response to Watson: “No, I'm sorry I can't accept that. I needed you to say, 'What is “He's missing a leg”?'”
Watson's mistake, though subtle, reflected its misreading of the lexical answer type (LAT) in the clue. Despite years of training from James Fan and others, in this example it failed to understand precisely what it was seeking. For a national audience initially wowed by the Jeopardy computer, it would serve as a reminder that the machine, for all its prodigious powers, could succumb to confusion. For Jennings and Rutter, the upshot was simpler. It chopped $2,000 from Watson's lead.
This was a misstep for Watson but hardly an embarrassment. That would come later, on a $1,000 clue asking about the decade that gave birth to Oreo cookies and the first modern crossword puzzle. Jennings won the buzz and answered, “What are the twenties?” This was wrong. The deaf Watson won the rebound and promptly repeated the same wrong answer. The machine, for all its brilliance, was in many aspects oblivious. This was no secret in IBM's War Room, but now the whole world could see it.
As this first Jeopardy round came to a close, Rutter climbed and Watson tumbled. They ended in a tie, the co-leaders at $5,000, with Jennings at $2,000. That would end the first day of the three-day television event in February, meaning that viewers would tune in for Day Two fully expecting to see a Double Jeopardy round featuring men and machine in a tense, closely fought tussle.
Watson, it turned out, had other ideas. After an intermission, in which the host and the human contestants changed clothes, Trebek unveiled the categories for Double Jeopardy. This round, which offered more background information on Watson, would occupy the second of the half-hour television shows. The names on the board gave Jennings and Rutter room for hope. A couple of them, Hedgehog Podge and Etude Brute, sounded confusing: potential Watson train wrecks. The others (Don't Worry About It, The Art of the Steal, Cambridge, and Church & State) looked more straightforward. But they wouldn't know for sure until they started to play.
It didn't take long to see that Watson was in a groove. The machine monopolized the buzzer, hunted down the Daily Doubles, and appeared to understand every clue. Jennings, whose lectern was right next to Watson's bionic hand, later said that its staccato rhythm as it pressed the buzzer three times reminded him of “the soundtrack from The Terminator.” Rutter said that playing against Watson filled him with a new type of empathy. “I thought, 'This must be what it feels like to play against Ken or me,'” he said.
Watson's buzzer speed also affected the humans' game. They felt compelled to jump faster than usual for the buzzer. This often led to quarter-second penalties for early buzzing, a trap Watson never fell into. And in their eagerness to win control of the board, they found themselves hurrying to respond to clues, sometimes before reading them, resulting in mistakes. “Against human players, you have a window,” Jennings said. “Against Watson, that window essentially does not exist.”
In the first minutes of the game, Watson ransacked the board for Daily Doubles. This led it through the high-dollar clues on everything from Sergei Rachmaninoff and Franz Liszt to leprosy and albinism. The frustrated humans kept trying to buzz, to no effect. The computer nearly tripled Rutter's score, to $14,600, and then, under Cambridge, landed on the board's first Daily Double. “I'll wager six thousand four hundred thirty-five dollars,” Watson said. This figure, so unusually precise, drew laughter from the crowd. Like everything else on the board, the clue turned out to be friendly to Watson. “The chapels at Pembroke and Emmanuel Colleges were designed by this architect.” Watson could have handled this one in its infancy. The clue featured simple syntax and a crystal-clear LAT (an architect) connected to easily searchable proper nouns. By answering “Who is Sir Christopher Wren?” Watson raised its winnings to $21,035.
Two questions later, a clue appeared in the wrong box. These glitches, which would continue through the afternoon, made life even harder for Jennings and Rutter. They had to stand at the podiums with their backs turned to the Jeopardy board so that they wouldn't see a clue if one happened to pop up. These delays often lasted for five or ten minutes at a time. While the contestants stood there, attendants mopping their foreheads or offering them water, Trebek worked to keep the audience engaged. He told jokes and answered questions about Jeopardy. He mentioned, for example, that Merv Griffin, the game's founder, raked in an astounding $83 million during his lifetime for rights to his Jeopardy jingle. One time, as technicians labored behind him, Trebek intoned: “We realize that if we keep you waiting here three hours on the tarmac, we have to provide you with a meal, and perhaps accommodation.”
The malfunction during Watson's runaway game arrived at a strange moment. Watson had chosen the $1,600 clue under Hedgehog Podge. The clue seemed almost designed for the computer: “Garry Kasparov wrote the foreword for The Complete Hedgehog, about a defense in this game.” Watson, as usual, won the buzz. Its answer panel showed 96 percent confidence in its first response: “What is chess?” It was Watson's digital role model, Deep Blue, that had beaten Kasparov in the famous man-machine match in 1997. Yet as Trebek waited for a response, saying, “Watson?” the computer said nothing. After its time ran out, Jennings scored on the rebound. “Chess is right,” Trebek said. “And I think Deep Blue will never forgive Watson for missing that one.”
It turned out, though, that when the clue had popped up in the wrong box on the board, it disoriented the machine, leading Watson to keep mum. Eventually, Jeopardy had to replace that clue with another one, much to the IBM crowd's regret. It would have been nice, after all, to have a reference to Deep Blue in the match. But in an afternoon full of technical mishaps, the chess clue fell out. “There's a line Watson's familiar with,” Trebek told the audience off camera. He made a sweeping gesture with his arm and said, “_____ happens.”
As this second half of the first game neared its end, Watson continued its rampage, ending with $36,681. Rutter and Jennings had barely inched ahead, to $5,400 and $2,200, respectively. Their best hope was that the machine, known to be weaker in Final Jeopardy, would bet heavily, looking for a knockout punch, and miss. The category was U.S. Cities. The clue: “Its largest airport is named for a World War II hero, its second largest for a World War II battle.”
To many, this sounded like an easy one for Watson. It was a city big enough to have two airports, each of them connected thematically to the Second World War. But Watson, assuming it understood the clue, had to carry out separate searches for many of the airports in the country, looking for connections to long lists of heroes and battles. Numerous names overlapped. New York's biggest airport, for example, was named for John F. Kennedy, who happened to be a hero of World War II. Its second airport was La Guardia. Was there a battle in the Italian campaign by that name? No doubt Watson burrowed through thousands of documents, finding along the way “battles” involving New York City's feisty mayor, Fiorello La Guardia. In the end, the computer was bewildered.
Jennings and Rutter both responded correctly: “What is Chicago?” (The bigger airport took its name from Butch O'Hare, a fighter pilot; the smaller one from the Battle of Midway.) Jennings doubled his meager winnings, to $4,400. Rutter added $5,000 to his, reaching $10,400. When Watson missed the clue, the gap promised to narrow. Its response, which drew laughter from the crowd, was: “What is Toronto??????” (The IBM team had programmed the machine to add those question marks on wild guesses so that the spectators would see that the computer had low confidence. Its awareness of what it didn't know was an important aspect of its intelligence.) Fortunately for Watson, it had wagered a mere $947 on its answer. It had established a big lead and was programmed to hold on to it. Even after the airport flub, it headed into the second and deciding game with a $25,000 advantage over Rutter and a bit more than $30,000 ahead of Jennings.
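The Final Jeopardy arithmetic above can be checked with a short sketch. The scores and Watson's $947 wager come from the text; the humans' wagers are inferred from the phrases "doubled" and "added $5,000," and the names `settle`, `before`, `wagers`, and `after` are illustrative only, not anything from IBM's system.

```python
# Scores entering Final Jeopardy, as reported in the text.
before = {"Watson": 36_681, "Rutter": 5_400, "Jennings": 2_200}

# Watson's wager is stated in the text. The humans' wagers are inferred:
# Jennings "doubled" his score and Rutter "added $5,000".
wagers = {"Watson": 947, "Rutter": 5_000, "Jennings": 2_200}

# Only the two human players responded correctly.
correct = {"Watson": False, "Rutter": True, "Jennings": True}


def settle(score, wager, right):
    """Add the wager on a correct response, subtract it on a miss."""
    return score + wager if right else score - wager


after = {name: settle(before[name], wagers[name], correct[name]) for name in before}

# Watson's lead over each human going into the second game.
leads = {name: after["Watson"] - after[name] for name in ("Rutter", "Jennings")}

print(after)
print(leads)
```

Under these assumptions Watson finishes the first game with $35,734, which puts it $25,334 ahead of Rutter and $31,334 ahead of Jennings, consistent with the rounded figures in the passage.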
In the break between the two games, the crowd emptied into the lobby for refreshments. IBM's Sam Palmisano greeted Charles Lickel, the recently retired manager whose visit to a Fishkill restaurant at the height of Ken Jennings's winning streak led to the idea for the Jeopardy challenge. Palmisano was thrilled with Watson's performance. But was it too much of a good thing? Would Watson come off as a bully or make the show boring? “Maybe we should have dialed it down a little,” he said to Lickel.
Nearby, Ferrucci was huddled with John Kelly, the director of IBM Research. He was explaining to Kelly how the machine could possibly have picked Toronto as a U.S. city with World War II-themed airports. He noted that Watson had very low confidence in Toronto and that its second choice, just a hair behind, was Chicago. Watson, he said, was programmed not to discount answers based on one apparent contradiction. After all, there could be towns named Toronto in the United States. And from Watson's perspective, Toronto, Ontario, had numerous U.S. connections. For instance, its baseball team, the Blue Jays, was in the American League.
As the second and final game began, Trebek, who was born in Canada, had a little fun at Watson's expense. The three things he had learned in the previous match, he said, were that Watson was fast and capable of some weird wagers, and that “Toronto is now a U.S. city!”
The challenge for Jennings and Rutter was clear. To catch up with Watson, one of them had to rack up earnings quickly and then land on two or three Daily Doubles, betting the farm each time. That was the only way to reach sky-high scores in the remaining game. To catch Watson, one of them would probably need to reach $50,000, or even higher.
Watson promptly took off on a Daily Double hunt. It answered clues about Istanbul and the European Parliament, and identified Arabic as the mother tongue of Maltese. But it lost $1,000 by naming Serbia, instead of Slovenia, as the one former Yugoslav republic in the European Union.
It was then that Rutter and Jennings happened on a weak category for Watson: Actors Who Direct. The clues were simply the names of movies, such as A Bronx Tale or Into the Wild. The contestants had to come up with the directors' names: Robert De Niro and Sean Penn, in those examples. Watson was slow to the buzzer in this category because the clues were so short. It took Trebek only a second or so to read them, and Watson required at least two seconds to find the answer. Jennings worked his way up the category. But when he reached the lower-dollar clues, he switched columns. The reasoning was simple. While he was safe from Watson in the category, he might lose the buzz to Rutter, who would then be in a position to win one of the Daily Doubles.