Watson

Last week, IBM’s supercomputer Watson played the first of two games of Jeopardy against human opponents. In the end, the computer asserted its dominance with a commanding victory.

Over the past three days, IBM’s custom supercomputer Watson faced off against humans in two televised games of Jeopardy. The challenge of designing and programming a computer to win at Jeopardy is the successor to IBM’s Deep Blue, which faced off against chess grandmaster Garry Kasparov. I will admit that, watching the show, I couldn’t help but root for Ken Jennings and Brad Rutter. While I have been excited about Watson for a long time, watching the match, I realized that I have a soft spot for humans.

So, what does Watson’s victory last night mean in the greater scheme? Is this the beginning of a revolution in computing? Is this the beginning of a more literal revolution where computers rise up and take over our planet? I believe that it’s neither. While many aspects of Watson are impressive, I found myself a bit underwhelmed by what he showed. Perhaps he (it?) was too convincing, too good at what he did for me to remain impressed. As I was watching, I found myself thinking it obvious that a computer would be able to answer many of the questions asked. At the same time, I knew that my reflex-like skepticism was ignoring the difficulty in parsing human language, which is the real achievement of Watson.

The challenge that Watson represents is not about storage or search. These are the things that computers are inarguably good at, and we have all become jaded to a machine’s dominance there. Retrieving a well-defined fact is simple if it is stored in a well-organized database and the question is merely a query of that database. This is a trivial task in computer science.

The real challenge is deciphering what a question is looking for and what type of answer is required. In many ways, Watson is an English grammar machine. It has to break down the sentence structure of the input question (or “answer,” in Jeopardy lingo) and decide what object it’s looking for. Does a question require a date in time, a person, the name of a song? Presumably (Watson’s exact algorithms are kept a secret) the computer, once it identifies the type of thing being looked for, evaluates out of many possibilities those objects of that type that are most associated with the key words appearing in the question. Watson actually uses more than a thousand different algorithms to come up with these possible answers, and its level of confidence is derived from how many algorithms return the same answer.
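To make the idea concrete, here is a toy sketch in Python of the two steps described above: guessing what type of answer a clue wants, and turning agreement among several algorithms into a confidence level. Since Watson’s real algorithms are secret, everything here is my own assumption: the cue-word table, the function names, and the simple majority vote are all hypothetical stand-ins, not how Watson actually works.

```python
from collections import Counter

# Hypothetical cue words mapped to the kind of answer a clue seems to want.
ANSWER_TYPE_CUES = {
    "who": "person",
    "composer": "person",
    "when": "date",
    "year": "date",
    "song": "song",
    "city": "place",
}


def guess_answer_type(clue):
    """Return the answer type of the first cue word found in the clue."""
    words = clue.lower().replace(",", " ").replace("?", " ").split()
    for word in words:
        if word in ANSWER_TYPE_CUES:
            return ANSWER_TYPE_CUES[word]
    return "unknown"


def vote(candidate_answers):
    """Pick the most common answer among the algorithms' outputs.

    Confidence is the fraction of algorithms that returned that answer.
    """
    counts = Counter(candidate_answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(candidate_answers)


# Pretend four independent algorithms each proposed an answer.
proposals = ["Beethoven", "Beethoven", "Mozart", "Beethoven"]
print(guess_answer_type("Who composed the Eroica symphony?"))
print(vote(proposals))
```

In this toy version, three of the four made-up algorithms agree, so the confidence is 0.75. A real system would presumably weight its algorithms rather than count them equally.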

Using this sort of structure, some Jeopardy questions will be inherently easier than others. Based on its performance over the last few days, it was clear that Watson was very good at questions that essentially give one piece of key information that is unambiguously associated with the answer. If asked who composed a particular symphony, Watson would be very likely to come up with the answer. It must realize that it’s looking for a person and decide the person most associated with that symphony, which would easily be the composer.

Watson struggled more where questions gave information from two directions and were more implicit in their suggestions. One question read, “Stylish elegance, or students who all graduated in the same year.” This sort of question doesn’t have a one-to-one nature to it. Rather, it describes two different partial definitions of a word. Watson incorrectly answered “chic,” which was probably highly associated with “Stylish elegance,” but didn’t match that well with the “students who graduated” part of the question. The real answer is “class.”
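Here is a small Python sketch of why a two-sided clue like this is harder. The association numbers are entirely made up for illustration (I have no idea what Watson’s internal scores looked like), and the two scoring functions are my own hypothetical names. The point is only that ranking by the single strongest association favors “chic,” while requiring a decent match on both halves favors “class.”

```python
# Hypothetical association strengths (0 to 1) between each candidate word
# and each half of the clue. These numbers are invented for illustration.
ASSOCIATION = {
    "chic": {
        "stylish elegance": 0.9,
        "students who graduated in the same year": 0.0,
    },
    "class": {
        "stylish elegance": 0.6,
        "students who graduated in the same year": 0.8,
    },
}


def best_by_strongest_part(assoc):
    """Rank candidates by their single best match to any half of the clue."""
    return max(assoc, key=lambda word: max(assoc[word].values()))


def best_by_weakest_part(assoc):
    """Rank candidates by their worst match, so both halves must fit."""
    return max(assoc, key=lambda word: min(assoc[word].values()))


print(best_by_strongest_part(ASSOCIATION))  # favors the one-sided match
print(best_by_weakest_part(ASSOCIATION))    # favors the word fitting both halves
```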

Similarly, Watson failed to answer one of the Final Jeopardy questions correctly. The question, “Its largest airport was named for a World War II hero; its second largest, for a World War II battle” is extremely complicated. There aren’t really any key words that jump out. Watson, using its many algorithms, had to try to associate “city” with “World War II” and “airport.” While the exact relationship between the two airports and how they relate to World War II is obvious to a human, there’s no reason to believe that Watson was really able to understand that relationship in a meaningful way. Watson could only search those concepts and try to find the city with the closest tie to those ideas. It answered, “Toronto,” which isn’t actually a US city as the category required. The correct answer is “Chicago,” which both humans were able to answer.

Some have argued that Watson’s real advantage is its ability to push the answer button faster and more reliably than the humans. I think there’s a lot of merit to this argument. Watson is fed the question as soon as it appears on the screen, when Alex starts to read it. In Jeopardy, contestants can only buzz in after the host has finished reading the question. Watson is electronically told when it can start to answer a question. As far as I understand, the same signal that activates the buzzers also tells Watson that it is now allowed to buzz in. If it has calculated an answer by that point, it will use its lightning-fast hydraulics to buzz in almost immediately, faster than any human can. So, the way the game is set up, Watson essentially has first dibs on any question that it knows by the time Alex finishes reading. And, as any Jeopardy contestant will tell you, the buzzer is the key to success on the show.
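The “first dibs” argument can be sketched as a tiny Python model. This is only an illustration under my own assumptions: the reaction times in milliseconds are invented, and `first_to_buzz` is a hypothetical helper, not a description of the show’s actual buzzer system. Each player either has an answer ready when the buzzers activate or doesn’t, and the fastest ready player wins the buzz.

```python
def first_to_buzz(players):
    """Return the name of the fastest player who has an answer ready.

    players maps a name to (knows_answer, reaction_ms), where reaction_ms
    is the delay after the buzzers are enabled. Returns None if nobody knows.
    """
    ready = {name: ms for name, (knows, ms) in players.items() if knows}
    if not ready:
        return None
    return min(ready, key=ready.get)


# Invented reaction times: the machine reacts far faster than a person.
everyone_knows = {
    "Watson": (True, 10),
    "Ken": (True, 150),
    "Brad": (True, 200),
}
watson_stumped = {
    "Watson": (False, 10),
    "Ken": (True, 150),
    "Brad": (True, 200),
}

print(first_to_buzz(everyone_knows))
print(first_to_buzz(watson_stumped))
```

In this model the humans only get a chance on questions where Watson has no answer ready, which is the heart of the complaint.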

So, how should we feel about Watson? I think we should be nothing but excited about the success that Watson has shown, and especially about its future applications. Watson is really a model for an interface between a real human question and a huge dataset. And, while it’s not perfect at a game like Jeopardy, it’s pretty darn good. A Watson-like system would be much better in an environment where it’s not trying to be tricked, where the human asking the question really wants Watson to come up with the right answer. I think a Watson system would be particularly valuable in the medical community. In a field with an exponentially exploding amount of research and data on medicine and health, a tool which can answer questions phrased in plain English would be extraordinarily helpful. Watson isn’t there yet, and it would have to be tailored for the particular needs of any non-Jeopardy practical application. But Watson isn’t a solution; it’s a spectacle. It’s an example of what computers can do. Its goal isn’t to be the end but rather the beginning. Watson is a small, self-contained demonstration of what future computers will be. It is pointing where research should aim and taking a major step in that direction.