Building Watson: An Overview of the DeepQA Project
David Ferrucci
IBM T. J. Watson Research Center
Computer systems that can directly and accurately
answer people's questions over a broad domain of human knowledge have been
envisioned by scientists and writers since the advent of computers themselves.
Open domain question answering holds tremendous promise for facilitating
informed decision making over vast volumes of natural language content.
Applications in business intelligence, healthcare, customer support, enterprise
knowledge management, social computing, science and government could all
benefit from computer systems capable of deeper language understanding.
The DeepQA project is aimed at exploring how advancing and
integrating Natural Language Processing (NLP), Information Retrieval (IR),
Machine Learning (ML), Knowledge Representation and Reasoning (KR&R) and
massively parallel computation can greatly advance the science and application
of automatic Question Answering. An exciting proof point in this challenge was
developing a computer system that could successfully compete against top human players
at the Jeopardy! quiz show (www.jeopardy.com).
Attaining champion-level performance at Jeopardy! requires
a computer to rapidly and accurately answer rich open-domain questions, and to
predict its own performance on any given question. The system must deliver high
degrees of precision and confidence over a very broad range of knowledge and
natural language content with a 3-second response time.
To do this, the DeepQA team advanced a broad array of NLP techniques to
find, generate, evidence and analyze many competing hypotheses over large
volumes of natural language content to build Watson (www.ibmwatson.com).
An important contributor to Watson’s success is its ability to automatically
learn and combine accurate confidences across a wide array of algorithms and
over different dimensions of evidence. Watson produced accurate confidences to
know when to “buzz in” against its competitors and how much to bet. High
precision and accurate confidence computations are critical for real business
settings where helping users focus on the right content sooner and with greater
confidence can make all the difference. The need for speed and high precision
demands a massively parallel computing platform capable of generating,
evaluating and combining thousands of hypotheses and their associated evidence. In
this talk, I will introduce the audience to the Jeopardy! Challenge, explain
how Watson was built on DeepQA to ultimately defeat
the two most celebrated human Jeopardy! champions of all time, and discuss
applications of the Watson technology beyond Jeopardy!, in areas such as healthcare.
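
The idea of combining evidence from many algorithms into a single confidence, and using that confidence to decide whether to "buzz in", can be illustrated with a minimal sketch. This is not Watson's actual implementation: the logistic combination, the function names, the weights, and the threshold are all assumptions chosen for illustration. In a real system the weights would be learned from training data rather than set by hand.

```python
import math

def combine_confidence(scores, weights, bias=0.0):
    """Combine per-scorer evidence scores into one confidence in (0, 1).

    Hypothetical logistic combination; weights are assumed to be learned elsewhere.
    """
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))

def should_buzz(candidates, weights, threshold=0.5, bias=0.0):
    """Rank candidate answers by combined confidence.

    candidates maps each candidate answer to its list of evidence scores.
    Returns (best_answer, confidence, buzz), where buzz is True only if the
    top confidence reaches the (assumed) threshold.
    """
    ranked = sorted(
        ((combine_confidence(s, weights, bias), a) for a, s in candidates.items()),
        reverse=True,
    )
    confidence, answer = ranked[0]
    return answer, confidence, confidence >= threshold

# Illustrative usage with made-up evidence scores for two hypotheses.
candidates = {"Chicago": [1.0, 1.0], "Toronto": [0.0, 0.0]}
answer, confidence, buzz = should_buzz(candidates, weights=[1.0, 1.0], threshold=0.7)
```

Raising the threshold trades fewer attempted answers for higher precision on the answers that are attempted, which is the trade-off the abstract describes for both game play and business settings.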