Design

A Fast Q&A System

By Manu Konchady, June 08, 2007

Search engines don't give answers in response to queries. Instead, users depend on question-answering (Q&A) systems to scan the text of a ranked list of documents and find the answers.

Question Categorization

A question category represents the type of answer: a person, place, animal, organization, time, currency, dimension, and so on. Fine-grained question classifiers may have 30 or more different categories. The accuracy of the classifier is critical because this is the first step in the processing pipeline (Figure 1). Any errors introduced in this step are propagated to the following steps and are likely to lead to the extraction of a wrong answer.

Classifying questions is somewhat harder than classifying longer texts because a question contains so little text. For this reason, pattern matching tends to classify questions more accurately than standard classification algorithms. One of the biggest clues to a question category is the first noun following the question word. For example, in "What is the wingspan of a condor?", the noun "wingspan" following the question word "what" indicates that the answer should contain a dimension. A fine-grained categorizer uses a larger number of question categories, and some Q&A systems also arrange the categories in a hierarchy. The type of categorization (fine-grained or coarse-grained) is linked to the extraction of entities from the text. The likelihood of an answer is partially based on matching the question category with extracted entities; see Table 1.
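The first-noun clue can be illustrated with a minimal Python sketch. It is not the article's implementation: the function name `categorize_by_first_noun` and the `NOUN_CATEGORIES` lexicon are hypothetical, and the lexicon lookup stands in for a real part-of-speech tagger, which a working system would need in order to find the first noun reliably.

```python
import re

# Common English question words; the scan for a noun starts after one of these.
QUESTION_WORDS = {"what", "which", "who", "whose", "when", "where", "how"}

# Hypothetical noun-to-category lexicon. The entries are illustrative only;
# a real categorizer would use a much larger list plus a part-of-speech tagger.
NOUN_CATEGORIES = {
    "wingspan": "Dimension",
    "height": "Dimension",
    "length": "Dimension",
    "price": "Currency",
    "cost": "Currency",
    "year": "Time",
    "city": "Place",
    "president": "Person",
}


def categorize_by_first_noun(question):
    """Return the category of the first known noun after a question word,
    or None when no question word or no known noun is found."""
    tokens = re.findall(r"[a-z]+", question.lower())
    seen_question_word = False
    for token in tokens:
        if token in QUESTION_WORDS:
            seen_question_word = True
        elif seen_question_word and token in NOUN_CATEGORIES:
            return NOUN_CATEGORIES[token]
    return None


print(categorize_by_first_noun("What is the wingspan of a condor?"))
```

For the article's example question, the sketch skips "is" and "the", reaches "wingspan", and reports a dimension. Because it only recognizes nouns listed in the lexicon, an unlisted noun earlier in the question is silently skipped, which is one reason a tagger is preferable in practice.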

Word/Phrase	Answer Entity
Who/Whose	Person
Who is	Organization, Person
What/Which	Person, Organization, Company, Place
When	Time
How many	Number
How much	Currency
How far/How long	Dimension

Table 1: Question words/phrases and associated entities.
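The mapping in Table 1 could be encoded as an ordered list of patterns, as in the following Python sketch. The names `QUESTION_PATTERNS` and `candidate_entities` are assumptions made for illustration, as is the choice to try longer phrases first so that "Who is" takes precedence over "Who"; the article does not say how overlapping rows are resolved.

```python
import re

# Table 1 as (pattern, candidate entity types). Longer phrases come first so
# that "who is" is matched before the more general "who".
QUESTION_PATTERNS = [
    (r"^who is\b", ["Organization", "Person"]),
    (r"^how many\b", ["Number"]),
    (r"^how much\b", ["Currency"]),
    (r"^how (far|long)\b", ["Dimension"]),
    (r"^(who|whose)\b", ["Person"]),
    (r"^(what|which)\b", ["Person", "Organization", "Company", "Place"]),
    (r"^when\b", ["Time"]),
]


def candidate_entities(question):
    """Return the entity types that Table 1 associates with the leading
    question word or phrase, or an empty list if none of the rows match."""
    text = question.strip().lower()
    for pattern, entities in QUESTION_PATTERNS:
        if re.search(pattern, text):
            return entities
    return []


print(candidate_entities("How far is the Moon from the Earth?"))
```

The patterns are anchored at the start of the question, so a question such as "In what year ..." returns an empty list. A fuller categorizer would have to handle question words that appear later in the sentence.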
