Comparing Statistical and Content-Based Techniques for Answer Validation on the Web

Prevete, Roberto
2002

Abstract

Answer Validation is an emerging topic in Question Answering, where open-domain systems are often required to rank large numbers of candidate answers. We present a novel approach to answer validation based on the intuition that the amount of implicit knowledge connecting an answer to a question can be estimated by exploiting the redundancy of Web information. Two techniques are considered in this paper: a statistical approach, which uses the Web to obtain a large number of pages, and a content-based approach, which analyses text snippets retrieved by the search engine. Neither approach requires downloading the documents. Experiments carried out on the TREC-2001 judged-answer collection show that a combination of the two approaches achieves a high level of performance (a success rate of about 88%). The simplicity and efficiency of these Web-based techniques make them suitable for use as a module in Question Answering systems.
Comparing Statistical and Content-Based Techniques for Answer Validation on the Web / Magnini, B.; Negri, M.; Prevete, Roberto; Tanev, H. - Print. - (2002), pp. 0-0. (Paper presented at the conference Apprendimento Automatico e Data mining, AI*IA 2002, held in Siena, Italy, on 11 September 2002).

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/493034