This paper presents an innovative approach to the automatic evaluation of Question Answering systems. The methodology treats the Web as an “oracle” containing all the information needed to check the relevance of a candidate answer with respect to a given question. The procedure is completely automatic (i.e., no human intervention is required) and rests on the assumption that an answer's relevance can be assessed from a purely quantitative perspective. It performs a Web search using patterns derived from both the question and the answer. Different kinds of patterns have been identified, ranging from “lenient” patterns (i.e., boolean combinations of single words) to “strict” patterns (i.e., whole sentences or combinations of phrases). A statistically based algorithm considers both the kinds of patterns used in the search and the number of documents returned from the Web. Experiments carried out on the TREC-10 corpus show that the approach achieves an 80% success rate.
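
The idea of lenient and strict search patterns can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. The stopword list, the `AND` query syntax, the `WEIGHTS` values, the logarithmic scaling and the caller-supplied `hit_count` function (standing in for a Web search engine that returns a document count) are all assumptions made for illustration.

```python
import math
from typing import Callable, List

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "was", "to",
             "who", "what", "when", "where", "which", "did", "do", "does"}

# Hypothetical weights: a hit on a strict pattern counts more than a lenient one.
WEIGHTS = {"strict": 3.0, "lenient": 1.0}


def keywords(text: str) -> List[str]:
    """Lowercase the text, strip punctuation and drop stopwords."""
    tokens = [t.strip(".,?!;:\"'()").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOPWORDS]


def lenient_pattern(question: str, answer: str) -> str:
    """Boolean combination of single words from question and answer."""
    return " AND ".join(keywords(question) + keywords(answer))


def strict_pattern(question: str, answer: str) -> str:
    """Combination of phrases: question keywords and the answer, each quoted."""
    return f'"{" ".join(keywords(question))}" AND "{answer}"'


def relevance_score(question: str, answer: str,
                    hit_count: Callable[[str], int]) -> float:
    """Combine the pattern kind (via its weight) with the number of documents found."""
    patterns = {
        "strict": strict_pattern(question, answer),
        "lenient": lenient_pattern(question, answer),
    }
    return sum(WEIGHTS[kind] * math.log1p(hit_count(query))
               for kind, query in patterns.items())
```

In use, `hit_count` would wrap whatever search service is available, and candidate answers to the same question could be compared by their scores. Any acceptance threshold would have to be tuned on judged data such as the TREC-10 answers mentioned in the abstract.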

Towards automatic evaluation of Question/Answering systems / Magnini, B.; Negri, M.; Prevete, Roberto; Tanev, H. - Print. - (2002), pp. 128-134. (Paper presented at the Third International Conference on Language Resources and Evaluation (LREC 2002), held in Las Palmas, Spain, in May 2002).

Towards automatic evaluation of Question/Answering systems

PREVETE, ROBERTO;
2002

ISBN: 2951740808
ISBN: 9782951740808

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/493035