We report on the organization of the IDIAL (Evaluation of Italian DIALogue systems) task at EVALITA 2018, the first shared task aimed at assessing the interactive characteristics of conversational agents for the Italian language. IDIAL treats a dialogue system as a "black box" (i.e., the evaluation cannot access the internal components of the system) and measures system performance along three dimensions: task completion, effectiveness of the dialogue, and user satisfaction. We describe the IDIAL evaluation protocol and show how it was applied to the three participating systems. Finally, we briefly discuss current limitations and future improvements of the IDIAL methodology.

Overview of the EVALITA 2018 Evaluation of Italian Dialogue Systems (IDIAL) Task / Cutugno, Francesco; Di Maro, Maria; Falcone, Sara; Guerini, Marco; Magnini, Bernardo; Origlia, Antonio. - 12:(2018). (Paper presented at the conference Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA)).

Overview of the EVALITA 2018 Evaluation of Italian Dialogue Systems (IDIAL) Task

Cutugno, Francesco; Di Maro, Maria; Origlia, Antonio
2018


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/962993