Overview of the EVALITA 2018 Evaluation of Italian DIALogue Systems (IDIAL) Task

Cutugno, Francesco; Di Maro, Maria; Falcone, Sara; Guerini, Marco; Magnini, Bernardo; Origlia, Antonio

We report on the organization of the IDIAL (Evaluation of Italian DIALogue systems) task at EVALITA 2018, the first shared task aimed at assessing the interactive characteristics of conversational agents for the Italian language. From this perspective, IDIAL treats a dialogue system as a "black box" (i.e., the evaluation cannot access the internal components of the system) and measures system performance along three dimensions: task completion, effectiveness of the dialogue, and user satisfaction. We describe the IDIAL evaluation protocol and show how it was applied to the three participating systems. Finally, we briefly discuss current limitations and future improvements of the IDIAL methodology.

Overview of the EVALITA 2018 Evaluation of Italian DIALogue Systems (IDIAL) Task / Cutugno, F., Di Maro, M., Falcone, S., Guerini, M., Magnini, B., Origlia, A. - 12:(2018). (Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA)).