We investigate the problem of training probabilistic context-free grammars on the basis of a distribution defined over an infinite set of trees, by minimizing the cross-entropy. This problem can be seen as a generalization of the well-known maximum likelihood estimator on (finite) tree banks. We prove an unexpected theoretical property of grammars that are trained in this way, namely, we show that the derivational entropy of the grammar takes the same value as the cross-entropy between the input distribution and the grammar itself. We show that the result also holds for the widely applied maximum likelihood estimator on tree banks.
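The equality claimed for the maximum likelihood estimator can be checked numerically on a toy tree bank. The following is a minimal sketch and not the authors' method: the tree encoding (nested tuples), the function names `rules_of`, `mle`, `cross_entropy` and `derivational_entropy`, the four-tree example bank, and the fixed-point iteration used to obtain expected nonterminal counts are all assumptions made for illustration. It further assumes that the estimated grammar is consistent, so that the iteration converges.

```python
import math
from collections import Counter, defaultdict

def rules_of(tree):
    """Yield (lhs, rhs) for each rule occurrence in a tree.
    Assumed encoding: a tree is (label, child, ...); a child is either
    a subtree tuple or a terminal string."""
    label, *children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    yield (label, rhs)
    for c in children:
        if isinstance(c, tuple):
            yield from rules_of(c)

def mle(treebank):
    """Relative-frequency (maximum likelihood) estimate of rule probabilities."""
    counts = Counter(r for t in treebank for r in rules_of(t))
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {r: n / lhs_totals[r[0]] for r, n in counts.items()}

def cross_entropy(treebank, probs):
    """Cross-entropy (in nats) between the empirical tree distribution,
    which gives each tree occurrence mass 1/len(treebank), and the grammar."""
    total = sum(math.log(probs[r]) for t in treebank for r in rules_of(t))
    return -total / len(treebank)

def derivational_entropy(probs, start, iterations=500):
    """Entropy (in nats) of the grammar's distribution over derivations:
    sum over nonterminals A of E[#A in a derivation] * H(rules for A).
    Expected counts are accumulated depth by depth (assumes consistency)."""
    nonterminals = {lhs for lhs, _ in probs}
    expected = defaultdict(float)
    frontier = {start: 1.0}  # expected nonterminal occurrences at current depth
    for _ in range(iterations):
        nxt = defaultdict(float)
        for a, mass in frontier.items():
            expected[a] += mass
            for (lhs, rhs), p in probs.items():
                if lhs == a:
                    for sym in rhs:
                        if sym in nonterminals:
                            nxt[sym] += mass * p
        frontier = nxt
    return -sum(expected[lhs] * p * math.log(p) for (lhs, _), p in probs.items())

# Hypothetical tree bank: one tree using S -> S S, three trees using S -> a.
treebank = [("S", ("S", "a"), ("S", "a")), ("S", "a"), ("S", "a"), ("S", "a")]
probs = mle(treebank)
h_cross = cross_entropy(treebank, probs)
h_deriv = derivational_entropy(probs, "S")
print(h_cross, h_deriv)
```

In this example the two printed values agree up to floating-point error, because the expected number of occurrences of S under the estimated grammar (1.5) coincides with its average count per tree in the bank.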

Cross-entropy and estimation of probabilistic context-free grammars / Corazza, Anna; Satta, G. - Print. - (2006), pp. 335-342. (Paper presented at the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, held in New York, 4-6 June 2006) [10.3115/1220835.1220878].

Cross-entropy and estimation of probabilistic context-free grammars

CORAZZA, ANNA;
2006


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/117511