Rankings and partial rankings are ubiquitous in data analysis, yet there is relatively little work in the classification community that uses the typical properties of rankings. We review the broader literature that we are aware of, and identify a common building block for both prediction of rankings and clustering of rankings, which is also valid for partial rankings. This building block is the Kemeny distance, defined as the minimum number of interchanges of two adjacent elements required to transform one (partial) ranking into another. The Kemeny distance is equivalent to Kendall’s τ for complete rankings, but for partial rankings it is equivalent to Emond and Mason’s extension of the τ index. For clustering, we use the flexible class of methods proposed by Ben-Israel and Iyigun, and define the disparity between a ranking and the center of a cluster as the Kemeny distance. For prediction, we build a prediction tree by recursive partitioning, and define the impurity measure of the subgroups formed as the sum of all within-node Kemeny distances. The median ranking characterizes subgroups in both cases.
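
As an illustration of the distance described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes rankings are given as rank vectors (lower value means more preferred, equal values mean a tie) and uses the pairwise score-matrix form usually attributed to Kemeny and Snell. Under this convention one adjacent interchange in a complete ranking contributes 2 to the distance, so the count of interchanges is half the value returned for complete rankings. The function names `kemeny_distance` and `tau_x` are hypothetical, and the relation checked at the end, τ_x = 1 − 2d / (n(n − 1)), is the one commonly stated for Emond and Mason's index.

```python
from itertools import combinations


def _sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kemeny_distance(a, b):
    """Kemeny-Snell distance between two rank vectors (ties allowed).

    Sums |sign(a_i - a_j) - sign(b_i - b_j)| over all unordered pairs,
    which equals half the sum over all ordered pairs.
    """
    if len(a) != len(b):
        raise ValueError("rank vectors must have the same length")
    return sum(
        abs(_sign(a[i] - a[j]) - _sign(b[i] - b[j]))
        for i, j in combinations(range(len(a)), 2)
    )


def tau_x(a, b):
    """Tau extension in the style of Emond and Mason (assumed scoring).

    Score is +1 when item i is ranked ahead of or tied with item j,
    and -1 otherwise; the diagonal is ignored.
    """
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two rank vectors of equal length >= 2")
    total = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                score_a = 1 if a[i] <= a[j] else -1
                score_b = 1 if b[i] <= b[j] else -1
                total += score_a * score_b
    return total / (n * (n - 1))


if __name__ == "__main__":
    full_a = [1, 2, 3]      # complete ranking
    full_b = [1, 3, 2]      # one adjacent interchange away from full_a
    partial = [1, 2, 2]     # partial ranking: last two items tied
    for x, y in [(full_a, full_b), (full_a, partial)]:
        d = kemeny_distance(x, y)
        n = len(x)
        print(d, tau_x(x, y), 1 - 2 * d / (n * (n - 1)))
```

The within-node impurity mentioned for the prediction tree could then be sketched as the sum of `kemeny_distance` over the rankings in a node, again under the same assumed convention.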

Clustering and prediction of rankings within a Kemeny distance framework / Heiser, W. J.; D'Ambrosio, Antonio. - (2013), pp. 19-31. [10.1007/978-3-319-00035-0_2]

Clustering and prediction of rankings within a Kemeny distance framework

D'AMBROSIO, ANTONIO
2013

9783319000343


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/538486