Clustering and prediction of rankings within a Kemeny distance framework

Heiser, W. J.; D'Ambrosio, Antonio

doi:10.1007/978-3-319-00035-0-2

Rankings and partial rankings are ubiquitous in data analysis, yet there is relatively little work in the classification community that uses the typical properties of rankings. We review the broader literature that we are aware of, and identify a common building block for both prediction of rankings and clustering of rankings, which is also valid for partial rankings. This building block is the Kemeny distance, defined as the minimum number of interchanges of two adjacent elements required to transform one (partial) ranking into another. The Kemeny distance is equivalent to Kendall’s τ for complete rankings, but for partial rankings it is equivalent to Emond and Mason’s extension of τ. For clustering, we use the flexible class of methods proposed by Ben-Israel and Iyigun, and define the disparity between a ranking and the center of a cluster as the Kemeny distance. For prediction, we build a prediction tree by recursive partitioning, and define the impurity measure of the subgroups formed as the sum of all within-node Kemeny distances. The median ranking characterizes subgroups in both cases.
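
As an illustration of the building block described above, the following is a minimal Python sketch and not the authors' implementation. It assumes the score-matrix form of the Kemeny distance due to Kemeny and Snell, in which each pair of items contributes 2 when the two rankings order it oppositely and 1 when one ranking ties the pair and the other does not; this scaling may differ by a constant factor from a count of adjacent interchanges. Rankings are assumed to be given as sequences of rank positions, with equal positions denoting ties. The names `kemeny_distance` and `median_ranking` are hypothetical, and the brute-force median search considers only complete rankings without ties, so it is practical for a handful of items at most.

```python
from itertools import combinations, permutations


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kemeny_distance(a, b):
    """Kemeny distance between two rankings given as rank positions.

    a[i] is the position of item i in the first ranking (1 = best);
    equal positions denote ties. Assumed scaling: an oppositely
    ordered pair counts 2, a tie against a strict order counts 1.
    """
    if len(a) != len(b):
        raise ValueError("rankings must cover the same items")
    return sum(
        abs(sign(a[i] - a[j]) - sign(b[i] - b[j]))
        for i, j in combinations(range(len(a)), 2)
    )


def median_ranking(rankings):
    """Brute-force median: the complete ranking (no ties) that
    minimizes the summed Kemeny distance to the given rankings.
    If several rankings attain the minimum, the first one in
    permutation order is returned."""
    n = len(rankings[0])
    candidates = permutations(range(1, n + 1))
    return min(
        candidates,
        key=lambda c: sum(kemeny_distance(c, r) for r in rankings),
    )


if __name__ == "__main__":
    print(kemeny_distance([1, 2, 3], [2, 1, 3]))  # one adjacent swap
    print(kemeny_distance([1, 2, 3], [1, 1, 3]))  # one pair tied
    print(median_ranking([(1, 2, 3), (1, 2, 3), (2, 1, 3)]))
```

Under the assumed scaling, the summed distance minimized in `median_ranking` is the kind of quantity that can serve both as a cluster disparity and as a node impurity; an exhaustive search is used here only because it is easy to verify.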

Clustering and prediction of rankings within a Kemeny distance framework / Heiser, W.J., D'Ambrosio, A.. - (2013), pp. 19-31. [10.1007/978-3-319-00035-0-2]