Beyond Kemeny Medians: Consensus Ranking Distributions Definition, Properties and Statistical Learning

📅 2026-02-11
🤖 AI Summary
This work addresses the problem of summarizing ranking distributions, i.e., probability distributions over the symmetric group, by proposing the Consensus Ranking Distribution (CRD) model, which approximates the target distribution by a sparse mixture of Dirac measures. The approach builds on local ranking medians, uses the Kendall τ distance as the transport cost, and relies on a top-down tree-structured algorithm that progressively refines the approximation. The theoretical analysis shows that the optimal transport distortion can be expressed in terms of pairwise ranking probabilities, which sidesteps the absence of a vector space structure on the symmetric group. Numerical experiments support the relevance of the proposed algorithm.
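As a minimal illustration of the objects named in this summary, the sketch below computes the Kendall τ distance between two rankings and finds a Kemeny median by exhaustive search. The function names (`kendall_tau`, `kemeny_median`), the encoding of a ranking as a tuple whose entry i is the rank of item i, and the toy sample are all assumptions made for illustration. The brute-force search is only feasible for very small n and is not the algorithm proposed in the paper.

```python
from itertools import combinations, permutations


def kendall_tau(sigma, pi):
    """Count the item pairs that sigma and pi rank in opposite order.

    Assumed encoding: sigma[i] is the rank given to item i (no ties).
    """
    return sum(
        1
        for i, j in combinations(range(len(sigma)), 2)
        if (sigma[i] - sigma[j]) * (pi[i] - pi[j]) < 0
    )


def kemeny_median(rankings):
    """Return a ranking minimizing the summed Kendall tau distance to the sample.

    Exhaustive search over all n! rankings, so usable only for tiny n.
    """
    n = len(rankings[0])
    return min(
        permutations(range(n)),
        key=lambda s: sum(kendall_tau(s, r) for r in rankings),
    )


# Hypothetical toy sample of three rankings over three items.
sample = [(0, 1, 2), (0, 1, 2), (1, 0, 2)]
median = kemeny_median(sample)
print(median, sum(kendall_tau(median, r) for r in sample))
```

In CRD terms, the Dirac mass at such a median corresponds to the root of the tree mentioned above; the finer approximations use several local medians instead of one.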

📝 Abstract
In this article we develop a new method for summarizing a ranking distribution, i.e., a probability distribution on the symmetric group $\mathfrak{S}_n$, beyond the classical theory of consensus and Kemeny medians. Based on the notion of local ranking median, we introduce the concept of consensus ranking distribution (CRD), a sparse mixture model of Dirac masses on $\mathfrak{S}_n$, in order to approximate a ranking distribution with small distortion from a mass transportation perspective. We prove that by choosing the popular Kendall $\tau$ distance as the cost function, the optimal distortion can be expressed as a function of pairwise probabilities, paving the way for the development of efficient learning methods that do not suffer from the lack of vector space structure on $\mathfrak{S}_n$. In particular, we propose a top-down tree-structured statistical algorithm that allows for the progressive refinement of a CRD based on ranking data, from the Dirac mass at a Kemeny median at the root of the tree to the empirical ranking data distribution itself at the end of the tree's exhaustive growth. In addition to the theoretical arguments developed, the relevance of the algorithm is empirically supported by various numerical experiments.
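To make the role of pairwise probabilities concrete, consider the simplest case where the approximating distribution is a single Dirac mass $\delta_\sigma$. The only coupling then sends all mass to $\sigma$, so the transport cost under the Kendall $\tau$ distance is the expected distance $\mathbb{E}[d_\tau(\Sigma, \sigma)]$, which decomposes into a sum over item pairs of the probability that $\Sigma$ and $\sigma$ disagree on the pair. The sketch below checks this identity numerically on a made-up three-item distribution. The function names, the rank-tuple encoding, and the weights are assumptions for illustration only, and the code covers this single-atom case rather than the paper's general CRD result.

```python
from itertools import combinations, permutations


def kendall_tau(sigma, pi):
    """Number of item pairs ordered differently by sigma and pi (rank tuples, no ties)."""
    return sum(
        1
        for i, j in combinations(range(len(sigma)), 2)
        if (sigma[i] - sigma[j]) * (pi[i] - pi[j]) < 0
    )


def pairwise_probs(rankings, weights):
    """p[(i, j)] = probability that item i is ranked before item j, for i < j."""
    n = len(rankings[0])
    return {
        (i, j): sum(w for r, w in zip(rankings, weights) if r[i] < r[j])
        for i, j in combinations(range(n), 2)
    }


def distortion_from_pairs(sigma, p):
    """Expected Kendall tau distance to sigma, using pairwise probabilities only."""
    # If sigma puts j before i, the pair is discordant whenever the random
    # ranking puts i before j (probability p_ij); otherwise with 1 - p_ij.
    return sum(
        p_ij if sigma[i] > sigma[j] else 1.0 - p_ij
        for (i, j), p_ij in p.items()
    )


def distortion_direct(sigma, rankings, weights):
    """Expected Kendall tau distance to sigma, computed from the full distribution."""
    return sum(w * kendall_tau(sigma, r) for r, w in zip(rankings, weights))


# Hypothetical distribution on three items: support and weights are made up.
rankings = [(0, 1, 2), (1, 0, 2), (2, 1, 0)]
weights = [0.5, 0.3, 0.2]
p = pairwise_probs(rankings, weights)

for sigma in permutations(range(3)):
    via_pairs = distortion_from_pairs(sigma, p)
    direct = distortion_direct(sigma, rankings, weights)
    print(sigma, round(via_pairs, 6), round(direct, 6))
```

The pairwise route needs only the n(n-1)/2 probabilities rather than the full distribution on $\mathfrak{S}_n$, which is the practical appeal of expressing the distortion this way.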
Problem

Research questions and friction points this paper is trying to address.

consensus ranking
ranking distribution
Kemeny median
symmetric group
Kendall tau distance
Innovation

Methods, ideas, or system contributions that make the work stand out.

consensus ranking distribution
local ranking median
Kendall tau distance
mass transportation
tree-structured learning
Stephan Clémençon
LTCI, Télécom Paris, Institut Polytechnique de Paris
Ekhine Irurozki
Telecom Paris
Machine Learning, Optimization, Ranking, Permutation, Computational Social Choice