🤖 AI Summary
This paper addresses privacy risks in learning to rank from noisy and incomplete pairwise comparison data, a setting that arises in recommendation systems, educational assessment, and similar applications. It presents the first systematic study of differential privacy for pairwise ranking, covering edge-level privacy (protecting individual comparison outcomes) and user-level privacy (protecting all comparisons contributed by a single user). The paper proposes two private ranking algorithms, one based on perturbed maximum likelihood estimation and another on noisy counting, both of which achieve minimax-optimal convergence rates under their respective privacy constraints. Theoretical guarantees of statistical optimality are established, and the privacy–utility trade-off is validated through extensive simulations and real-world experiments. The core contributions are threefold: (i) a unified modeling framework accommodating both edge- and user-level privacy requirements; (ii) privacy mechanisms designed to match optimal statistical rates; and (iii) the first differentially private ranking framework that simultaneously provides rigorous theoretical guarantees and empirical effectiveness.
📝 Abstract
Data privacy is a central concern in many applications involving ranking from incomplete and noisy pairwise comparisons, such as recommendation systems, educational assessments, and opinion surveys on sensitive topics. In this work, we propose differentially private algorithms for ranking based on pairwise comparisons. Specifically, we develop and analyze ranking methods under two privacy notions: edge differential privacy, which protects the confidentiality of individual comparison outcomes, and individual differential privacy, which safeguards potentially many comparisons contributed by a single individual. Our algorithms, including a perturbed maximum likelihood estimator and a noisy count-based method, are shown to achieve minimax optimal rates of convergence under the respective privacy constraints. We further demonstrate the practical effectiveness of our methods through experiments on both simulated and real-world data.
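To make the count-based idea concrete, the following is a minimal illustrative sketch (not the paper's exact algorithm) of how a noisy count-based ranking under edge differential privacy might look: each item's win count is perturbed with Laplace noise calibrated to the sensitivity of a single comparison, and items are ranked by the noisy counts. The function name and interface are hypothetical.

```python
import numpy as np

def private_count_ranking(comparisons, n_items, epsilon, seed=None):
    """Rank items from pairwise comparisons via Laplace-noised win counts.

    Illustrative sketch only. Flipping one comparison (winner, loser)
    changes two win counts by 1 each, so the L1 sensitivity of the win-count
    vector is 2; adding Laplace(2/epsilon) noise to each count therefore
    gives epsilon-edge-differential privacy by the Laplace mechanism.
    """
    rng = np.random.default_rng(seed)
    wins = np.zeros(n_items)
    for winner, _loser in comparisons:
        wins[winner] += 1
    noisy_wins = wins + rng.laplace(scale=2.0 / epsilon, size=n_items)
    # Items ordered from highest to lowest noisy win count.
    return list(np.argsort(-noisy_wins))
```

Smaller values of `epsilon` inject more noise and so degrade the ranking, which is the privacy–utility trade-off the paper quantifies with minimax rates.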