Score-Based Density Estimation from Pairwise Comparisons

📅 2025-10-10
🤖 AI Summary
This paper addresses the problem of estimating a target density from pairwise comparison data, a setting that arises in expert knowledge elicitation and learning from human feedback. It proposes a density estimation framework built on score matching and tempering. First, the authors prove that the score vectors of the belief (target) density and the winner density (the marginal density of preferred choices) are collinear, linked by a position-dependent tempering field for which they give analytical formulas. Second, under the Bradley–Terry model, they propose an estimator for this field and combine it with score-scaled annealed Langevin dynamics, which generates tempered samples, and a diffusion model trained on those samples to reconstruct the target density. In experiments with simulated experts, the method learns complex multivariate belief densities from only hundreds to thousands of pairwise comparisons.

📝 Abstract
We study density estimation from pairwise comparisons, motivated by expert knowledge elicitation and learning from human feedback. We relate the unobserved target density to a tempered winner density (marginal density of preferred choices), learning the winner's score via score matching. This allows estimating the target by 'de-tempering' the estimated winner density's score. We prove that the score vectors of the belief and the winner density are collinear, linked by a position-dependent tempering field. We give analytical formulas for this field and propose an estimator for it under the Bradley-Terry model. Using a diffusion model trained on tempered samples generated via score-scaled annealed Langevin dynamics, we can learn complex multivariate belief densities of simulated experts, from only hundreds to thousands of pairwise comparisons.
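
To make the collinearity claim concrete, here is a short worked derivation under simplifying assumptions that are not taken from the paper: candidates x and x' are drawn independently and uniformly from a bounded domain Ω, the expert follows a Bradley-Terry model with the belief density p as the strength, and differentiation under the integral is allowed. The symbols g and τ are illustrative, and the paper's own definition of the tempering field may differ (for instance, it may be the reciprocal of the τ used here).

```latex
% Bradley-Terry preference with the belief density p as strength (assumed setup)
P(x \succ x') = \frac{p(x)}{p(x) + p(x')}

% Winner density under uniform candidate sampling on \Omega: a function of p(x) only
p_w(x) = \frac{2}{|\Omega|^2} \int_\Omega \frac{p(x)}{p(x) + p(x')} \, dx'
       = g\bigl(p(x)\bigr),
\qquad
g(u) = \frac{2}{|\Omega|^2} \int_\Omega \frac{u}{u + p(x')} \, dx'

% Chain rule: the winner score is a scalar multiple of the belief score
\nabla \log p_w(x) = \frac{g'\bigl(p(x)\bigr)}{g\bigl(p(x)\bigr)} \, \nabla p(x)
                   = \tau(x) \, \nabla \log p(x)

% Position-dependent scalar field (illustrative notation)
\tau(x) = \frac{\displaystyle \int_\Omega \frac{p(x)\, p(x')}{\bigl(p(x) + p(x')\bigr)^2} \, dx'}
               {\displaystyle \int_\Omega \frac{p(x)}{p(x) + p(x')} \, dx'}
```

In this setting 0 < τ(x) < 1, because the integrand of the numerator equals the integrand of the denominator multiplied by p(x')/(p(x) + p(x')) < 1. The winner score is therefore a shrunken copy of the belief score, and dividing it by τ(x) is one way to read 'de-tempering'.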

Problem

Research questions and friction points this paper is trying to address.

Estimating target density from pairwise preference comparisons
Learning winner score via score-matching techniques
Recovering belief densities from tempered winner distributions
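
The data-generating setup behind these questions can be sketched as follows. This is a minimal simulation, not the authors' code: the function names `simulate_comparisons` and `log_std_normal`, the uniform candidate sampling on [-4, 4], and the standard normal stand-in for the expert's belief density are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_comparisons(log_density, n_pairs, low, high, rng):
    """Simulate Bradley-Terry comparisons between uniformly drawn 1D candidates.

    Hypothetical helper: the candidate distribution (uniform on [low, high])
    is an assumption, not something specified in the summary above.
    """
    # Each row holds one pair of candidates (x0, x1).
    x = rng.uniform(low, high, size=(n_pairs, 2))
    logp = log_density(x)
    # Bradley-Terry: P(x0 beats x1) = p0 / (p0 + p1) = sigmoid(log p0 - log p1).
    prob_first = 1.0 / (1.0 + np.exp(logp[:, 1] - logp[:, 0]))
    first_wins = rng.random(n_pairs) < prob_first
    winners = np.where(first_wins, x[:, 0], x[:, 1])
    losers = np.where(first_wins, x[:, 1], x[:, 0])
    return winners, losers

def log_std_normal(x):
    # Unnormalized log density of N(0, 1); the normalizer cancels in the ratio.
    return -0.5 * x**2

rng = np.random.default_rng(0)
winners, losers = simulate_comparisons(log_std_normal, 2000, -4.0, 4.0, rng)
```

The `winners` array is a sample from the winner density, which is the only distribution observed directly. It concentrates near the mode of the belief density but is flatter than the belief density itself, which is why a de-tempering step is needed.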

Innovation

Methods, ideas, or system contributions that make the work stand out.

Score matching learns the winner density's score from comparisons
Analytical tempering field links winner and target densities
Diffusion model is trained on tempered samples generated via score-scaled annealed Langevin dynamics
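
The effect of scaling a score inside Langevin dynamics can be illustrated with a toy example. The sketch below uses plain unadjusted Langevin dynamics with a constant scale factor, not the annealed, position-dependent scheme the paper describes. The function names `scaled_langevin` and `std_normal_score`, the step size, and the Gaussian target are assumptions made for illustration.

```python
import numpy as np

def scaled_langevin(score, scale, n_samples, n_steps, step, rng):
    """Unadjusted Langevin dynamics driven by a constant multiple of a score.

    Hypothetical helper: multiplying the score of p by `scale` targets the
    tempered density proportional to p**scale (up to discretization error).
    """
    x = rng.standard_normal(n_samples)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_samples)
        # Drift along the scaled score plus Gaussian noise of variance 2 * step.
        x = x + step * scale * score(x) + np.sqrt(2.0 * step) * noise
    return x

def std_normal_score(x):
    # Score (gradient of the log density) of N(0, 1).
    return -x

rng = np.random.default_rng(0)
samples = scaled_langevin(
    std_normal_score, scale=4.0, n_samples=5000, n_steps=500, step=0.01, rng=rng
)
```

For a standard normal, scaling the score by 4 targets N(0, 1/4), so the sample variance should come out close to 0.25, with a small upward bias from the finite step size. Replacing the constant `scale` with a position-dependent field is the natural extension toward the tempering field described above.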