🤖 AI Summary
Traditional preference learning often skews toward majority opinions when aggregating multi-annotator judgments, producing policies that under-represent minority groups. This paper proposes a population-proportional preference learning framework: first, it infers the feasible set of evaluator population distributions directly from pairwise comparison data; second, it constructs an aggregation policy satisfying monotonicity and Pareto efficiency, along with two newly introduced axioms, population-proportional representation and population-bounded robustness; third, it employs a soft-max relaxation that smoothly trades off proportional representation against selection of the Condorcet winner. The approach integrates axiomatic social choice modeling, data-driven distribution inference, and relaxed optimization. Empirical evaluation on tabular recommendation and large language model alignment tasks demonstrates notable improvements in minority-group preference representation, achieving a balanced trade-off between fairness and consensus robustness.
📝 Abstract
Conventional preference learning methods often prioritize opinions held more widely when aggregating preferences from multiple evaluators. This may result in policies that are biased in favor of some types of opinions or groups. The objective of this paper is to develop a novel preference learning framework capable of aligning aggregate opinions and policies proportionally with the true population distribution of evaluator preferences. Our approach infers the feasible set of evaluator population distributions directly from pairwise comparison data. Using these estimates, the algorithm constructs a policy that satisfies foundational axioms from social choice theory, namely monotonicity and Pareto efficiency, as well as our newly introduced axioms of population-proportional representation and population-bounded robustness. We propose a soft-max relaxation method that smoothly trades off population-proportional representation against selection of the Condorcet winner (the option that beats all others in pairwise comparisons). Finally, we validate the effectiveness and scalability of our approach through experiments on both tabular recommendation tasks and large-scale language model alignment.
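To make the soft-max relaxation idea concrete, here is a minimal illustrative sketch, not the paper's actual formulation: a temperature parameter interpolates between near-deterministic selection of the Condorcet winner (low temperature) and a smoother policy that spreads probability mass across options in proportion to their pairwise support (high temperature). The function names, the net-margin score, and the specific win matrix are all assumptions made for illustration.

```python
import numpy as np

def softmax(scores, tau):
    """Numerically stable soft-max with temperature tau."""
    z = (scores - scores.max()) / tau
    e = np.exp(z)
    return e / e.sum()

def relaxed_policy(win_matrix, tau):
    """Hypothetical relaxed aggregation over pairwise comparison data.

    win_matrix[i, j] = fraction of evaluators preferring option i to j.
    Each option is scored by its net pairwise margin; tau -> 0 concentrates
    the policy on the Condorcet winner (when one exists), while large tau
    yields a smoother distribution over options.
    """
    margins = (win_matrix - win_matrix.T).sum(axis=1)  # net pairwise margins
    return softmax(margins, tau)

# Three options; option 0 beats both others pairwise (Condorcet winner).
W = np.array([[0.50, 0.60, 0.70],
              [0.40, 0.50, 0.55],
              [0.30, 0.45, 0.50]])

sharp = relaxed_policy(W, tau=0.01)   # nearly all mass on option 0
smooth = relaxed_policy(W, tau=10.0)  # close to uniform over options
```

In the paper's setting, the score would instead be tied to the inferred population distribution so that the high-temperature end recovers proportional representation rather than plain uniformity; the temperature then governs the fairness-versus-consensus trade-off described above.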