Generalizing while preserving monotonicity in comparison-based preference learning models

📅 2025-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing comparison-based preference learning models, including those used for LLM preference modeling, often lack a monotonicity guarantee: when a user explicitly prefers item *a* over *b* (*a* ≻ *b*), the model does not necessarily assign a higher score to *a* and a lower score to *b*. The Generalized Bradley–Terry (GBT) models are the only ones previously proved monotone, but they cannot generalize to uncompared items. Method: The paper proposes a new class of Linear GBT models with diffusion priors and identifies verifiable sufficient conditions on item embeddings under which these generalizing models remain monotone. Contribution/Results: Experiments show that monotonicity is far from a general guarantee in existing models, and that the proposed class improves predictive accuracy over baselines, especially in low-data regimes, while offering monotonicity guarantees.

📝 Abstract
If you tell a learning model that you prefer an alternative $a$ over another alternative $b$, then you probably expect the model to be monotone, that is, the valuation of $a$ increases, and that of $b$ decreases. Yet, perhaps surprisingly, many widely deployed comparison-based preference learning models, including large language models, fail to have this guarantee. Until now, the only comparison-based preference learning algorithms that were proved to be monotone are the Generalized Bradley-Terry models. Yet, these models are unable to generalize to uncompared data. In this paper, we advance the understanding of the set of models with generalization ability that are monotone. Namely, we propose a new class of Linear Generalized Bradley-Terry models with Diffusion Priors, and identify sufficient conditions on alternatives' embeddings that guarantee monotonicity. Our experiments show that this monotonicity is far from being a general guarantee, and that our new class of generalizing models improves accuracy, especially when the dataset is limited.
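The tension the abstract describes can be made concrete with a small sketch (illustrative only; the embeddings and learning rate below are hypothetical, not from the paper). A classical, tabular Bradley-Terry model with one score per item is monotone: a gradient step on the comparison "a ≻ b" strictly raises a's score and lowers b's. A linear model that scores items through shared embeddings can generalize to uncompared items, but the same update is no longer guaranteed monotone:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a, b, lr = 0, 1, 0.5

# --- Tabular Bradley-Terry: one score theta[i] per item ---
# A gradient-ascent step on log P(a beats b) strictly raises theta[a]
# and strictly lowers theta[b]: the update is monotone by construction.
theta = np.zeros(2)
p = sigmoid(theta[a] - theta[b])          # P(a beats b)
theta[a] += lr * (1.0 - p)
theta[b] -= lr * (1.0 - p)

# --- Linear model: score(i) = X[i] @ w (can generalize to new items) ---
# The analogous step on the shared weights w is NOT guaranteed monotone.
# With these hypothetical overlapping embeddings, a's score *drops*
# after the user says they prefer a over b.
X = np.array([[1.0, 0.0],                 # embedding of item a
              [2.0, 0.0]])                # embedding of item b
w = np.zeros(2)
p = sigmoid(X[a] @ w - X[b] @ w)
w = w + lr * (1.0 - p) * (X[a] - X[b])    # gradient ascent on w
print(X[a] @ w)                           # negative: a's score fell below 0
```

In the linear case the score *gap* between a and b still grows, but a's individual score moves by an amount proportional to X[a] · (X[a] − X[b]), which can be negative. Conditions on the embeddings that rule this out are exactly what the paper's sufficient conditions characterize.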
Problem

Research questions and friction points this paper is trying to address.

Ensuring monotonicity in preference learning models
Generalizing models to uncompared data effectively
Improving accuracy with limited dataset conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear Generalized Bradley-Terry models
Diffusion Priors for embeddings
Monotonicity guarantees in preference learning