Contextual Preference Distribution Learning

📅 2026-03-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional approaches struggle to capture the heterogeneity and uncertainty of human preferences across diverse contexts, particularly underperforming in risk-averse decision-making scenarios. This work proposes a context-aware sequential learning and optimization framework that, for the first time, integrates parametric preference distribution modeling with risk-averse integer linear programming. By employing a bounded-variance score-function gradient estimator, the framework trains a maximum likelihood model to generate contextualized preference distribution scenarios for unseen situations. In synthetic ridesharing experiments, the proposed method reduces post-decision surprise by up to 114-fold compared to an idealized risk-neutral approach with perfect predictions, and by as much as 25-fold relative to state-of-the-art risk-averse baselines, substantially enhancing decision robustness.

Technology Category

Application Category

📝 Abstract
Decision-making problems often feature uncertainty stemming from heterogeneous and context-dependent human preferences. To address this, we propose a sequential learning-and-optimization pipeline to learn preference distributions and leverage them to solve downstream problems, for example risk-averse formulations. We focus on human choice settings that can be formulated as (integer) linear programs. In such settings, existing inverse optimization and choice modelling methods infer preferences from observed choices but typically produce point estimates or fail to capture contextual shifts, making them unsuitable for risk-averse decision-making. Using a bounded-variance score function gradient estimator, we train a predictive model mapping contextual features to a rich class of parameterizable distributions. This approach yields a maximum likelihood estimate. The model generates scenarios for unseen contexts in the subsequent optimization phase. In a synthetic ridesharing environment, our approach reduces average post-decision surprise by up to 114$\times$ compared to a risk-neutral approach with perfect predictions and up to 25$\times$ compared to leading risk-averse baselines.
Problem

Research questions and friction points this paper is trying to address.

contextual preferences
preference uncertainty
risk-averse decision-making
human choice modeling
inverse optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

contextual preference learning
distributional modeling
risk-averse optimization
score function gradient estimator
inverse optimization
🔎 Similar Papers
No similar papers found.
B
Benjamin Hudson
Mila – Quebec Artificial Intelligence Institute, Montreal, Quebec, Canada; Département d’informatique et de recherche opérationnelle (DIRO), Université de Montréal, Montreal, Quebec, Canada
Laurent Charlin
Laurent Charlin
Associate Professor, HEC Montréal & Mila, Canada CIFAR AI Chair
Machine LearningArtificial Intelligence
Emma Frejinger
Emma Frejinger
Professor, Department of Computer Science and Operations Research, Université de Montréal
Statistical LearningOperations ResearchTransportation ScienceDemand Forecasting