Cold-Start Personalization via Training-Free Priors from Structured World Models

📅 2026-02-16
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of personalization in cold-start scenarios where user historical data is unavailable. The authors propose Pep, a modular and training-free framework that decouples preference elicitation into offline structural learning and online Bayesian inference. By learning a structured world model from complete user profiles offline, Pep performs training-free Bayesian inference at runtime to dynamically select the most informative queries and predict full preferences. This approach is the first to effectively exploit the factorized structure of preference data, avoiding the policy collapse common in reinforcement learning (RL) methods that often degenerate into static questioning sequences. Experiments show that Pep achieves an 80.8% preference alignment rate—outperforming RL’s 68.5%—while reducing interaction rounds by 3–5×. It dynamically adapts follow-up questions based on user responses in 62% of cases (versus ≤28% for RL) and uses only ~10K parameters compared to RL’s 8B.

Technology Category

Application Category

📝 Abstract
Cold-start personalization requires inferring user preferences through interaction when no user-specific historical data is available. The core challenge is a routing problem: each task admits dozens of preference dimensions, yet individual users care about only a few, and which ones matter depends on who is asking. With a limited question budget, asking without structure will miss the dimensions that matter. Reinforcement learning is the natural formulation, but in multi-turn settings its terminal reward fails to exploit the factored, per-criterion structure of preference data, and in practice learned policies collapse to static question sequences that ignore user responses. We propose decomposing cold-start elicitation into offline structure learning and online Bayesian inference. Pep (Preference Elicitation with Priors) learns a structured world model of preference correlations offline from complete profiles, then performs training-free Bayesian inference online to select informative questions and predict complete preference profiles, including dimensions never asked about. The framework is modular across downstream solvers and requires only simple belief models. Across medical, mathematical, social, and commonsense reasoning, Pep achieves 80.8% alignment between generated responses and users'stated preferences versus 68.5% for RL, with 3-5x fewer interactions. When two users give different answers to the same question, Pep changes its follow-up 39-62% of the time versus 0-28% for RL. It does so with ~10K parameters versus 8B for RL, showing that the bottleneck in cold-start elicitation is the capability to exploit the factored structure of preference data.
Problem

Research questions and friction points this paper is trying to address.

cold-start personalization
preference elicitation
structured world models
preference dimensions
interactive inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

cold-start personalization
structured world models
Bayesian inference
preference elicitation
training-free priors
🔎 Similar Papers
No similar papers found.