Inference-Time Personalized Alignment with a Few User Preference Queries

📅 2025-11-04
🤖 AI Summary
This work addresses the high annotation cost and reliance on textual input in personalizing generative models via preference alignment. We propose an efficient alignment method that uses only a few pairwise response comparisons at inference time. The core idea is to treat user feedback as consistent and noise-free, and to incorporate that assumption into best-arm identification for logistic bandits, so that a personalized response can be selected quickly from a fixed pool of candidate responses. The method does not require users to provide textual feedback: only 3–5 binary comparisons suffice to significantly improve output alignment with individual preferences. We validate its effectiveness across diverse tasks, including text summarization and image generation. Compared to baseline methods, our approach reduces query complexity by over 70% while maintaining or even improving alignment quality.

📝 Abstract
We study the problem of aligning a generative model's response with a user's preferences. Recent works have proposed several different formulations for personalized alignment; however, they either require a large amount of user preference queries or require that the preference be explicitly specified as a text input. In this paper, we propose a novel inference-time personalized alignment method, UserAlign, that elicits the user's preferences with a few queries as pairwise response comparisons. In particular, UserAlign builds on the theoretical framework of best-arm identification in logistic bandits and selects a personalized response from a fixed pool of the model's generated responses. The key idea is to consider the user's feedback consistent and noise-free, and incorporate it into the theoretical framework to identify the best response quickly. Experimental results across several tasks, involving personalized text and image generation, showcase the effectiveness of UserAlign in achieving personalized alignment.
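
The abstract does not spell out the preference model, so the following is only one common way to formalize pairwise feedback in a logistic bandit and may differ from the formulation UserAlign actually uses. The feature map \phi, the parameter \theta^*, the label y_t, and the consistent set \Theta_t are all notation assumed here for illustration.

```latex
% Assumed logistic (Bradley-Terry style) model for "response x_i is preferred to x_j":
\Pr\left(x_i \succ x_j \mid \theta^*\right)
  = \sigma\!\left({\theta^*}^{\top}\bigl(\phi(x_i) - \phi(x_j)\bigr)\right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}.

% Treating feedback as consistent and noise-free makes each answer deterministic:
y_t = \operatorname{sign}\!\left({\theta^*}^{\top}\bigl(\phi(x_{i_t}) - \phi(x_{j_t})\bigr)\right)
  \in \{-1, +1\}.

% Each answer then restricts the parameters that remain consistent to a halfspace:
\Theta_{t+1} = \Theta_t \cap
  \left\{\theta : y_t\,\theta^{\top}\bigl(\phi(x_{i_t}) - \phi(x_{j_t})\bigr) > 0\right\}.
```

Under these assumptions, a response can be ruled out as soon as no parameter in \Theta_t ranks it first, which is one intuition for why a few comparisons may be enough.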
Problem

Research questions and friction points this paper is trying to address.

Aligning generative model responses with user preferences efficiently
Reducing user preference queries through pairwise response comparisons
Achieving personalized alignment without explicit text specification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inference-time personalized alignment method
Uses few pairwise response comparison queries
Applies best-arm identification from logistic bandits
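
The points above can be illustrated with a minimal sketch. It is not the UserAlign algorithm: it swaps in a simple sampled version-space elimination, assuming a linear utility over hypothetical response feature vectors and noise-free answers. The names `select_response`, `features`, and `ask_user` are invented for this example.

```python
import numpy as np


def select_response(features, ask_user, num_queries=5, num_hypotheses=2000, seed=0):
    """Pick one response from a fixed pool using a few pairwise comparisons.

    features: (K, d) array, one hypothetical feature vector per candidate response.
    ask_user: callable (i, j) -> True if the user prefers response i over response j.
    Assumption: answers are noise-free and consistent with a linear utility.
    """
    rng = np.random.default_rng(seed)
    num_candidates, dim = features.shape

    # Sampled hypotheses about the user's preference vector, and the utility
    # each hypothesis assigns to each candidate (rows: hypotheses).
    thetas = rng.standard_normal((num_hypotheses, dim))
    scores = thetas @ features.T

    for _ in range(num_queries):
        # Count how many surviving hypotheses rank each candidate first.
        votes = np.bincount(scores.argmax(axis=1), minlength=num_candidates)
        runner_up, leader = np.argsort(votes)[-2:]
        if votes[runner_up] == 0:
            break  # every surviving hypothesis agrees on the best response

        # Query the two most plausible candidates against each other.
        prefers_leader = ask_user(int(leader), int(runner_up))

        # Noise-free feedback: drop every hypothesis that contradicts the answer.
        margin = scores[:, leader] - scores[:, runner_up]
        scores = scores[margin > 0] if prefers_leader else scores[margin < 0]

    votes = np.bincount(scores.argmax(axis=1), minlength=num_candidates)
    return int(votes.argmax())
```

Because inconsistent hypotheses are discarded outright, a candidate that loses a comparison can never be returned; with noisy feedback this hard elimination would not be safe, and a likelihood-based update would be needed instead.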