Learning User Preferences for Image Generation Model

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of modeling fine-grained, dynamic, and heterogeneous user aesthetic preferences—spanning color, style, subject, composition, and other multi-level attributes—in personalized image generation, overcoming limitations of conventional approaches that rely on static user profiles or population-level average preferences. We propose a multimodal large language model–based framework for personalized preference learning. Our method introduces a contrastive preference loss and learnable user preference tokens to jointly model individual specificity and shared group patterns, enabling end-to-end learning of fine-grained preference representations from historical interaction data. Experiments demonstrate that our approach achieves significantly higher preference prediction accuracy than state-of-the-art baselines, enables high-fidelity clustering of aesthetically similar users, and substantially improves alignment between generated images and individual user preferences.

📝 Abstract
User preference prediction requires a comprehensive and accurate understanding of individual tastes. This includes both surface-level attributes, such as color and style, and deeper content-related aspects, such as themes and composition. However, existing methods typically rely on general human preferences or assume static user profiles, often neglecting individual variability and the dynamic, multifaceted nature of personal taste. To address these limitations, we propose an approach built upon Multimodal Large Language Models, introducing contrastive preference loss and preference tokens to learn personalized user preferences from historical interactions. The contrastive preference loss is designed to effectively distinguish between user "likes" and "dislikes", while the learnable preference tokens capture shared interest representations among existing users, enabling the model to activate group-specific preferences and enhance consistency across similar users. Extensive experiments demonstrate our model outperforms other methods in preference prediction accuracy, effectively identifying users with similar aesthetic inclinations and providing more precise guidance for generating images that align with individual tastes. The project page is https://learn-user-pref.github.io/.
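The abstract describes a contrastive preference loss that separates a user's "likes" from "dislikes". A minimal sketch of such an objective, assuming a generic pairwise logistic form over scalar preference scores (the paper's exact formulation, e.g. temperature or negative sampling, may differ):

```python
import numpy as np

def contrastive_preference_loss(liked_scores, disliked_scores, margin=0.0):
    """Pairwise logistic loss: pushes scores of liked images above disliked ones.

    Illustrative sketch only; `margin` and the pairing scheme are assumptions,
    not taken from the paper.
    """
    liked = np.asarray(liked_scores, dtype=float)
    disliked = np.asarray(disliked_scores, dtype=float)
    # All like/dislike pairs: diffs[i, j] = liked[i] - disliked[j] - margin
    diffs = liked[:, None] - disliked[None, :] - margin
    # -log sigmoid(d) = log(1 + exp(-d)): large when a disliked image outscores a liked one
    return float(np.mean(np.log1p(np.exp(-diffs))))
```

With well-separated scores the loss approaches zero; reversing likes and dislikes makes it large, which is the gradient signal that shapes the learned preference representation.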
Problem

Research questions and friction points this paper is trying to address.

Predicting dynamic user preferences for image generation
Capturing multifaceted individual tastes accurately
Enhancing consistency in personalized image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Large Language Models for preference learning
Contrastive preference loss distinguishes likes and dislikes
Learnable preference tokens capture shared user interests
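The innovations above can be illustrated with a toy sketch of how learnable preference tokens might condition the model: a shared pool of token embeddings is maintained across users, and a user-specific subset is prepended to the input sequence. All names and shapes here are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared pool: K preference tokens of dimension D,
# trained jointly so similar users activate overlapping tokens.
K, D = 8, 16
preference_tokens = rng.normal(size=(K, D))  # learnable parameters in a real model

def build_input(user_token_ids, image_embeddings):
    """Prepend a user's preference tokens to the image-embedding sequence,
    mimicking how learnable tokens condition a multimodal LLM."""
    user_tokens = preference_tokens[user_token_ids]         # (k, D)
    return np.concatenate([user_tokens, image_embeddings])  # (k + T, D)

# Example: a user activating tokens 1 and 4, with 5 image embeddings
seq = build_input([1, 4], rng.normal(size=(5, D)))
```

Because the token pool is shared, two users who activate the same tokens receive the same conditioning, which is one way to realize the "consistency across similar users" the abstract describes.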
Wenyi Mo
Rutgers University
Deep Learning · Vision-Language Model · Generative Model
Ying Ba
Renmin University of China
Tianyu Zhang
iN2X
Yalong Bai
iN2X
Biye Li
iN2X