FashionDPO: Fine-tune Fashion Outfit Generation Model using Direct Preference Optimization

📅 2025-04-17

🤖 AI Summary

Existing personalized fashion outfit generation models suffer from limited generative diversity and heavy reliance on supervised learning. To address these limitations, this paper proposes FashionDPO, a fine-tuning framework for fashion outfit generation based on Direct Preference Optimization (DPO). The approach requires no task-specific reward function. Instead, an automated multi-expert feedback module evaluates generated outfits along three dimensions (quality, compatibility, and personalization), and this feedback guides the fine-tuning of a pre-trained outfit generation model. Experiments on the iFashion and Polyvore-U datasets show that the fine-tuned model aligns better with users' personalized preferences while adhering to fashion compatibility principles. Code and model checkpoints are publicly released.

📝 Abstract

Personalized outfit generation aims to construct a set of compatible and personalized fashion items as an outfit. Recently, generative AI models have received widespread attention, as they can generate fashion items for users to complete an incomplete outfit or create a complete outfit. However, they lack diversity and rely on the supervised learning paradigm. To address these limitations, we propose a novel framework, FashionDPO, which fine-tunes the fashion outfit generation model using direct preference optimization. This framework aims to provide a general fine-tuning approach to fashion generative models, refining a pre-trained fashion outfit generation model using automatically generated feedback, without the need to design a task-specific reward function. To ensure that the feedback is comprehensive and objective, we design a multi-expert feedback generation module that covers three evaluation perspectives, i.e., quality, compatibility, and personalization. Experiments on two established datasets, i.e., iFashion and Polyvore-U, demonstrate the effectiveness of our framework in enhancing the model's ability to align with users' personalized preferences while adhering to fashion compatibility principles. Our code and model checkpoints are available at https://github.com/Yzcreator/FashionDPO.
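
For orientation, the standard DPO objective for a likelihood-based model can be written as below. This is the general formulation, not necessarily the exact variant used in FashionDPO, which this page does not spell out. The symbols are assumptions for illustration: x is the conditioning input (for example, a user and an incomplete outfit), y_w and y_l are the preferred and rejected generations, \pi_\theta is the model being fine-tuned, \pi_ref is the frozen pre-trained model, \sigma is the logistic function, and \beta controls how far the fine-tuned model may drift from the reference.

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta)
  = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}
    \left[
      \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right)
    \right]
```

The loss decreases as the fine-tuned model raises the likelihood of the preferred output relative to the rejected one, measured against the reference model, which is why no explicit reward function has to be designed.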

Problem

Research questions and friction points this paper is trying to address.

Enhance diversity in personalized fashion outfit generation

Overcome limitations of supervised learning in generative AI

Align outfit generation with user preferences and compatibility

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes a pre-trained outfit generation model using direct preference optimization

Employs multi-expert feedback for comprehensive evaluation

Enhances outfit diversity and personalization without reward functions
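
The points above can be made concrete with a minimal sketch of how scores from several experts might be turned into the preference pairs that DPO consumes. Everything here is an assumption for illustration: the function names, the equal-weight averaging of the three expert scores, and the `margin` threshold for discarding near-ties are hypothetical and are not taken from the paper or its repository.

```python
from itertools import combinations

# The three evaluation perspectives named in the abstract.
EXPERTS = ("quality", "compatibility", "personalization")


def aggregate(scores):
    """Average the three expert scores (an assumed aggregation rule)."""
    return sum(scores[name] for name in EXPERTS) / len(EXPERTS)


def build_preference_pairs(candidates, margin=0.0):
    """Return (preferred_id, rejected_id) pairs from expert-scored candidates.

    candidates: dict mapping a candidate id to a dict of expert scores.
    margin: pairs whose aggregated scores differ by at most this much
            are treated as ties and skipped (hypothetical parameter).
    """
    overall = {cid: aggregate(s) for cid, s in candidates.items()}
    pairs = []
    for a, b in combinations(overall, 2):
        diff = overall[a] - overall[b]
        if abs(diff) <= margin:
            continue  # near-tie: no clear preference signal
        pairs.append((a, b) if diff > 0 else (b, a))
    return pairs


# Hypothetical scores for three generated outfits.
candidates = {
    "outfit_a": {"quality": 0.9, "compatibility": 0.8, "personalization": 0.7},
    "outfit_b": {"quality": 0.3, "compatibility": 0.3, "personalization": 0.3},
    "outfit_c": {"quality": 0.8, "compatibility": 0.8, "personalization": 0.77},
}
pairs = build_preference_pairs(candidates, margin=0.05)
```

In this toy example `outfit_a` and `outfit_c` score almost identically, so no pair is formed between them, while both are preferred over `outfit_b`. Each resulting pair would supply the preferred and rejected sample for one DPO training step.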
