Fashionability-Enhancing Outfit Image Editing with Conditional Diffusion Models

📅 2024-12-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing fashion image editing methods struggle to automatically enhance intrinsic fashionability while preserving human structural integrity. This paper proposes a prompt-free clothing image editing framework based on conditional diffusion models. Our approach addresses this challenge through two key contributions: (1) a novel fashionability enhancement mechanism that jointly optimizes fashion-guided generation with human-structure-preserving constraints; and (2) a dual-track evaluation framework integrating OpenSkill-based probabilistic ranking with multi-expert, five-dimensional pairwise comparison annotations—enabling both robust model training and objective assessment. Experiments demonstrate statistically significant improvements over the Fashion++ baseline in quantitative fashionability metrics, while maintaining high structural fidelity and visual appeal. Comprehensive validation via expert evaluation and user studies confirms the method’s effectiveness in generating structurally coherent and stylistically sophisticated fashion imagery.

📝 Abstract
Image generation in the fashion domain has predominantly focused on preserving body characteristics or following input prompts, but little attention has been paid to improving the inherent fashionability of the output images. This paper presents a novel diffusion model-based approach that generates fashion images with improved fashionability while maintaining control over key attributes. Key components of our method include: 1) fashionability enhancement, which ensures that the generated images are more fashionable than the input; 2) preservation of body characteristics, encouraging the generated images to maintain the original shape and proportions of the input; and 3) automatic fashion optimization, which does not rely on manual input or external prompts. We also employ two complementary methods to collect training data that guide both image generation and evaluation. In particular, we rate outfit images using fashionability scores annotated by multiple fashion experts through OpenSkill-based and five-critical-aspect-based pairwise comparisons. These methods provide complementary perspectives for assessing and improving the fashionability of the generated images. The experimental results show that our approach outperforms the baseline Fashion++ in generating images with superior fashionability, demonstrating its effectiveness in producing more stylish and appealing fashion images.
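The OpenSkill-based pairwise ranking described in the abstract can be illustrated with a much simpler rating scheme. The sketch below uses an Elo-style update as a stand-in for OpenSkill's Bayesian skill model: each expert comparison of two outfit images nudges the winner's score up and the loser's down. The image ids, the k-factor, and the toy comparison list are illustrative assumptions, not data from the paper.

```python
def elo_update(r_winner, r_loser, k=32.0):
    """One Elo-style update from a single pairwise comparison.

    `expected` is the modeled probability that the eventual winner
    would win given the two current ratings.
    """
    expected = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))
    delta = k * (1.0 - expected)
    return r_winner + delta, r_loser - delta

# Hypothetical expert annotations: (winner_id, loser_id) pairs.
comparisons = [("img_a", "img_b"), ("img_a", "img_c"), ("img_b", "img_c")]

# Every image starts at the same baseline rating.
ratings = {img: 1500.0 for pair in comparisons for img in pair}
for winner, loser in comparisons:
    ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser])

# Higher rating = judged more fashionable across comparisons.
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking)  # ['img_a', 'img_b', 'img_c']
```

A probabilistic system such as OpenSkill additionally tracks per-item uncertainty, which is why the paper pairs it with multi-expert annotations; the Elo form above keeps only a point estimate but conveys the same pairwise-to-global-ranking idea.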
Problem

Research questions and friction points this paper is trying to address.

Fashion Image Editing
Stylistic Enhancement
Automated Trend Improvement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conditional Diffusion Model
Fashion Enhancement
Automated Fashion Editing
Qice Qin (Waseda University, Tokyo, Japan)
Yuki Hirakawa (ZOZO Research, Tokyo, Japan)
Ryotaro Shimizu (ZOZO Research, Tokyo, Japan)
Takuya Furusawa (ZOZO Research, Tokyo, Japan)
Edgar Simo-Serra (Waseda University)
Computer Graphics · Machine Learning · Computer Vision