When Preference Labels Fall Short: Aligning Diffusion Models from Real Data

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing preference alignment methods for diffusion models: they build relative preference labels from model-generated images, which gives unreliable supervision when both samples in a pair are low quality or contain artifacts. The paper investigates real data as an alternative source of supervision. It studies a curation strategy that treats real images as reference points and constructs preference signals by contrasting them with generated or perturbed samples, so no manually annotated preference pairs are required. In the authors' empirical analysis, this real-data-based supervision achieves performance comparable to existing preference-based methods. The results position real data as a practical, complementary source of supervision and point toward more label-efficient alignment strategies.

📝 Abstract

Preference alignment aims to guide generative models by learning from comparisons between preferred and non-preferred samples. In practice, most existing approaches rely on preference pairs constructed from model-generated images. Such supervision is inherently relative and can be ambiguous when both samples exhibit artifacts or limited visual quality, making it difficult to infer what constitutes a truly desirable output. In this work, we investigate whether real data can serve as an alternative source of supervision for preference alignment. We adopt a data-centric perspective and study a curation strategy that treats real images as reference points and constructs preference signals by contrasting them with generated or perturbed samples, without requiring manually annotated preference pairs. Through empirical analysis, we show that real-data-based supervision provides effective guidance for aligning diffusion models and achieves performance comparable to existing preference-based methods. Our results suggest that real data offers a practical and complementary source of supervision for preference alignment and highlight directions for label-efficient alignment strategies. Code and models are available at https://cwyxx.github.io/RealAlign.
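
The curation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `PreferencePair` container, the `perturb` and `build_pairs` helpers, the additive Gaussian noise, and the flattened-list image representation are all assumptions made for illustration. The abstract does not specify which perturbations are used or how the pairs enter the training objective. The only point shown is that the real image always takes the preferred slot, and the non-preferred slot is filled by either a supplied generated sample or a perturbed copy of the real image.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Assumption: an image is a flat list of pixel values in [0, 1].
# A real pipeline would use image tensors instead.
Image = List[float]


@dataclass
class PreferencePair:
    prompt: str
    preferred: Image       # the real image, used as the reference point
    non_preferred: Image   # a generated or perturbed counterpart


def perturb(image: Image, rng: random.Random, noise_std: float = 0.1) -> Image:
    """Hypothetical perturbation: additive Gaussian noise, clipped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, noise_std))) for p in image]


def build_pairs(
    prompts: Sequence[str],
    real_images: Sequence[Image],
    rng: random.Random,
    generated: Optional[Sequence[Image]] = None,
) -> List[PreferencePair]:
    """Pair each real image with a generated sample if given, else a perturbed copy."""
    pairs = []
    for i, (prompt, real) in enumerate(zip(prompts, real_images)):
        loser = generated[i] if generated is not None else perturb(real, rng)
        pairs.append(PreferencePair(prompt, real, loser))
    return pairs


rng = random.Random(0)
prompts = ["a red bicycle", "a bowl of fruit"]
real_images = [[rng.random() for _ in range(16)] for _ in prompts]
pairs = build_pairs(prompts, real_images, rng)
```

No human annotator appears anywhere in this loop, which is the label-efficiency argument of the abstract: the preference direction follows from which sample is real. Pairs built this way could then be passed to a pairwise preference objective, though which objective the paper uses is not stated here.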

Problem

Research questions and friction points this paper is trying to address.

preference alignment

diffusion models

real data

supervision

generative models

Innovation

Methods, ideas, or system contributions that make the work stand out.

preference alignment

diffusion models

real data supervision