🤖 AI Summary
Existing diffusion models trained on unfiltered large-scale datasets often generate outputs misaligned with human preferences, primarily because conventional fine-tuning methods neglect modeling of unconditional and negative-conditioned outputs, thereby undermining the effectiveness of Classifier-Free Guidance (CFG).
Method: We propose Diffusion-Negative Preference Optimization (Diffusion-NPO), the first systematic framework to explicitly model negative preferences. It enables lightweight fine-tuning—without requiring new data or changes to training paradigms—to jointly optimize both unconditional and negative-conditioned output distributions.
Contribution/Results: Diffusion-NPO is compatible with SD1.5, SDXL, and video diffusion models. It consistently improves human preference scores across multiple benchmarks (+4.2%–9.7%), strengthens CFG guidance and its robustness, and delivers consistent gains across images, videos, and models that have already been preference-optimized. This work establishes an efficient, general-purpose fine-tuning paradigm for preference alignment in diffusion models.
📝 Abstract
Diffusion models have made substantial advances in image generation, yet models trained on large, unfiltered datasets often yield outputs misaligned with human preferences. Numerous methods have been proposed to fine-tune pre-trained diffusion models, achieving notable improvements in aligning generated outputs with human preferences. However, we argue that existing preference alignment methods neglect the critical role of handling unconditional/negative-conditional outputs, leading to a diminished capacity to avoid generating undesirable outcomes. This oversight limits the efficacy of classifier-free guidance (CFG), which relies on the contrast between conditional generation and unconditional/negative-conditional generation to optimize output quality. In response, we propose a straightforward yet versatile and effective approach: training a model specifically attuned to negative preferences. This method does not require new training strategies or datasets, only minor modifications to existing techniques. Our approach integrates seamlessly with models such as SD1.5, SDXL, video diffusion models, and models that have undergone preference optimization, consistently enhancing their alignment with human preferences.
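The contrast that CFG exploits can be made concrete. At each denoising step, CFG extrapolates from the unconditional (or negative-conditional) noise prediction toward the conditional one; the idea behind Diffusion-NPO is that the negative branch should come from a model fine-tuned on negative preferences rather than sharing weights with the conditional model. A minimal sketch of the guidance combination, assuming the two noise predictions are already computed (function and variable names here are illustrative, not the authors' code):

```python
import numpy as np

def cfg_combine(eps_cond, eps_neg, guidance_scale):
    """Classifier-free guidance step.

    eps_cond: noise prediction from the prompt-conditioned model.
    eps_neg:  noise prediction from the unconditional branch -- under
              Diffusion-NPO, this would come from a separately
              fine-tuned negative-preference model instead.
    guidance_scale: CFG weight; larger values push samples further
              away from the negative branch's prediction.
    """
    return eps_neg + guidance_scale * (eps_cond - eps_neg)

# Toy example with scalar "predictions":
combined = cfg_combine(np.array(2.0), np.array(1.0), guidance_scale=7.5)
```

Because the combined prediction moves *away* from `eps_neg`, a negative branch that better models undesirable outputs gives CFG a more informative direction to steer against, which is the mechanism the paper targets.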