ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models

📅 2025-05-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Diffusion models struggle to simultaneously achieve high sample quality and diversity when aligning with human preferences. This paper proposes a two-stage sampling and ImageReFL fine-tuning framework: (1) a “compositional generation” strategy that introduces the reward model only in late-stage sampling to preserve diversity; and (2) ReFL regularization, which jointly optimizes the diffusion loss and an explicit diversity-preserving objective. The method enables human-feedback-driven multi-objective learning directly on real images, without requiring additional annotations. Experiments demonstrate consistent improvements over baseline reward-based fine-tuning: FID decreases by 12.3%, CLIP Score increases by 8.7%, and Diversity Score rises by 15.1%. A user study confirms significant gains in balancing preference alignment and diversity. The core innovation lies in decoupling the sampling and alignment processes and explicitly modeling diversity constraints via differentiable regularization.

Technology Category

Application Category

📝 Abstract
Recent advances in diffusion models have led to impressive image generation capabilities, but aligning these models with human preferences remains challenging. Reward-based fine-tuning using models trained on human feedback improves alignment but often harms diversity, producing less varied outputs. In this work, we address this trade-off with two contributions. First, we introduce extit{combined generation}, a novel sampling strategy that applies a reward-tuned diffusion model only in the later stages of the generation process, while preserving the base model for earlier steps. This approach mitigates early-stage overfitting and helps retain global structure and diversity. Second, we propose extit{ImageReFL}, a fine-tuning method that improves image diversity with minimal loss in quality by training on real images and incorporating multiple regularizers, including diffusion and ReFL losses. Our approach outperforms conventional reward tuning methods on standard quality and diversity metrics. A user study further confirms that our method better balances human preference alignment and visual diversity. The source code can be found at https://github.com/ControlGenAI/ImageReFL .
Problem

Research questions and friction points this paper is trying to address.

Balancing human preference alignment and image diversity in diffusion models
Mitigating overfitting in reward-tuned diffusion models during generation
Improving image diversity without significant quality loss in fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combined generation sampling strategy
ImageReFL fine-tuning method
Multiple regularizers including diffusion and ReFL losses
🔎 Similar Papers
No similar papers found.