Conditional Image Synthesis with Diffusion Models: A Survey

📅 2024-09-28
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
🤖 AI Summary
Problem: Conditional image synthesis with diffusion models lacks a systematic treatment because of architectural complexity, task heterogeneity, and diverse conditioning mechanisms. Method: This paper introduces the first unified taxonomy for conditional diffusion modeling. It categorizes approaches by *where* conditioning is injected (into the denoising network or into the sampling process) and formalizes three stages for building the denoising network: training, re-purposing, and specialization. It further classifies six mainstream sampling-time conditioning mechanisms. Contributions: Based on a structured analysis of over 100 works, the paper establishes a comprehensive knowledge framework and open-sources a resource repository (GitHub Awesome-Conditional-Diffusion-Models). It identifies persistent bottlenecks, including limited generalization, inefficient inference, and coarse-grained control, and proposes directions toward scalable, modular, and fine-grained conditional modeling.

📝 Abstract
Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms present significant challenges for researchers to keep up with rapid developments and to understand the core concepts on this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches during the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the sampling process. All discussions are centered around popular applications. Finally, we pinpoint several critical yet still unsolved problems and suggest some possible solutions for future research. Our reviewed works are itemized at https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models.
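
The distinction the abstract draws between conditioning the denoising network and conditioning the sampling process can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not code from the survey: `denoiser` is a hypothetical stand-in for a trained network, the sampler is a simplified deterministic loop rather than a real diffusion schedule, and classifier-free guidance is used only as one widely known sampling-time technique (the abstract does not name its six mechanisms).

```python
import numpy as np

def denoiser(x_t, cond=None):
    """Toy stand-in for a noise-prediction network eps(x_t, c).

    Integration point 1: the condition is an input to the network.
    Hypothetical rule: the predicted noise is the offset from a target,
    which is `cond` when given and zero otherwise.
    """
    target = np.zeros_like(x_t) if cond is None else cond
    return x_t - target

def guided_eps(x_t, cond, scale):
    """Integration point 2: steer the sampling process.

    Classifier-free guidance: eps_u + scale * (eps_c - eps_u).
    """
    eps_u = denoiser(x_t)        # unconditional prediction
    eps_c = denoiser(x_t, cond)  # conditional prediction
    return eps_u + scale * (eps_c - eps_u)

def sample(cond, scale=1.0, steps=200, step_size=0.1, seed=0):
    """Simplified deterministic sampler (not a real DDPM/DDIM schedule)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(cond.shape)  # start from Gaussian noise
    for _ in range(steps):
        x = x - step_size * guided_eps(x, cond, scale)
    return x

cond = np.array([1.0, -2.0])
print(sample(cond, scale=1.0))  # guided toward the condition
print(sample(cond, scale=0.0))  # guidance off: unconditional behaviour
```

In this toy setup, `scale=1.0` drives the sample toward `cond` and `scale=0.0` drives it toward zero, which shows that the same network can be steered purely by changing the sampling rule.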
Problem

Research questions and friction points this paper is trying to address.

Surveying the challenges of conditional image synthesis with diffusion models
Categorizing conditioning approaches in denoising networks and sampling
Identifying unsolved problems in diffusion-based conditional image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Categorizes works by how conditions are integrated into the denoising network and the sampling process
Highlights principles of conditioning approaches
Summarizes six mainstream conditioning mechanisms