🤖 AI Summary
Diffusion models rely heavily on curated, high-quality, in-domain training data, which limits their scalability and robustness. Method: This work proposes Ambient Diffusion Omni, a framework that extracts training signal from low-quality, synthetic, and out-of-distribution images. The key observation, established theoretically, is that diffusion noise dampens the skew between the desired high-quality distribution and the mixed distribution actually observed, so corrupted data can be used safely at sufficiently high noise levels. Exploiting two properties of natural images, power-law spectral decay and locality, the framework handles images corrupted by Gaussian blur, JPEG compression, and motion blur. Contribution/Results: The approach achieves state-of-the-art FID on ImageNet and significantly improves both fidelity and diversity in text-to-image generation. The theoretical analysis characterizes the trade-off between learning from biased data and from limited unbiased data at each diffusion time.
📝 Abstract
We show how to use low-quality, synthetic, and out-of-distribution images to improve the quality of a diffusion model. Typically, diffusion models are trained on curated datasets that emerge from highly filtered data pools from the Web and other sources. We show that there is immense value in the lower-quality images that are often discarded. We present Ambient Diffusion Omni, a simple, principled framework to train diffusion models that can extract signal from all available images during training. Our framework exploits two properties of natural images -- spectral power law decay and locality. We first validate our framework by successfully training diffusion models with images synthetically corrupted by Gaussian blur, JPEG compression, and motion blur. We then use our framework to achieve state-of-the-art ImageNet FID, and we show significant improvements in both image quality and diversity for text-to-image generative modeling. The core insight is that noise dampens the initial skew between the desired high-quality distribution and the mixed distribution we actually observe. We provide rigorous theoretical justification for our approach by analyzing the trade-off between learning from biased data versus limited unbiased data across diffusion times.
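The core insight, that noise dampens the skew between the high-quality target distribution and the mixed observed distribution, can be illustrated with a small numerical sketch (not from the paper; the 1-D Gaussians and noise levels below are purely illustrative assumptions). Adding Gaussian noise of standard deviation `noise` convolves each density with a Gaussian, and the total variation distance between the two noised densities shrinks as the noise level grows:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated on grid x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def tv_distance(p, q, dx):
    """Total variation distance between two densities tabulated on a uniform grid."""
    return 0.5 * np.sum(np.abs(p - q)) * dx

x = np.linspace(-40, 40, 8001)
dx = x[1] - x[0]

# Illustrative stand-ins: a "clean" target distribution and a biased/corrupted one.
mu_clean, mu_biased, sigma0 = 0.0, 1.5, 1.0

for noise in [0.0, 1.0, 3.0, 10.0]:
    # Adding independent N(0, noise^2) to samples convolves the density with a
    # Gaussian, so the resulting variance is sigma0^2 + noise^2.
    s = np.sqrt(sigma0**2 + noise**2)
    p = gaussian_pdf(x, mu_clean, s)
    q = gaussian_pdf(x, mu_biased, s)
    print(f"noise={noise:5.1f}  TV={tv_distance(p, q, dx):.4f}")
```

The printed TV distance decreases monotonically with the noise level, mirroring the paper's claim that at high diffusion times the biased data becomes nearly indistinguishable from the target and can be learned from with little penalty.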