🤖 AI Summary
Diffusion models are prone to memorizing training samples—especially on small datasets—while existing unlearning methods often degrade generation quality. This work identifies low-noise scales as the primary source of memorization and proposes Ambient Diffusion, a novel paradigm that trains DDPMs exclusively at high-noise scales. By initializing with large Gaussian noise and applying noise truncation during sampling, Ambient Diffusion decouples high-frequency detail reconstruction from semantic content generation, enabling joint optimization of fidelity and low memorization. The method is compatible with both text-conditional and unconditional generation and builds upon standard UNet architectures without architectural modifications. Experiments on CIFAR-10 show a 92% reduction in memorization rate, while maintaining or improving FID and CLIP-Score. Furthermore, Ambient Diffusion consistently outperforms baselines across multiple dataset sizes, demonstrating robustness and scalability.
📝 Abstract
There is strong empirical evidence that the state-of-the-art diffusion modeling paradigm leads to models that memorize the training set, especially when the training set is small. Prior methods to mitigate the memorization problem often lead to a decrease in image quality. Is it possible to obtain strong and creative generative models, i.e., models that achieve high generation quality and low memorization? Despite the current pessimistic landscape of results, we make significant progress in pushing the trade-off between fidelity and memorization. We first provide theoretical evidence that memorization in diffusion models is only necessary for denoising problems at low noise scales (usually used in generating high-frequency details). Using this theoretical insight, we propose a simple, principled method to train the diffusion models using noisy data at large noise scales. We show that our method significantly reduces memorization without decreasing the image quality, for both text-conditional and unconditional models and for a variety of data availability settings.