Diffusion models under low-noise regime

📅 2025-06-09

📈 Citations: 0

✨ Influential: 0

career value

224K/year

🤖 AI Summary

The denoising behavior and robustness mechanisms of diffusion models in the low-noise regime remain poorly understood. Method: We systematically investigate low-noise diffusion dynamics through controlled experiments on CelebA subsets, analytical studies on Gaussian mixture benchmarks, low-noise dynamical analysis, and score estimation error modeling. Contribution/Results: We discover that—even when high-noise outputs converge—denoising trajectories near the data manifold exhibit strong divergence with respect to training set partitioning, revealing a previously unrecognized sensitivity. This phenomenon arises from the joint constraint imposed by local data geometry (e.g., manifold curvature) and training scale on score estimation accuracy. We quantitatively characterize how data scale, manifold curvature, and objective function design jointly limit low-noise generalization. Our findings establish theoretical criteria and empirical evidence for generation reliability under small perturbations, advancing the interpretability and robustness of diffusion models.

Technology Category

Application Category

📝 Abstract

Recent work on diffusion models proposed that they operate in two regimes: memorization, in which models reproduce their training data, and generalization, in which they generate novel samples. While this has been tested in high-noise settings, the behavior of diffusion models as effective denoisers when the corruption level is small remains unclear. To address this gap, we systematically investigated the behavior of diffusion models under low-noise diffusion dynamics, with implications for model robustness and interpretability. Using (i) CelebA subsets of varying sample sizes and (ii) analytic Gaussian mixture benchmarks, we reveal that models trained on disjoint data diverge near the data manifold even when their high-noise outputs converge. We quantify how training set size, data geometry, and model objective choice shape denoising trajectories and affect score accuracy, providing insights into how these models actually learn representations of data distributions. This work starts to address gaps in our understanding of generative model reliability in practical applications where small perturbations are common.

Problem

Research questions and friction points this paper is trying to address.

Study diffusion models' behavior under low-noise conditions

Investigate memorization vs generalization in denoising tasks

Analyze training data impact on model robustness and interpretability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Investigates diffusion models in low-noise regimes

Uses CelebA subsets and Gaussian benchmarks

Analyzes training set size and data geometry effects

🔎 Similar Papers

Operator-informed score matching for Markov diffusion models