Manifold Generalization Provably Precedes Memorization in Diffusion Models

📅 2026-03-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work challenges the conventional view that high-quality sample generation in diffusion models requires accurate estimation of the full data density, showing instead that learning only a coarse score function suffices. Under the manifold hypothesis, the authors argue that such models generalize by capturing the geometric structure of the data manifold rather than the fine-scale structure of the data distribution. Through a theoretical analysis combining manifold regularity, nonparametric estimation, and diffusion dynamics, they establish, for the first time, that generalization performance depends on the smoothness of the underlying manifold rather than on the regularity of the data density. They further derive near-parametric generalization error bounds, showing that when the manifold is sufficiently smooth, diffusion models generate high-fidelity samples at statistical rates strictly faster than the classical minimax rate for estimating the full data distribution.
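To make "learning a coarse score" concrete, the sketch below trains a small score network by denoising score matching, the standard objective behind diffusion models. It is intuition only, not the paper's construction: the names (ScoreNet, dsm_loss), the geometric noise schedule, and the toy circle dataset (a 1-dimensional manifold in R^2, mirroring the $k$-dimensional support in the abstract) are all illustrative choices.

```python
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """Small MLP score model s(x, t); a stand-in for whatever estimator is 'coarse'."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        # Condition the score estimate on the noise level t.
        return self.net(torch.cat([x, t[:, None]], dim=-1))

def dsm_loss(model, x0, sigma_min=0.01, sigma_max=1.0):
    """One denoising-score-matching step.

    For Gaussian corruption x_t = x_0 + sigma * eps, the conditional score is
    -(x_t - x_0) / sigma^2 = -eps / sigma; regressing onto it recovers the
    marginal score of the noised data in expectation.
    """
    t = torch.rand(x0.shape[0])
    sigma = sigma_min * (sigma_max / sigma_min) ** t   # geometric noise schedule
    eps = torch.randn_like(x0)
    xt = x0 + sigma[:, None] * eps
    target = -eps / sigma[:, None]                     # conditional score
    pred = model(xt, t)
    # sigma^2 weighting balances the contribution of all noise levels.
    return ((sigma[:, None] ** 2) * (pred - target) ** 2).mean()

if __name__ == "__main__":
    # Toy "manifold" data: the unit circle (k = 1) embedded in R^2.
    theta = 2 * torch.pi * torch.rand(512)
    x0 = torch.stack([theta.cos(), theta.sin()], dim=-1)
    model = ScoreNet(dim=2)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        dsm_loss(model, x0).backward()
        opt.step()
```

In this picture, the paper's claim is that even a coarse minimizer of this objective can drive samples that concentrate near the circle, capturing the support's geometry without matching the fine-scale density along it.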

📝 Abstract
Diffusion models often generate novel samples even when the learned score is only \emph{coarse} -- a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show that, under the \emph{manifold hypothesis}, this behavior can instead be explained by coarse scores capturing the \emph{geometry} of the data while discarding the fine-scale distributional structure of the population measure~$\mu_{\mathrm{data}}$. Concretely, whereas estimating the full data distribution $\mu_{\mathrm{data}}$ supported on a $k$-dimensional manifold is known to require the classical minimax rate $\tilde{\mathcal{O}}(N^{-1/k})$, we prove that diffusion models trained with coarse scores can exploit the \emph{regularity of the manifold support} and attain a near-parametric rate toward a \emph{different} target distribution. This target distribution has density uniformly comparable to that of~$\mu_{\mathrm{data}}$ throughout any $\tilde{\mathcal{O}}\bigl(N^{-\beta/(4k)}\bigr)$-neighborhood of the manifold, where $\beta$ denotes the manifold regularity. Our guarantees therefore depend only on the smoothness of the underlying support, and are especially favorable when the data density itself is irregular, for instance non-differentiable. In particular, when the manifold is sufficiently smooth, we obtain that \emph{generalization} -- formalized as the ability to generate novel, high-fidelity samples -- occurs at a statistical rate strictly faster than that required to estimate the full population distribution~$\mu_{\mathrm{data}}$.
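For quick reference, the abstract's rate claims can be laid out side by side. Reading "near-parametric" as $N^{-1/2}$ up to logarithmic factors follows the usual convention and is an assumption here; the paper's exact exponent may differ.

```latex
% Rates implied by the abstract; constants and log factors absorbed into \tilde{\mathcal{O}}.
\begin{align*}
  \text{estimating } \mu_{\mathrm{data}} \text{ on a } k\text{-dimensional manifold:}
    \quad & \tilde{\mathcal{O}}\bigl(N^{-1/k}\bigr) \quad \text{(classical minimax)} \\
  \text{coarse-score diffusion toward the surrogate target:}
    \quad & \text{near-parametric, i.e.\ } \approx \tilde{\mathcal{O}}\bigl(N^{-1/2}\bigr) \\
  \text{surrogate density comparable to } \mu_{\mathrm{data}} \text{ within:}
    \quad & \tilde{\mathcal{O}}\bigl(N^{-\beta/(4k)}\bigr) \text{ of the manifold}
\end{align*}
```

Under this reading, for any intrinsic dimension $k > 2$ the near-parametric rate is strictly faster than the minimax rate $\tilde{\mathcal{O}}(N^{-1/k})$, which is the sense in which generalization outpaces full density estimation.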
Problem

Research questions and friction points this paper is trying to address.

diffusion models
manifold hypothesis
generalization
density estimation
score-based generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion models
manifold hypothesis
coarse score
generalization
minimax rate