🤖 AI Summary
This work investigates the geometric structure of latent spaces in generative diffusion models under the manifold hypothesis. To address the fundamental question of why diffusion models avoid manifold overfitting, we analyze the eigenvalue and singular value spectra, along with the spectral gap, of the score function's Jacobian matrix to characterize the existence and intrinsic dimensionality of underlying submanifolds. We discover, for the first time, a three-stage geometric phase transition during generation: a trivial phase (early-time), a manifold coverage phase (mid-time), and a consolidation phase (late-time), revealing a functional division of labor. Leveraging tools from differential geometry, random matrix theory, and statistical physics, we derive analytical formulas for the spectral distribution and spectral gap, and validate them empirically across multiple diffusion models. Theory and experiment exhibit strong agreement, precisely characterizing the timescale-dependent decoupling between distribution learning and manifold-geometric alignment.
📝 Abstract
In this paper, we investigate the latent geometry of generative diffusion models under the manifold hypothesis. For this purpose, we analyze the spectrum of eigenvalues (and singular values) of the Jacobian of the score function, whose discontinuities (gaps) reveal the presence and dimensionality of distinct sub-manifolds. Using a statistical physics approach, we derive the spectral distributions and formulas for the spectral gaps under several distributional assumptions, and we compare these theoretical predictions with the spectra estimated from trained networks. Our analysis reveals three distinct qualitative phases during the generative process: a trivial phase; a manifold coverage phase, where the diffusion process fits the distribution internal to the manifold; and a consolidation phase, where the score becomes orthogonal to the manifold and all particles are projected onto the support of the data. This 'division of labor' between different timescales elegantly explains why generative diffusion models are not affected by the manifold overfitting phenomenon that plagues likelihood-based models: the internal distribution and the manifold geometry are produced at different time points during generation.
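To make the spectral-gap idea concrete, here is a minimal toy sketch (not the paper's analysis of trained networks): for data concentrated near a d-dimensional linear subspace of R^D, the noised density can be modeled as a Gaussian with unit variance along the manifold and a small variance sigma^2 across it. The score of N(0, Sigma) is s(x) = -Sigma^{-1} x, so its Jacobian is the constant matrix -Sigma^{-1}, and its eigenvalues split into two groups whose gap exposes the intrinsic dimension. All parameter values below are illustrative choices.

```python
import numpy as np

# Ambient dimension D, intrinsic dimension d, off-manifold variance sigma^2.
D, d, sigma2 = 10, 3, 1e-2

# Covariance of the toy noised density: variance 1 along the manifold
# directions, sigma^2 << 1 in the D - d orthogonal directions.
variances = np.array([1.0] * d + [sigma2] * (D - d))

# Jacobian of the Gaussian score s(x) = -Sigma^{-1} x is -Sigma^{-1}.
jacobian = np.diag(-1.0 / variances)

# Sort eigenvalues in decreasing order: d eigenvalues near -1, then a
# sharp drop to D - d eigenvalues near -1/sigma^2.
eigvals = np.sort(np.linalg.eigvalsh(jacobian))[::-1]

# The largest consecutive gap in the spectrum marks the intrinsic dimension.
gaps = -np.diff(eigvals)
d_est = int(np.argmax(gaps)) + 1
print(d_est)  # recovers the intrinsic dimension d = 3
```

In the actual setting the Jacobian of a trained score network is not constant and must be evaluated (e.g. by automatic differentiation) at points along the generative trajectory, but the same gap-reading logic applies at each diffusion time.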