🤖 AI Summary
This work investigates the geometric structure of latent spaces in generative diffusion models under the manifold hypothesis. To address the fundamental question of why diffusion models avoid manifold overfitting, we analyze the eigenvalue and singular value spectra, along with the spectral gap, of the score function's Jacobian matrix to characterize the existence and intrinsic dimensionality of underlying submanifolds. We discover, for the first time, a three-stage geometric phase transition during generation: a trivial phase (early-time), a manifold coverage phase (mid-time), and a consolidation phase (late-time), revealing a functional division of labor. Leveraging tools from differential geometry, random matrix theory, and statistical physics, we derive analytical formulas for the spectral distribution and spectral gap, and validate them empirically across multiple diffusion models. Theory and experiment exhibit strong agreement, precisely characterizing the timescale-dependent decoupling between distribution learning and manifold-geometric alignment.
📝 Abstract
In this paper, we investigate the latent geometry of generative diffusion models under the manifold hypothesis. For this purpose, we analyze the spectrum of eigenvalues (and singular values) of the Jacobian of the score function, whose discontinuities (gaps) reveal the presence and dimensionality of distinct sub-manifolds. Using a statistical physics approach, we derive the spectral distributions and formulas for the spectral gaps under several distributional assumptions, and we compare these theoretical predictions with the spectra estimated from trained networks. Our analysis reveals three distinct qualitative phases during the generative process: a trivial phase; a manifold coverage phase, where the diffusion process fits the distribution internal to the manifold; and a consolidation phase, where the score becomes orthogonal to the manifold and all particles are projected onto the support of the data. This 'division of labor' between different timescales elegantly explains why generative diffusion models are not affected by the manifold overfitting phenomenon that plagues likelihood-based models: the internal distribution and the manifold geometry are produced at different time points during generation.
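To make the spectral-gap idea concrete, here is a minimal toy sketch (not the paper's analysis of trained networks): for data concentrated near a d-dimensional linear subspace of R^D, the noised density can be modeled as a Gaussian with unit variance along the manifold and a small variance sigma^2 across it. The score of N(0, Sigma) is s(x) = -Sigma^{-1} x, so its Jacobian is the constant matrix -Sigma^{-1}, and its eigenvalues split into two groups whose gap exposes the intrinsic dimension. All parameter values below are illustrative choices.

```python
import numpy as np

# Ambient dimension D, intrinsic dimension d, off-manifold variance sigma^2.
D, d, sigma2 = 10, 3, 1e-2

# Covariance of the toy noised density: variance 1 along the manifold
# directions, sigma^2 << 1 in the D - d orthogonal directions.
variances = np.array([1.0] * d + [sigma2] * (D - d))

# Jacobian of the Gaussian score s(x) = -Sigma^{-1} x is -Sigma^{-1}.
jacobian = np.diag(-1.0 / variances)

# Sort eigenvalues in decreasing order: d eigenvalues near -1, then a
# sharp drop to D - d eigenvalues near -1/sigma^2.
eigvals = np.sort(np.linalg.eigvalsh(jacobian))[::-1]

# The largest consecutive gap in the spectrum marks the intrinsic dimension.
gaps = -np.diff(eigvals)
d_est = int(np.argmax(gaps)) + 1
print(d_est)  # recovers the intrinsic dimension d = 3
```

In the actual setting the Jacobian of a trained score network is not constant and must be evaluated (e.g. by automatic differentiation) at points along the generative trajectory, but the same gap-reading logic applies at each diffusion time.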