🤖 AI Summary
This survey shows that diffusion models, score-based generative models, and flow matching methods, despite their apparent formal differences, share a common continuous-time generative mechanism. Using a measure-theoretic framework, the paper treats all three as learning a time-dependent vector field that transports a reference distribution to the data distribution, with the evolution of the marginals governed by the continuity equation and the Fokker–Planck equation. Under this common perspective, it sets out where the three paradigms coincide and where they differ, clarifies the relationship between the probability flow ODE and reverse-time stochastic dynamics, and identifies flow matching as essentially a velocity-field regression problem. The study further compares objective functions, sampling strategies, and discretization errors under unified notation, links the framework to Schrödinger bridges and entropy-regularized optimal transport, and summarizes theoretical guarantees and open challenges regarding approximation capacity, stability, and scalability.
📝 Abstract
We survey continuous-time generative modeling methods based on transporting a simple reference distribution to a data distribution via stochastic or deterministic dynamics. We present a unified framework in which diffusion models, score-based generative models, and flow matching are instances of learning a time-dependent vector field that induces a family of marginals $(\rho_t)_{t \in [0,1]}$ governed by the continuity and Fokker–Planck equations. Such a unified theory is timely because these methods are converging methodologically, yet fragmented notation and competing derivations continue to obscure their shared structure and the practical tradeoffs governing sampling, stability, and computation. Within this framework, we (i) derive reverse-time sampling for diffusion and score-based models as controlled stochastic dynamics, (ii) show that the probability flow ODE yields identical marginals and connects diffusion to likelihood-based normalizing flows, and (iii) interpret flow matching as direct regression of the velocity field under a chosen interpolation, clarifying when it coincides with or differs from score-based training. We compare objectives, sampling schemes, and discretization errors under unified notation, discuss connections to Schrödinger bridges and entropic optimal transport, and summarize theoretical guarantees and open problems on approximation, stability, and scalability.
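Claim (ii) of the abstract, that the probability flow ODE and the reverse-time SDE share the same marginals, can be checked numerically on a toy problem. The sketch below is illustrative only and is not taken from the paper. It assumes a one-dimensional Gaussian data distribution and an Ornstein-Uhlenbeck forward process $dx = -x\,dt + \sqrt{2}\,dW$, so that the score $\nabla \log \rho_t$ is known in closed form and no network has to be trained. The names `M0`, `S0`, `T`, `sample`, and `summarize` are hypothetical, and both samplers use a plain Euler discretization.

```python
import math
import random
import statistics

# Assumed toy setup (not from the paper):
#   data distribution  rho_0 = N(M0, S0^2)
#   forward SDE        dx = -x dt + sqrt(2) dW   (f(x) = -x, g^2 = 2)
M0, S0, T = 2.0, 0.5, 2.0


def mean_t(t):
    """Mean of the forward marginal rho_t."""
    return M0 * math.exp(-t)


def var_t(t):
    """Variance of the forward marginal rho_t."""
    decay = math.exp(-2.0 * t)
    return S0 ** 2 * decay + (1.0 - decay)


def sample(stochastic, n=4000, steps=200, seed=0):
    """Integrate from t = T back to t = 0 with Euler steps.

    stochastic=True  -> reverse-time SDE, drift f - g^2 * score, plus noise
    stochastic=False -> probability flow ODE, drift f - 0.5 * g^2 * score
    """
    rng = random.Random(seed)
    dt = T / steps
    # Start from the exact marginal rho_T so that only the sampler is tested.
    xs = [rng.gauss(mean_t(T), math.sqrt(var_t(T))) for _ in range(n)]
    for k in range(steps):
        t = T - k * dt
        m, v = mean_t(t), var_t(t)
        for i in range(n):
            x = xs[i]
            score = -(x - m) / v  # closed-form score of the Gaussian rho_t
            if stochastic:
                drift = -x - 2.0 * score
                noise = math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
                xs[i] = x - drift * dt + noise
            else:
                drift = -x - score
                xs[i] = x - drift * dt
    return xs


def summarize(xs):
    """Return (sample mean, sample standard deviation)."""
    return statistics.fmean(xs), statistics.pstdev(xs)


if __name__ == "__main__":
    print("probability flow ODE:", summarize(sample(stochastic=False)))
    print("reverse-time SDE:    ", summarize(sample(stochastic=True)))
```

Both samplers should return a mean and standard deviation close to the assumed data values `M0` and `S0`, up to Monte Carlo and discretization error. In practice the closed-form score would be replaced by a learned approximation, which is where the training objectives compared in the survey enter.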