🤖 AI Summary
Problem: Existing diffusion model tutorials predominantly focus on Euclidean spaces. They lack a unified perspective that also covers discrete state spaces (e.g., categorical data), and they do not systematically explain the theoretical connections between continuous and discrete diffusion processes.
Method: We present a self-contained diffusion framework over general state spaces. Forward processes are described by stochastic differential equations (SDEs) and the Fokker–Planck equation in continuous domains, and by continuous-time Markov chains (CTMCs) and the master equation in discrete domains, with both cases formalized through Markov kernels. A common variational derivation yields the evidence lower bound (ELBO) and makes explicit how the choice of forward corruption (Gaussian processes in continuous spaces; uniform or masking/absorbing transition kernels in discrete spaces) shapes the reverse dynamics.
Contribution: This work connects continuous and discrete diffusion within a single theoretical treatment. It provides a compact set of reusable proofs and identities together with a layered pedagogical pathway, giving a general foundation for both understanding diffusion models and designing algorithms based on them.
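For orientation, the continuous/discrete parallel named in the summary can be written out in its standard textbook form. The notation here is an assumption and is not taken from the paper: $f$ is a drift, $g(t)$ a scalar diffusion coefficient, $W_t$ a Brownian motion, and $Q_t$ a rate matrix whose entry $Q_t(x, y)$ is the jump rate from state $x$ to state $y$, with rows summing to zero.

$$
\begin{aligned}
\text{SDE on } \mathbb{R}^d:\quad & \mathrm{d}X_t = f(X_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t, \\
\text{Fokker--Planck:}\quad & \partial_t p_t(x) = -\nabla \cdot \big(f(x, t)\, p_t(x)\big) + \tfrac{1}{2}\, g(t)^2\, \Delta p_t(x), \\
\text{CTMC master equation:}\quad & \partial_t p_t(y) = \sum_{x} p_t(x)\, Q_t(x, y).
\end{aligned}
$$

In both cases a linear evolution equation for the marginal $p_t$ is induced by the forward process; the rate matrix $Q_t$ plays the role that the drift and diffusion terms play in the continuous setting.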
📝 Abstract
Although diffusion models now occupy a central place in generative modeling, introductory treatments commonly assume Euclidean data and seldom clarify their connection to discrete-state analogues. This article is a self-contained primer on diffusion over general state spaces, unifying continuous domains and discrete/categorical structures under one lens. We develop the discrete-time view (forward noising via Markov kernels and learned reverse dynamics) alongside its continuous-time limits -- stochastic differential equations (SDEs) in $\mathbb{R}^d$ and continuous-time Markov chains (CTMCs) on finite alphabets -- and derive the associated Fokker--Planck and master equations. A common variational treatment yields the ELBO that underpins standard training losses. We make explicit how forward corruption choices -- Gaussian processes in continuous spaces and structured categorical transition kernels (uniform, masking/absorbing and more) in discrete spaces -- shape reverse dynamics and the ELBO. The presentation is layered for three audiences: newcomers seeking a self-contained intuitive introduction; diffusion practitioners wanting a global theoretical synthesis; and continuous-diffusion experts looking for an analogy-first path into discrete diffusion. The result is a unified roadmap to modern diffusion methodology across continuous domains and discrete sequences, highlighting a compact set of reusable proofs, identities, and core theoretical principles.
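To make the forward corruption choices mentioned above concrete, the following is a minimal NumPy sketch, not the paper's implementation. All function names, the per-step corruption probability `beta`, the convention that the mask token is the last index, and the variance-preserving Gaussian parameterization via `alpha_bar` are assumptions made for illustration.

```python
import numpy as np


def uniform_kernel(K, beta):
    """Row-stochastic K x K kernel: keep the state w.p. 1 - beta,
    otherwise resample uniformly over all K states (assumed convention)."""
    return (1.0 - beta) * np.eye(K) + beta * np.ones((K, K)) / K


def absorbing_kernel(K, beta, mask=None):
    """Row-stochastic K x K kernel: keep the state w.p. 1 - beta,
    otherwise jump to an absorbing mask state (default: last index)."""
    mask = K - 1 if mask is None else mask
    Q = (1.0 - beta) * np.eye(K)
    Q[:, mask] += beta  # the mask row becomes a point mass on itself
    return Q


def forward_marginal(x0, kernels):
    """q(x_t | x_0) as a probability vector: one-hot(x0) pushed
    through the sequence of one-step Markov kernels."""
    p = np.zeros(kernels[0].shape[0])
    p[x0] = 1.0
    for Q in kernels:
        p = p @ Q  # row vector times row-stochastic matrix
    return p


def gaussian_forward_sample(x0, alpha_bar, rng):
    """Continuous analogue (variance-preserving form, assumed here):
    x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
```

Under these assumptions, calling `forward_marginal(0, [uniform_kernel(4, 0.1)] * 50)` gives a marginal that approaches the uniform distribution, while the same call with `absorbing_kernel` moves the mass toward the mask state. The two kernels therefore have different limiting distributions, which is one way the corruption choice changes what the reverse process must learn to undo.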