🤖 AI Summary
Schrödinger Bridge Diffusion Models (SBDMs) suffer from mathematical complexity and poor interpretability, hindering theoretical analysis and practical control.
Method: We propose a unified modeling paradigm by reformulating SBDMs as a principled extension of Variational Autoencoders (VAEs). Leveraging the data processing inequality, we rigorously derive a decomposition of the SBDM objective into two interpretable terms: prior matching (enforcing consistency with the forward process prior) and drift matching (learning the backward dynamics).
Contribution/Results: This decomposition breaks the “black-box” nature of conventional SBDMs, exposing the intrinsic coupling between forward prior constraints and backward dynamical learning. The framework unifies variational inference, stochastic differential equations, and Schrödinger bridge theory, yielding a transparent, modular, and controllable joint learning scheme. Our approach establishes a novel theoretical foundation for designing interpretable, structurally explicit diffusion generative models.
📝 Abstract
Generative diffusion models use time-forward and backward stochastic differential equations to connect the data and prior distributions. While conventional diffusion models (e.g., score-based models) learn only the backward process, more flexible frameworks have been proposed that also learn the forward process by employing the Schrödinger bridge (SB). However, due to the complexity of the mathematical structure behind SB-type models, it is difficult to give an intuitive understanding of their objective function. In this work, we propose a unified framework for constructing diffusion models by reinterpreting SB-type models as an extension of variational autoencoders. In this context, the data processing inequality plays a crucial role. As a result, we find that the objective function consists of a prior loss and a drift-matching part.
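To make the stated decomposition concrete, here is a schematic ELBO-style bound in the spirit the abstract describes; the precise coefficients and drift parameterization are assumptions for illustration, not the paper's exact derivation. With forward marginal $p_T$ at terminal time $T$, prior $\pi$, learned backward drift $b_\theta$, and the ideal backward drift $b^\star$ induced by the forward process:

$$
-\log p_\theta(x_0) \;\lesssim\;
\underbrace{D_{\mathrm{KL}}\!\big(p_T \,\|\, \pi\big)}_{\text{prior loss}}
\;+\;
\underbrace{\int_0^T \mathbb{E}_{x_t \sim p_t}\!\left[\tfrac{1}{2}\,\big\|\, b_\theta(x_t, t) - b^\star(x_t, t) \,\big\|^2\right] dt}_{\text{drift matching}} .
$$

The first term penalizes mismatch between the forward process's terminal distribution and the prior (a VAE-style prior-matching term); the second penalizes the learned backward dynamics for deviating from the reverse of the forward process, mirroring how the data processing inequality controls the gap when both processes are learned jointly.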