🤖 AI Summary
This work addresses the challenge of approximating transition probability density functions of high-dimensional diffusion processes, where computational cost and structural constraints—such as positivity and mass conservation—are difficult to reconcile. The authors propose a novel framework that, for the first time, embeds normalizing flows into the neural Galerkin method. This approach represents solutions of the Fokker–Planck equation via learnable transformations of a reference density and solves the resulting high-dimensional PDE parametrically with respect to the initial condition. The method satisfies the physical structure constraints by construction, leverages adaptive sampling to enhance accuracy and efficiency, and yields a system of ordinary differential equations governing the parameter evolution, enabling efficient online inference. After offline training, the model serves as a high-fidelity, low-cost surrogate for multi-query tasks such as Bayesian inference and diffusion bridge generation.
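The structure preservation described above follows from the change-of-variables formula: pushing a reference density through any invertible map yields a function that is automatically nonnegative and integrates to one. A minimal one-dimensional sketch (a hypothetical affine map standing in for the paper's learnable flow, not the authors' code):

```python
import numpy as np

# Hypothetical affine flow T(x) = exp(s) * x + b pushing a standard normal
# reference density forward; s and b stand in for learnable parameters.
s, b = 0.5, 1.0

def ref_pdf(x):
    # Reference density: standard normal.
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def flow_pdf(y):
    # Change-of-variables formula: p(y) = p_ref(T^{-1}(y)) * |dT^{-1}/dy|.
    x = (y - b) * np.exp(-s)           # inverse map T^{-1}
    return ref_pdf(x) * np.exp(-s)     # inverse-Jacobian factor exp(-s)

# Positivity holds pointwise; total mass stays 1 (checked by quadrature).
ys = np.linspace(-10.0, 10.0, 20001)
p = flow_pdf(ys)
mass = p.sum() * (ys[1] - ys[0])
print(p.min() >= 0, round(mass, 6))
```

Any composition of invertible maps with tractable Jacobians preserves these two properties, which is what makes the normalizing-flow ansatz structure-preserving by design.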
📝 Abstract
We propose a new Neural Galerkin Normalizing Flow framework that approximates the transition probability density function of a diffusion process by solving the corresponding Fokker–Planck equation with an atomic initial distribution, parametrically with respect to the location of the initial mass. Using Normalizing Flows, we seek the solution as a transformation of the transition probability density function of a reference stochastic process, so that our approximation is structure-preserving and automatically satisfies positivity and mass-conservation constraints. By extending Neural Galerkin schemes to Normalizing Flows, we derive a system of ODEs for the time evolution of the flow's parameters. Adaptive sampling routines evaluate the Fokker–Planck residual at meaningful locations, which is vital when addressing high-dimensional PDEs. Numerical results show that this strategy captures key features of the true solution and enforces the causal relationship between the initial datum and the density function at subsequent times. After an offline training phase, online evaluation is significantly cheaper than solving the PDE from scratch. The proposed method is thus a promising surrogate model for many-query problems associated with stochastic differential equations, such as Bayesian inference, simulation, and diffusion bridge generation.
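The parameter ODE mentioned in the abstract can be illustrated on a toy problem. In a Neural Galerkin scheme, the parameter velocity is obtained by least-squares projection of the PDE right-hand side onto the tangent space of the parametric ansatz, with collocation points drawn adaptively from the current density. The sketch below (our own illustration under these assumptions, not the paper's method or code) does this for the heat equation u_t = u_xx with a Gaussian ansatz u(x; m, v), whose exact dynamics are m' = 0, v' = 2:

```python
import numpy as np

def grads_and_rhs(x, m, v):
    # Gaussian ansatz u(x; m, v) and its parameter/space derivatives.
    u = np.exp(-(x - m)**2 / (2 * v)) / np.sqrt(2 * np.pi * v)
    du_dm = u * (x - m) / v
    du_dv = u * ((x - m)**2 - v) / (2 * v**2)
    u_xx = u * ((x - m)**2 - v) / v**2      # PDE right-hand side F(u)
    J = np.stack([du_dm, du_dv], axis=1)    # Jacobian du/dtheta at samples
    return J, u_xx

m, v, dt = 0.0, 1.0, 0.01
rng = np.random.default_rng(0)
for _ in range(100):
    # Adaptive sampling: draw collocation points from the current density.
    x = rng.normal(m, np.sqrt(v), size=256)
    J, rhs = grads_and_rhs(x, m, v)
    # Least-squares projection: solve M(theta) theta' = f(theta).
    theta_dot, *_ = np.linalg.lstsq(J, rhs, rcond=None)
    m += dt * theta_dot[0]                  # forward Euler step in time
    v += dt * theta_dot[1]
print(m, v)  # expect m ≈ 0 and v ≈ 1 + 2 * (100 * 0.01) = 3
```

Time-stepping the resulting ODE system replaces a full PDE solve with cheap parameter updates, which is what makes the online evaluation inexpensive after the offline phase.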