Distribution learning via neural differential equations: minimal energy regularization and approximation theory

📅 2025-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work studies the theoretical foundations of Neural Ordinary Differential Equations (Neural ODEs) for approximating complex probability distributions, with applications to generative modeling, density estimation, and Bayesian inference, measuring error in the Wasserstein distance and the Kullback–Leibler (KL) divergence. The authors show that, for a large class of transport maps $T$, the straight-line displacement interpolation $(1-t)x + tT(x)$ is realized by a time-dependent ODE velocity field, and that such velocity fields are minimizers of a training objective containing a specific minimum-energy regularization. They derive explicit upper bounds on the $C^k$ norm of the velocity field that are polynomial in the $C^k$ norm of $T$; for triangular (Knothe–Rosenblatt) maps, these bounds are polynomial in the $C^k$ norms of the source and target densities. Combined with stability arguments, this yields a quantitative size-accuracy trade-off: for any $\varepsilon > 0$, Wasserstein or KL error at most $\varepsilon$ is achievable by a deep neural network representation of the velocity field whose size is bounded explicitly in terms of $\varepsilon$, the dimension $d$, and the smoothness of the source and target densities; the same network keeps the regularized training objective bounded.
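For intuition, the velocity field realizing this straight-line interpolation can be written in closed form. The following display is our own sketch of that standard calculation (the notation $X_t$, $v$ is ours, and the schematic energy term below may differ from the paper's exact regularizer):

$$X_t(x) = (1-t)\,x + t\,T(x), \qquad \partial_t X_t(x) = T(x) - x,$$

so, assuming each $X_t$ is invertible, the time-dependent field

$$v(t,y) = (T - \mathrm{id})\bigl(X_t^{-1}(y)\bigr)$$

satisfies $\partial_t X_t(x) = v(t, X_t(x))$. Each particle moves with constant speed $|T(x)-x|$, which is the heuristic reason such fields minimize a kinetic-energy regularizer of the schematic form $\int_0^1 \mathbb{E}\,\|v(t, X_t)\|^2 \, \mathrm{d}t$.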

📝 Abstract
Neural ordinary differential equations (ODEs) provide expressive representations of invertible transport maps that can be used to approximate complex probability distributions, e.g., for generative modeling, density estimation, and Bayesian inference. We show that for a large class of transport maps $T$, there exists a time-dependent ODE velocity field realizing a straight-line interpolation $(1-t)x + tT(x)$, $t \in [0,1]$, of the displacement induced by the map. Moreover, we show that such velocity fields are minimizers of a training objective containing a specific minimum-energy regularization. We then derive explicit upper bounds for the $C^k$ norm of the velocity field that are polynomial in the $C^k$ norm of the corresponding transport map $T$; in the case of triangular (Knothe--Rosenblatt) maps, we also show that these bounds are polynomial in the $C^k$ norms of the associated source and target densities. Combining these results with stability arguments for distribution approximation via ODEs, we show that Wasserstein or Kullback--Leibler approximation of the target distribution to any desired accuracy $\epsilon>0$ can be achieved by a deep neural network representation of the velocity field whose size is bounded explicitly in terms of $\epsilon$, the dimension, and the smoothness of the source and target densities. The same neural network ansatz yields guarantees on the value of the regularized training objective.
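As a concrete illustration of the construction above, the following is a minimal Python sketch (our own toy example, not the authors' code): for a hypothetical one-dimensional map $T$, it evaluates the velocity field $v(t,y) = (T - \mathrm{id})(X_t^{-1}(y))$ realizing the straight-line interpolation, integrates the ODE with forward Euler, and checks that the time-1 flow reproduces $T$ and that the path energy equals the mean squared displacement, as constant-speed (minimum-energy) paths should.

```python
import numpy as np

# Toy illustration of the straight-line interpolation X_t(x) = (1-t)x + t*T(x).
# Everything here (the choice of T, the grid-based inversion, the step count)
# is a hypothetical sketch for intuition, not the paper's method or code.

def T(x):
    # A smooth, strictly increasing 1-D transport map (illustrative choice).
    return x + 0.5 * np.tanh(x)

def velocity(t, y, x_grid):
    # v(t, y) = (T - id)(X_t^{-1}(y)); X_t is strictly increasing here, so we
    # invert it numerically by monotone interpolation on a reference grid.
    Xt = (1.0 - t) * x_grid + t * T(x_grid)
    x = np.interp(y, Xt, x_grid)          # approximate X_t^{-1}(y)
    return T(x) - x

x_grid = np.linspace(-5.0, 5.0, 2001)
x0 = np.array([-1.0, 0.3, 2.0])           # particles to transport from t = 0
y = x0.copy()

# Forward-Euler integration of dy/dt = v(t, y) on [0, 1].
n_steps = 200
dt = 1.0 / n_steps
energy = 0.0
for k in range(n_steps):
    v = velocity(k * dt, y, x_grid)
    energy += dt * np.mean(v**2)          # time-averaged kinetic energy of the paths
    y = y + dt * v

print(np.max(np.abs(y - T(x0))))          # ~0: the flow at time 1 reproduces T
print(energy, np.mean((T(x0) - x0)**2))   # ~equal: straight paths have constant speed
```

Because the exact paths are straight lines traversed at constant speed, forward Euler follows them essentially exactly here; the residual error comes only from the grid-based inversion of $X_t$.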
Problem

Research questions and friction points this paper is trying to address.

Neural ODEs for distribution learning
Minimal energy regularization in transport maps
Neural network bounds for approximation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural ODEs for distribution learning
Minimal energy regularization technique
Polynomial bounds on velocity fields
Youssef M. Marzouk
Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Zhi Ren
Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Jakob Zech
Heidelberg University
High-dimensional Problems · SciML · UQ · Numerical Analysis · Statistical Inference