🤖 AI Summary
This work investigates the distributional approximation capabilities of bi-Lipschitz normalizing flows under regularity constraints. Adopting the perspective of score-based diffusion models, it uses the probability flow ordinary differential equation (ODE) of a variance-preserving diffusion to link Lipschitz regularity of the score function to the bi-Lipschitz diffeomorphic transport maps that the ODE induces. Score regularity is verified for a broad class of target densities, including compactly supported densities, Gaussian convolutions of compactly supported measures, and finite Gaussian mixtures. Within this framework, Gaussian pullbacks induced by bi-Lipschitz variance-preserving transport maps are shown to be L¹-dense among all probability densities. For targets obtained via Gaussian convolution, convergence in Kullback–Leibler (KL) divergence is additionally guaranteed without early stopping.
📝 Abstract
Many normalizing flow architectures impose regularity constraints, yet their distributional approximation properties are not fully characterized. We study the expressivity of bi-Lipschitz normalizing flows through the lens of score-based diffusion models. For the probability flow ODE of a variance-preserving diffusion, Lipschitz regularity of the score induces a flow of bi-Lipschitz diffeomorphic transport maps. This ODE bridge allows us to analyze the distributional approximation power of bi-Lipschitz normalizing flows and, conversely, derive deterministic convergence guarantees for diffusion-based transport. Our key idea is to use the probability flow ODE to link regularity of the score to regularity of the induced transport maps. We verify score regularity for broad target densities, including compactly supported densities, Gaussian convolutions of compactly supported measures and finite Gaussian mixtures. We obtain a universal distributional approximation result: Gaussian pullbacks induced by bi-Lipschitz variance-preserving transport maps are $L^1$-dense among all probability densities. For Gaussian convolution targets, we further obtain convergence in Kullback-Leibler divergence without early stopping.
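The link between score regularity and transport-map regularity described above can be sketched with a standard Grönwall argument. The notation here is an assumption made for illustration and is not taken from the abstract: $\beta(t) > 0$ is the noise schedule, $p_t$ is the marginal density at time $t$, $L_t$ is a Lipschitz constant of the score $\nabla \log p_t$, and $\Phi_{s \to t}$ is the flow map of the ODE. The paper's precise assumptions and constants may differ.

$$
\begin{aligned}
&\text{VP diffusion:} && dX_t = -\tfrac{1}{2}\beta(t)\,X_t\,dt + \sqrt{\beta(t)}\,dW_t, \\
&\text{probability flow ODE:} && \dot{x}_t = v_t(x_t), \qquad v_t(x) := -\tfrac{1}{2}\beta(t)\bigl[x + \nabla \log p_t(x)\bigr], \\
&\text{score Lipschitz:} && \lVert \nabla \log p_t(x) - \nabla \log p_t(y) \rVert \le L_t\,\lVert x - y \rVert \\
&&& \Longrightarrow\ \lVert v_t(x) - v_t(y) \rVert \le \tfrac{1}{2}\beta(t)\,(1 + L_t)\,\lVert x - y \rVert, \\
&\text{Gr\"onwall:} && e^{-\Lambda_{s,t}}\,\lVert x - y \rVert \;\le\; \lVert \Phi_{s \to t}(x) - \Phi_{s \to t}(y) \rVert \;\le\; e^{\Lambda_{s,t}}\,\lVert x - y \rVert, \\
&&& \Lambda_{s,t} := \int_s^t \tfrac{1}{2}\beta(r)\,(1 + L_r)\,dr .
\end{aligned}
$$

The upper bound comes from applying Grönwall's inequality to the difference of two trajectories, and the lower bound from applying the same estimate to the time-reversed ODE. Whenever $\Lambda_{s,t}$ is finite, $\Phi_{s \to t}$ is therefore a bi-Lipschitz homeomorphism, and it transports $p_s$ to $p_t$.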