🤖 AI Summary
This work addresses the theoretical gap in sample complexity analysis for flow-matching generative models. Unlike prior studies relying on empirical risk minimization (ERM) assumptions, we establish the first end-to-end upper bound on sample complexity without such assumptions. Methodologically, we model the continuous flow via ordinary differential equations and parameterize the velocity field using neural networks; we then introduce a triple-error decomposition framework—comprising neural approximation error, statistical error, and optimization error—and rigorously analyze its convergence. Our theoretical analysis shows that $O(\varepsilon^{-4})$ samples suffice to achieve $O(\varepsilon)$ generative accuracy in the Wasserstein-2 distance. This constitutes the first rigorous, non-ERM-dependent sample complexity guarantee for flow matching, filling a critical theoretical void. Moreover, our result provides foundational insights for efficient training and generalization analysis of flow-based generative models.
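To make the setup concrete, below is a minimal numerical sketch of a flow-matching objective with a neural-network-style velocity field replaced by a toy affine model. This is an illustration only: the straight-line probability path, the `velocity` parameterization, and the finite-difference training loop are assumptions for the sketch, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def velocity(theta, x, t):
    # Toy affine stand-in for a neural velocity field: v_theta(x, t) = a*x + b*t + c.
    a, b, c = theta
    return a * x + b * t + c

def flow_matching_loss(theta, x1, rng):
    # Endpoints: x0 from the reference Gaussian, x1 from the data distribution.
    x0 = rng.standard_normal(x1.shape)
    t = rng.uniform(size=(x1.shape[0], 1))
    xt = (1.0 - t) * x0 + t * x1   # point on the straight-line path between endpoints
    target = x1 - x0               # conditional target velocity along that path
    return np.mean((velocity(theta, xt, t) - target) ** 2)

# Toy 1-D "data" distribution.
data = rng.normal(loc=2.0, scale=0.5, size=(512, 1))

# Crude finite-difference gradient descent over a finite number of steps
# (mirroring the "optimization error from finitely many steps" in the analysis).
theta = np.zeros(3)
for step in range(200):
    grad = np.zeros_like(theta)
    for i in range(3):
        e = np.zeros(3)
        e[i] = 1e-4
        # Common random numbers per step so both evaluations see the same batch.
        loss_plus = flow_matching_loss(theta + e, data, np.random.default_rng(step))
        loss_minus = flow_matching_loss(theta - e, data, np.random.default_rng(step))
        grad[i] = (loss_plus - loss_minus) / 2e-4
    theta -= 0.1 * grad
```

Minimizing this regression loss drives the parameterized velocity field toward the target velocity, after which samples are generated by integrating the learned ODE from the reference distribution.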
📝 Abstract
Flow matching has recently emerged as a promising alternative to diffusion-based generative models, offering faster sampling and simpler training by learning continuous flows governed by ordinary differential equations. Despite growing empirical success, the theoretical understanding of flow matching remains limited, particularly in terms of sample complexity results. In this work, we provide the first analysis of the sample complexity for flow-matching-based generative models without assuming access to the empirical risk minimizer (ERM) of the loss function for estimating the velocity field. Under standard assumptions on the loss function for velocity field estimation and boundedness of the data distribution, we show that with $\mathcal{O}(\varepsilon^{-4})$ samples, a sufficiently expressive neural network can learn a velocity field such that the Wasserstein-2 distance between the learned and the true distribution is less than $\mathcal{O}(\varepsilon)$. The key technical idea is to decompose the velocity field estimation error into neural-network approximation error, statistical error due to the finite sample size, and optimization error due to the finite number of optimization steps for estimating the velocity field. Each of these terms is then handled via techniques that may be of independent interest.
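The decomposition described as the key technical idea can be sketched schematically as follows; the symbols here are illustrative placeholders, not the paper's notation:

```latex
% Hedged sketch of the triple-error decomposition: \hat v is the learned
% velocity field after T optimization steps on n samples, v^\star the true one.
\mathbb{E}\,\bigl\|\hat v - v^\star\bigr\|_{L^2}^2
  \;\lesssim\;
  \underbrace{\inf_{\theta\in\Theta}\bigl\|v_\theta - v^\star\bigr\|_{L^2}^2}_{\text{approximation error}}
  \;+\;
  \underbrace{\mathcal{E}_{\mathrm{stat}}(n)}_{\text{statistical error}}
  \;+\;
  \underbrace{\mathcal{E}_{\mathrm{opt}}(T)}_{\text{optimization error}}
```

Bounding each term separately, and propagating the velocity-field error through the ODE flow to the Wasserstein-2 distance, yields the stated $\mathcal{O}(\varepsilon^{-4})$ sample complexity.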