🤖 AI Summary
This work investigates the origin of generalization in flow matching (FM), specifically testing the hypothesis that the stochasticity of the conditional flow matching target drives generalization. Through analysis of the closed-form loss and systematic experiments, the authors rule out this hypothesis as a primary factor: in high dimensions, the closed-form deterministic loss and the stochastic loss are nearly equivalent, and generalization performance does not depend on target randomness. The closed-form FM formulation, which regresses onto a deterministic target velocity field, is evaluated on CIFAR-10 and ImageNet. Empirical results show that state-of-the-art (SOTA) models retain, and in some cases improve, their Fréchet Inception Distance (FID) when trained with the deterministic loss. This suggests that generalization is better explained by other factors, such as the inductive biases of the model architecture, than by noise in the loss. The work supports closed-form FM as a sound alternative that is competitive with, and sometimes better than, its stochastic counterpart in image generation.
📝 Abstract
Modern deep generative models can now produce high-quality synthetic samples that are often indistinguishable from real training data. A growing body of research aims to understand why recent methods -- such as diffusion and flow matching techniques -- generalize so effectively. Among the proposed explanations are the inductive biases of deep learning architectures and the stochastic nature of the conditional flow matching loss. In this work, we rule out the latter -- the noisy nature of the loss -- as a primary contributor to generalization in flow matching. First, we empirically show that in high-dimensional settings, the stochastic and closed-form versions of the flow matching loss yield nearly equivalent losses. Then, using state-of-the-art flow matching models on standard image datasets, we demonstrate that both variants achieve comparable statistical performance, with the surprising observation that using the closed-form can even improve performance.
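To make the distinction between the two losses concrete, here is a minimal NumPy sketch of the two regression targets. It is not the authors' code: it assumes the common linear interpolation path x_t = (1 - t) x0 + t x1 with x0 drawn from a standard Gaussian and an empirical target distribution over a finite training set. The helper names `conditional_target` and `closed_form_velocity`, and the toy data, are hypothetical and chosen only for illustration.

```python
import numpy as np

def conditional_target(x0, x1):
    """Stochastic CFM regression target for the path x_t = (1 - t) * x0 + t * x1."""
    return x1 - x0

def closed_form_velocity(x, t, data):
    """Closed-form marginal velocity for an empirical target (hypothetical helper).

    Assumes x0 ~ N(0, I), so p_t(x | x1 = x_i) = N(t * x_i, (1 - t)^2 I) and
    u*(x, t) = sum_i w_i(x, t) * (x_i - x) / (1 - t), with softmax weights w_i.
    Valid for t < 1.
    """
    sq_dist = np.sum((x - t * data) ** 2, axis=1)   # ||x - t * x_i||^2 for each i
    logits = -sq_dist / (2.0 * (1.0 - t) ** 2)
    logits -= logits.max()                          # stabilise the softmax
    weights = np.exp(logits)
    weights /= weights.sum()
    # The weights sum to 1, so sum_i w_i * (x_i - x) = weights @ data - x.
    return (weights @ data - x) / (1.0 - t)

rng = np.random.default_rng(0)
data = rng.normal(size=(8, 16))    # toy "training set": 8 points in 16 dimensions
x0 = rng.normal(size=16)           # noise sample
x1 = data[3]                       # training point paired with x0
t = 0.7
xt = (1 - t) * x0 + t * x1         # point on the interpolation path

u_cond = conditional_target(x0, x1)            # depends on the sampled pair (x0, x1)
u_closed = closed_form_velocity(xt, t, data)   # depends only on (xt, t)
```

The stochastic loss regresses the network output at `(xt, t)` onto `u_cond`, which varies with the sampled pair, while the closed-form loss regresses onto `u_closed`, a deterministic function of `(xt, t)`. Under these assumptions the softmax weights tend to concentrate on a single training point as the dimension grows, which is one way to read the abstract's observation that the two losses are nearly equivalent in high dimensions.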