On The Hidden Biases of Flow Matching Samplers

📅 2025-12-18

🤖 AI Summary

This work identifies structural and energetic biases in empirical flow matching (FM) samplers. The empirical FM minimizer is almost never a gradient field, even when each conditional flow is one, so its velocity field is energetically suboptimal. The authors then characterize the kinetic energy of generated samples: with Gaussian sources, both the instantaneous and the integrated kinetic energy concentrate exponentially, while heavy-tailed sources produce polynomial tails. This behavior is governed primarily by the source distribution rather than the target data. The analysis combines empirical FM modeling, optimal transport arguments, kinetic energy concentration inequalities, and probabilistic tail bounds, and it highlights the role of source distribution selection in shaping sampling dynamics.

📝 Abstract

We study the implicit bias of flow matching (FM) samplers via the lens of empirical flow matching. Although population FM may produce gradient-field velocities resembling optimal transport (OT), we show that the empirical FM minimizer is almost never a gradient field, even when each conditional flow is. Consequently, empirical FM is intrinsically energetically suboptimal. In view of this, we analyze the kinetic energy of generated samples. With Gaussian sources, both instantaneous and integrated kinetic energies exhibit exponential concentration, while heavy-tailed sources lead to polynomial tails. These behaviors are governed primarily by the choice of source distribution rather than the data. Overall, these notes provide a concise mathematical account of the structural and energetic biases arising in empirical FM.
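One way to make the gradient-field claim concrete is the standard criterion that a smooth velocity field on a simply connected domain is a gradient field exactly when its Jacobian is symmetric everywhere. The sketch below is a numerical diagnostic built on that criterion, not the paper's proof. The helper names `jacobian_fd` and `asymmetry` and the two test fields are hypothetical and chosen only for illustration; neither test field is the empirical FM minimizer.

```python
import numpy as np

def jacobian_fd(v, x, eps=1e-5):
    """Central finite-difference Jacobian J[i, j] = d v_i / d x_j at the point x."""
    d = x.shape[0]
    J = np.zeros((d, d))
    for j in range(d):
        e = np.zeros(d)
        e[j] = eps
        J[:, j] = (v(x + e) - v(x - e)) / (2.0 * eps)
    return J

def asymmetry(v, x, eps=1e-5):
    """Frobenius norm of the antisymmetric part of the Jacobian of v at x.

    A value near zero at every point is consistent with v being a gradient
    field; a clearly nonzero value at any point rules it out.
    """
    J = jacobian_fd(v, x, eps)
    return np.linalg.norm(0.5 * (J - J.T))

def grad_field(x):
    # Illustrative gradient field: gradient of 0.5*||x||^2 + sin(x_0) * x_1.
    return np.array([x[0] + np.cos(x[0]) * x[1], x[1] + np.sin(x[0])])

def rot_field(x):
    # Illustrative non-gradient field: a pure rotation.
    return np.array([-x[1], x[0]])

x = np.array([0.3, -1.2])
print("gradient field asymmetry:", asymmetry(grad_field, x))
print("rotation field asymmetry:", asymmetry(rot_field, x))
```

To apply the same check to a learned or closed-form FM velocity, pass a function `v` that evaluates the velocity at a fixed time and scan several points, since a single symmetric Jacobian does not establish the gradient property.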

Problem

Research questions and friction points this paper is trying to address.

Analyzes implicit biases in flow matching samplers

Shows empirical flow matching is energetically suboptimal

Examines kinetic energy concentration in generated samples

Innovation

Methods, ideas, or system contributions that make the work stand out.

The empirical flow matching minimizer is almost never a gradient field

Kinetic energy concentrates exponentially with Gaussian sources

Source distribution primarily governs energetic biases
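The role of the source distribution in the points above can be explored numerically. The sketch below assumes a specific setup that the summary does not spell out: linear conditional paths x_t = (1 - t) x_0 + t x_1, a source x_0 drawn independently of a finite data set, and x_1 uniform over that data set. Under these assumptions the marginal velocity has the closed form u_t(x) = (sum_i w_i(x) y_i - x) / (1 - t), with weights w_i(x) proportional to p_0((x - t y_i) / (1 - t)). All function names, the toy data, the Student-t source with 3 degrees of freedom, and the cutoff `t_end=0.95` (used to avoid the singularity at t = 1) are illustrative assumptions.

```python
import numpy as np

def empirical_velocity(x, t, data, log_p0):
    """Closed-form marginal velocity under the assumptions stated above.

    x: (n, d) current positions, data: (m, d) data points, log_p0: log source
    density up to an additive constant, applied along the last axis.
    """
    z = (x[:, None, :] - t * data[None, :, :]) / (1.0 - t)  # implied x_0 per data point
    logw = log_p0(z)                                         # (n, m) posterior log-weights
    logw = logw - logw.max(axis=1, keepdims=True)            # stabilize the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return (w @ data - x) / (1.0 - t)

def log_gauss(z):
    return -0.5 * np.sum(z**2, axis=-1)

def make_log_student(nu, d):
    def log_student(z):
        # Multivariate Student-t log density up to a constant.
        return -0.5 * (nu + d) * np.log1p(np.sum(z**2, axis=-1) / nu)
    return log_student

def kinetic_energies(x0, data, log_p0, t_end=0.95, steps=50):
    """Euler-integrate the sampler; return instantaneous and integrated kinetic energy."""
    x = x0.copy()
    dt = t_end / steps
    inst = np.zeros((steps, x.shape[0]))
    for k in range(steps):
        u = empirical_velocity(x, k * dt, data, log_p0)
        inst[k] = 0.5 * np.sum(u**2, axis=1)
        x = x + dt * u
    return inst, inst.sum(axis=0) * dt

rng = np.random.default_rng(0)
n, m, d, nu = 2000, 16, 2, 3.0
data = rng.normal(size=(m, d))                      # toy data set (assumption)
x0_gauss = rng.normal(size=(n, d))
# Multivariate Student-t samples: Gaussian divided by sqrt(chi-square / nu).
x0_student = rng.normal(size=(n, d)) / np.sqrt(rng.chisquare(nu, size=(n, 1)) / nu)

_, E_gauss = kinetic_energies(x0_gauss, data, log_gauss)
_, E_student = kinetic_energies(x0_student, data, make_log_student(nu, d))

for name, E in [("gaussian", E_gauss), ("student-t", E_student)]:
    q50, q999 = np.quantile(E, [0.5, 0.999])
    print(f"{name}: median={q50:.3f}, 99.9% quantile={q999:.3f}")
```

Comparing the upper quantiles of the integrated energy for the two sources, with the data set held fixed, gives a rough empirical view of the tail behavior. It is a sanity check on a toy problem rather than a verification of the stated bounds.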
