Coreset-Induced Conditional Velocity Flow Matching

📅 2026-05-12
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses an inefficiency in hierarchical flow matching: the inner flow must transport isotropic Gaussian noise to a multimodal target velocity distribution from scratch. The authors replace the isotropic noise source with a closed-form Gaussian mixture built from an entropy-regularized Sinkhorn coreset of the target. This mixture can be sampled without a learned neural sampler, and a lightweight correction flow learns only the residual surrogate-to-target mapping. The analysis shows that, under an explicit compression assumption, the surrogate transport cost equals the target-surrogate Wasserstein gap, whereas the noise-source analogue has a lower bound that scales with dimension. On MNIST, CIFAR-10, ImageNet-32, and CelebA-HQ, the method reaches competitive few-step generation under matched architectures.
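To make the coreset step concrete, here is a minimal NumPy sketch of one generic way to compress a point cloud into atoms with entropic optimal transport. It is not the authors' algorithm: the function names (`sinkhorn_plan`, `fit_coreset`), the uniform atom weights, the squared-Euclidean cost, the barycentric atom update, and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np

def _logsumexp(z, axis):
    # Numerically stable log(sum(exp(z))) along one axis.
    m = np.max(z, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(z - m), axis=axis))

def sinkhorn_plan(X, Y, a, b, eps=0.5, n_iters=200):
    """Entropic OT plan between weighted point sets (X, a) and (Y, b).

    Log-domain Sinkhorn iterations with squared-Euclidean cost.
    Returns P with P[i, j] = a[i] * b[j] * exp((f[i] + g[j] - C[i, j]) / eps).
    """
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # (n, m) cost matrix
    log_a, log_b = np.log(a), np.log(b)
    f = np.zeros(len(a))
    g = np.zeros(len(b))
    for _ in range(n_iters):
        # Each update enforces one marginal constraint exactly.
        f = -eps * _logsumexp((g[None, :] - C) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - C) / eps + log_a[:, None], axis=0)
    log_P = (f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :]
    return np.exp(log_P)

def fit_coreset(X, m, eps=0.5, n_outer=5, seed=0):
    """Hypothetical coreset fit: alternate Sinkhorn and barycentric updates.

    Assumption: atom weights stay uniform; only atom locations move.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    a = np.full(n, 1.0 / n)  # uniform weights on data points
    b = np.full(m, 1.0 / m)  # uniform weights on atoms (simplification)
    Y = X[rng.choice(n, size=m, replace=False)].copy()  # init atoms from data
    for _ in range(n_outer):
        P = sinkhorn_plan(X, Y, a, b, eps=eps)
        # Move each atom to the plan-weighted mean of the data assigned to it.
        Y = (P.T @ X) / P.sum(axis=0)[:, None]
    return Y, b

# Toy usage: compress a two-cluster cloud in 2D into 8 atoms.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2.0, 0.3, (100, 2)),
                    rng.normal(2.0, 0.3, (100, 2))])
atoms, atom_weights = fit_coreset(X, m=8)
```

Each atom is a convex combination of data points, so the atoms stay inside the data's bounding box. Lifting them to a Gaussian mixture then amounts to placing a Gaussian at each atom; the choice of covariance is left open here because the summary does not specify it.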
📝 Abstract
We propose Coreset-Induced Conditional Velocity Flow Matching (CCVFM), a generative model that augments hierarchical rectified flow with a data-informed source distribution. Hierarchical flow matching models the full conditional velocity law in velocity space, but its inner flow is asked to transport isotropic Gaussian noise to a multimodal target velocity distribution from scratch. Our key observation is that this inner source can be replaced by a closed-form surrogate built from a coreset of the target. CCVFM first compresses the target into weighted atoms using an entropic Sinkhorn coreset and lifts them to a Gaussian mixture. The induced conditional velocity law is then a closed-form Gaussian mixture that can be sampled without a learned neural sampler. A lightweight correction flow, trained from this exact surrogate source, then refines the remaining surrogate-to-target residual rather than learning an entire noise-to-data map. We prove that the surrogate transport cost equals the target-surrogate Wasserstein gap under an explicit compression assumption, whereas the noise-source analogue has a dimension-scale lower bound. We further characterize the conditional second moment of the direct surrogate-source training target and show that its source-dependent excess is small when the surrogate conditional law is close to the true conditional velocity law in mean and covariance. Empirically, on MNIST, CIFAR-10, ImageNet-32, and CelebA-HQ, the proposed method reaches competitive few-step generation under matched architectures.
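The claim that the induced conditional velocity law is a closed-form Gaussian mixture can be illustrated with standard Gaussian conditioning. The sketch below is not taken from the paper; it assumes a linear interpolant x_t = (1 - t) x_0 + t x_1 with velocity v = x_1 - x_0, an independent coupling between x_0 ~ N(0, I) and the surrogate x_1, and isotropic mixture components N(mu_k, sigma_k^2 I) with sigma_k > 0. The function names are hypothetical. Under these assumptions, (x_t, v) is jointly Gaussian within each component, so v given x_t is again a mixture with reweighted components.

```python
import numpy as np

def conditional_velocity_gmm(x_t, t, weights, means, sigmas):
    """Parameters of the law of v = x_1 - x_0 given x_t (assumed setup).

    x_t: (d,) point on the path; t: scalar in [0, 1].
    weights: (K,), means: (K, d), sigmas: (K,) isotropic std devs (> 0).
    Returns posterior weights (K,), conditional means (K, d), and
    per-dimension conditional variances (K,).
    """
    d = x_t.shape[0]
    var_x = (1.0 - t) ** 2 + t ** 2 * sigmas ** 2  # Var(x_t | k), per dim
    cov_xv = t * sigmas ** 2 - (1.0 - t)           # Cov(x_t, v | k), per dim
    resid = x_t[None, :] - t * means               # x_t - E[x_t | k]
    # Posterior responsibility of each component given x_t; the shared
    # (2*pi)^(-d/2) factor cancels in the normalization.
    log_post = (np.log(weights)
                - 0.5 * d * np.log(var_x)
                - 0.5 * (resid ** 2).sum(axis=1) / var_x)
    log_post -= log_post.max()
    post = np.exp(log_post)
    post /= post.sum()
    # Gaussian conditioning within each component.
    cond_means = means + (cov_xv / var_x)[:, None] * resid
    cond_vars = sigmas ** 2 / var_x  # equals (1 + sigma^2) - cov_xv^2 / var_x
    return post, cond_means, cond_vars

def sample_velocity(x_t, t, weights, means, sigmas, rng):
    # Draw one velocity: pick a component, then sample its Gaussian.
    post, cond_means, cond_vars = conditional_velocity_gmm(
        x_t, t, weights, means, sigmas)
    k = rng.choice(len(post), p=post)
    return cond_means[k] + np.sqrt(cond_vars[k]) * rng.standard_normal(x_t.shape[0])
```

Two sanity checks follow from the formulas. At t = 0 the conditional mean of each component is mu_k - x_0 with variance sigma_k^2, and at t = 1 it is x_1 with unit variance, matching v = x_1 - x_0 when one endpoint is observed. Sampling needs only a categorical draw and a Gaussian draw, which is what makes a learned neural sampler unnecessary for the surrogate in this simplified setting.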
Problem

Research questions and friction points this paper is trying to address.

flow matching
conditional velocity
generative modeling
multimodal distribution
Wasserstein distance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Coreset
Conditional Velocity Flow Matching
Rectified Flow
Gaussian Mixture Surrogate
Wasserstein Gap