VeCoR - Velocity Contrastive Regularization for Flow Matching

📅 2025-11-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Flow matching (FM) can accumulate errors along the integration trajectory of the learned velocity field, pushing generated samples off the data manifold. The effect is most pronounced with low-step sampling or lightweight models, where it degrades perceptual quality. To address this, the paper proposes Velocity Contrastive Regularization (VeCoR), an attract-repel training scheme that supervises the velocity field from two sides: it aligns the predicted velocity with a stable reference direction (positive supervision) and pushes it away from inconsistent, off-manifold directions (negative supervision). This turns the conventional one-sided, purely attractive FM objective into a two-sided training signal that regularizes trajectory evolution. On ImageNet-1K 256×256, VeCoR yields relative FID reductions of 22% (SiT-XL/2) and 35% (REPA-SiT-XL/2); on MS-COCO text-to-image generation it yields a 32% relative FID gain. The authors also report improved stability and convergence, particularly in low-step and lightweight settings.

📝 Abstract

Flow Matching (FM) has recently emerged as a principled and efficient alternative to diffusion models. Standard FM encourages the learned velocity field to follow a target direction; however, it may accumulate errors along the trajectory and drive samples off the data manifold, leading to perceptual degradation, especially in lightweight or low-step configurations. To enhance stability and generalization, we extend FM into a balanced attract-repel scheme that provides explicit guidance on both "where to go" and "where not to go." Formally, we propose Velocity Contrastive Regularization (VeCoR), a complementary training scheme for flow-based generative modeling that augments the standard FM objective with contrastive, two-sided supervision. VeCoR not only aligns the predicted velocity with a stable reference direction (positive supervision) but also pushes it away from inconsistent, off-manifold directions (negative supervision). This contrastive formulation transforms FM from a purely attractive, one-sided objective into a two-sided training signal, regularizing trajectory evolution and improving perceptual fidelity across datasets and backbones. On ImageNet-1K 256$\times$256, VeCoR yields 22% and 35% relative FID reductions on SiT-XL/2 and REPA-SiT-XL/2 backbones, respectively, and achieves further FID gains (32% relative) on MS-COCO text-to-image generation, demonstrating consistent improvements in stability, convergence, and image quality, particularly in low-step and lightweight settings. Project page: https://p458732.github.io/VeCoR_Project_Page/
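
For reference, the one-sided objective described in the abstract can be written out under the common linear-interpolation convention. That convention is an assumption here, since the abstract does not fix a path or notation. The second expression is only an illustrative way to write a two-sided signal: the symbols $v^{+}$, $v^{-}$, and $\lambda$ are hypothetical names, and the paper's exact loss and its construction of negatives may differ.

$$
x_t = (1-t)\,x_0 + t\,x_1, \qquad
\mathcal{L}_{\mathrm{FM}} = \mathbb{E}_{t,\,x_0,\,x_1}\,\big\| v_\theta(x_t, t) - (x_1 - x_0) \big\|^2
$$

$$
\mathcal{L}_{\text{two-sided}} = \mathbb{E}\Big[\, \big\| v_\theta(x_t, t) - v^{+} \big\|^2 \;-\; \lambda\, \big\| v_\theta(x_t, t) - v^{-} \big\|^2 \,\Big]
$$

Here $x_0$ is a noise sample, $x_1$ a data sample, $v^{+}$ a reference (positive) velocity, $v^{-}$ an inconsistent (negative) velocity, and $\lambda > 0$ a weight on the repulsion term.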

Problem

Research questions and friction points this paper is trying to address.

Standard FM can accumulate errors along the sampling trajectory, driving samples off the data manifold

The resulting off-manifold drift degrades perceptual fidelity

Degradation is most severe in lightweight and low-step configurations
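
The low-step failure mode listed above can be illustrated with a toy example. The sketch below is not the paper's setup: it integrates a hand-written rotational velocity field (an assumption chosen because its exact trajectories stay on the unit circle, which stands in for a "manifold") with fixed-step Euler, and reports how far the endpoint has drifted off the circle. The function name `euler_radius_drift` is hypothetical.

```python
import math


def euler_radius_drift(num_steps, total_time=1.0):
    """Integrate dx/dt = (-y, x) from (1, 0) with fixed-step Euler.

    The exact flow stays on the unit circle, so |radius - 1| at the end
    measures how far the discrete trajectory has left that circle.
    """
    h = total_time / num_steps
    x, y = 1.0, 0.0
    for _ in range(num_steps):
        vx, vy = -y, x                  # velocity at the current point
        x, y = x + h * vx, y + h * vy   # one explicit Euler step
    return abs(math.hypot(x, y) - 1.0)


if __name__ == "__main__":
    # Fewer steps means a larger step size and a larger drift off the circle.
    for n in (4, 16, 64):
        print(n, euler_radius_drift(n))
```

Each Euler step scales the radius by sqrt(1 + h^2), so the drift grows as the step count shrinks. In a learned model, velocity prediction error adds to this discretization error, which is one plausible reading of why low-step sampling is singled out.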

Innovation

Methods, ideas, or system contributions that make the work stand out.

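Velocity Contrastive Regularization (VeCoR) augments the standard FM objective with contrastive, two-sided supervision

Positive supervision aligns the predicted velocity with a stable reference direction; negative supervision pushes it away from inconsistent, off-manifold directions

Improves stability, convergence, and image quality, particularly in low-step and lightweight settings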
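
A minimal numeric sketch of an attract-repel velocity loss follows. It assumes a squared-error attraction term minus a weighted squared-error repulsion term; this form, the weight `lam`, and the names `mean_sq_dist` and `attract_repel_loss` are illustrative assumptions, not the authors' published loss. How the paper builds its negative directions is not stated on this page, so `v_neg` is passed in directly.

```python
def mean_sq_dist(a, b):
    """Mean squared difference between two equal-length velocity vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def attract_repel_loss(v_pred, v_pos, v_neg, lam=0.05):
    """Hypothetical two-sided loss: pull v_pred toward v_pos, push it from v_neg.

    lam is an assumed repulsion weight; keep it below 1 so the attraction
    term dominates and the loss stays bounded below.
    """
    attract = mean_sq_dist(v_pred, v_pos)   # "where to go"
    repel = mean_sq_dist(v_pred, v_neg)     # "where not to go"
    return attract - lam * repel


if __name__ == "__main__":
    v_pos = [1.0, 0.0]    # reference direction
    v_neg = [-1.0, 0.0]   # inconsistent direction
    print(attract_repel_loss([1.0, 0.0], v_pos, v_neg))   # prediction on the positive
    print(attract_repel_loss([-1.0, 0.0], v_pos, v_neg))  # prediction on the negative
```

For this particular form with `lam < 1`, the loss is minimized at `(v_pos - lam * v_neg) / (1 - lam)`, so the optimum sits on the far side of the positive from the negative rather than exactly on the positive.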
