🤖 AI Summary
In conditional generative modeling, existing diffusion and flow-matching approaches map standard Gaussian noise to the conditional data distribution, tightly coupling conditional injection with mass transport and thereby increasing model complexity and slowing training. To address this, we propose Condition-Aware Reparameterization for Flow Matching (CAR-Flow), a flow-matching framework that introduces lightweight, learnable, condition-aware shifts to the source (noise) distribution, the target (data) distribution, or both. Repositioning these distributions explicitly decouples conditional injection from learning the transport path. The reparameterization incurs negligible parameter overhead (under 0.6%) while substantially shortening the probability path the model must transport. On ImageNet-256, CAR-Flow reduces the FID of SiT-XL/2 from 2.07 to 1.68, improving both training efficiency and generation quality.
📝 Abstract
Conditional generative modeling aims to learn a conditional data distribution from samples containing data-condition pairs. For this, diffusion- and flow-based methods have attained compelling results. These methods use a learned (flow) model to transport initial standard Gaussian noise, which is independent of the condition, to the conditional data distribution. The model is hence required to learn both mass transport and conditional injection. To ease the demand on the model, we propose Condition-Aware Reparameterization for Flow Matching (CAR-Flow), a lightweight, learned shift that conditions the source distribution, the target distribution, or both. By relocating these distributions, CAR-Flow shortens the probability path the model must learn, leading to faster training in practice. On low-dimensional synthetic data, we visualize and quantify the effects of CAR-Flow. On higher-dimensional natural image data (ImageNet-256), equipping SiT-XL/2 with CAR-Flow reduces FID from 2.07 to 1.68, while introducing less than 0.6% additional parameters.
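To make the idea concrete, here is a minimal PyTorch sketch of a flow-matching training step with learned condition-aware shifts applied to both endpoints. It assumes a standard linear-interpolation path; all names (`ConditionShift`, `VelocityMLP`, `car_flow_loss`) are hypothetical stand-ins for illustration, not the paper's actual implementation or architecture (the paper uses SiT-XL/2).

```python
import torch
import torch.nn as nn

class ConditionShift(nn.Module):
    """Tiny learned map from a condition embedding to a per-sample offset.

    This is the lightweight reparameterization: a single linear layer,
    consistent with the paper's sub-0.6% parameter overhead."""
    def __init__(self, cond_dim: int, data_dim: int):
        super().__init__()
        self.proj = nn.Linear(cond_dim, data_dim)

    def forward(self, cond_emb: torch.Tensor) -> torch.Tensor:
        return self.proj(cond_emb)

class VelocityMLP(nn.Module):
    """Stand-in velocity network for illustration only."""
    def __init__(self, data_dim: int, cond_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim + cond_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, data_dim),
        )

    def forward(self, x, t, cond_emb):
        return self.net(torch.cat([x, t, cond_emb], dim=-1))

def car_flow_loss(model, shift_src, shift_tgt, x1, cond_emb):
    """Flow-matching loss with condition-aware shifts on source and target."""
    x0 = torch.randn_like(x1)                    # condition-agnostic Gaussian source
    x0_s = x0 + shift_src(cond_emb)              # relocate the source toward the condition
    x1_s = x1 + shift_tgt(cond_emb)              # optionally relocate the target as well
    t = torch.rand(x1.shape[0], 1, device=x1.device)
    xt = (1 - t) * x0_s + t * x1_s               # linear path between shifted endpoints
    v_target = x1_s - x0_s                       # constant target velocity along that path
    return ((model(xt, t, cond_emb) - v_target) ** 2).mean()

# Usage: one training step on random stand-in data.
data_dim, cond_dim, batch = 8, 4, 16
model = VelocityMLP(data_dim, cond_dim)
shift_src = ConditionShift(cond_dim, data_dim)
shift_tgt = ConditionShift(cond_dim, data_dim)
opt = torch.optim.Adam(
    [*model.parameters(), *shift_src.parameters(), *shift_tgt.parameters()], lr=1e-3
)

x1 = torch.randn(batch, data_dim)                # stand-in "data"
cond = torch.randn(batch, cond_dim)              # stand-in condition embeddings
loss = car_flow_loss(model, shift_src, shift_tgt, x1, cond)
loss.backward()
opt.step()
```

Because both endpoints are shifted, the model only has to learn transport between already condition-aligned distributions, which is what shortens the probability path. At sampling time one would integrate the learned ODE from the shifted source and then invert the target-side shift to recover data samples; the paper's exact parameterization may differ from this sketch.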