🤖 AI Summary
This work addresses the challenge of fine-tuning flow matching models under data scarcity, distribution shift, or efficiency constraints, where conventional fine-tuning often degrades the accuracy and efficiency acquired during pretraining. To overcome this, the authors propose a Gradual Fine-Tuning (GFT) framework that constructs a continuous optimization path from the pretrained model to the target distribution via a temperature scheduling mechanism, enabling stable and efficient adaptation. GFT is compatible with arbitrary couplings, including optimal transport, and provides theoretical convergence guarantees for both its marginal and conditional variants. Experimental results demonstrate that GFT substantially improves training stability, shortens the probability path to accelerate inference, and maintains generation quality on par with standard fine-tuning.
📝 Abstract
Fine-tuning flow matching models is a central challenge in settings with limited data, evolving distributions, or strict efficiency demands, where unconstrained fine-tuning can erode the accuracy and efficiency gains learned during pretraining. Prior work has produced theoretical guarantees and empirical advances for reward-based fine-tuning formulations, but these methods often impose restrictions on permissible drift structure or training techniques. In this work, we propose Gradual Fine-Tuning (GFT), a principled framework for fine-tuning flow-based generative models when samples from the target distribution are available. For stochastic flows, GFT defines a temperature-controlled sequence of intermediate objectives that smoothly interpolate between the pretrained and target drifts, approaching the true target as the temperature approaches zero. We prove convergence results for both marginal and conditional GFT objectives, enabling the use of suitable (e.g., optimal transport) couplings during GFT while preserving correctness. Empirically, GFT improves convergence stability and shortens probability paths, resulting in faster inference, while maintaining generation quality comparable to standard fine-tuning. Our results position GFT as a theoretically grounded and practically effective alternative for scalable adaptation of flow matching models under distribution shift.
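The temperature-controlled interpolation described in the abstract can be sketched in code. The sketch below is an illustrative assumption, not the authors' implementation: the model class `MLPDrift`, the linear blend of the frozen pretrained drift with the conditional flow-matching target, and the name `gft_step` are all hypothetical, chosen only to make the idea of a temperature-indexed objective concrete.

```python
import torch
import torch.nn as nn

class MLPDrift(nn.Module):
    """Toy time-conditioned drift (velocity) network for illustration."""
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        # Concatenate state and time as network input.
        return self.net(torch.cat([x, t], dim=-1))

def gft_step(student, pretrained, x_target, tau, opt):
    """One hypothetical GFT update at temperature tau in [0, 1].

    The regression target blends the frozen pretrained drift with the
    conditional flow-matching target (x1 - x0); as tau -> 0 the
    objective reduces to standard flow matching on the target data.
    """
    x1 = x_target
    x0 = torch.randn_like(x1)            # base (noise) sample
    t = torch.rand(x1.shape[0], 1)       # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1           # linear probability path
    target_drift = x1 - x0               # conditional FM regression target
    with torch.no_grad():
        prior_drift = pretrained(xt, t)  # frozen pretrained drift
    blended = tau * prior_drift + (1 - tau) * target_drift
    loss = ((student(xt, t) - blended) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In use, `tau` would be annealed from 1 toward 0 over the course of fine-tuning, so each intermediate objective stays close to the previous one and the final objective matches standard flow matching on the target distribution, mirroring the continuous optimization path the paper describes.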