Value Gradient Guidance for Flow Matching Alignment

📅 2025-12-04

🤖 AI Summary

Existing methods for aligning flow-matching models with human preferences struggle to combine efficient fine-tuning with probabilistically sound preservation of the pre-trained prior. This work proposes VGG-Flow, a gradient-matching method grounded in optimal control theory: the difference between the fine-tuned and pre-trained velocity fields is trained to match the gradient field of a value function, which brings first-order reward information into the update while keeping the model close to the prior. A heuristic initialization of the value function speeds up adaptation. On Stable Diffusion 3, the method fine-tunes the model under a limited computational budget while achieving effective, prior-preserving alignment. The core contribution is an optimal-control formulation of flow-matching fine-tuning that handles reward-driven optimization and the prior constraint within a single framework.

📝 Abstract

While methods exist for aligning flow matching models--a popular and effective class of generative models--with human preferences, existing approaches fail to achieve both adaptation efficiency and probabilistically sound prior preservation. In this work, we leverage the theory of optimal control and propose VGG-Flow, a gradient-matching-based method for finetuning pretrained flow matching models. The key idea behind this algorithm is that the optimal difference between the finetuned velocity field and the pretrained one should be matched with the gradient field of a value function. This method not only incorporates first-order information from the reward model but also benefits from heuristic initialization of the value function to enable fast adaptation. Empirically, we show on a popular text-to-image flow matching model, Stable Diffusion 3, that our method can finetune flow matching models under limited computational budgets while achieving effective and prior-preserving alignment.
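The gradient-matching idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `gradient_matching_loss`, the sign convention (residual matched to the negative value gradient), the `1/lam` scaling, and the toy quadratic value function are all assumptions chosen for illustration, following a common quadratic-control-cost convention.

```python
import numpy as np


def gradient_matching_loss(v_finetuned, v_pretrained, value_grad, lam=1.0):
    """Mean squared mismatch between the velocity residual and -grad V / lam.

    All arrays have shape (batch, dim). The sign and the 1/lam scaling are
    assumptions (a standard quadratic-control-cost convention), not taken
    from the paper.
    """
    residual = v_finetuned - v_pretrained   # how far finetuning moved the field
    target = -value_grad / lam              # assumed optimal residual
    return float(np.mean(np.sum((residual - target) ** 2, axis=1)))


# Toy setup: a hypothetical cost-to-go V(x) = 0.5 * ||x - goal||^2,
# so grad V(x) = x - goal.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 2))
goal = np.array([1.0, -1.0])
lam = 2.0

v_pre = -x                     # placeholder for a pretrained velocity field
grad_v = x - goal              # value gradient at the sampled points
v_opt = v_pre - grad_v / lam   # residual that matches the target exactly

loss_opt = gradient_matching_loss(v_opt, v_pre, grad_v, lam)
loss_pre = gradient_matching_loss(v_pre, v_pre, grad_v, lam)
print(loss_opt, loss_pre)
```

In this toy setup the loss vanishes (up to floating-point error) when the residual equals the scaled negative value gradient, and it is strictly positive for the unmodified pretrained field. In a real setting the value gradient would come from a learned network rather than a closed-form expression.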

Problem

Research questions and friction points this paper is trying to address.

Aligning flow matching models with human preferences efficiently

Preserving the pretrained prior in a probabilistically sound way during adaptation

Enabling fast adaptation under limited computational budgets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages optimal control theory for alignment

Matches velocity difference with value gradient

Enables fast adaptation with heuristic initialization
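One way to see why a velocity difference would be matched to a value gradient is the textbook deterministic control problem below. The symbols (control $u$, pretrained velocity $v_{\mathrm{pre}}$, reward $r$, weight $\lambda$, cost-to-go $V$) and the sign conventions are assumptions for illustration; the paper's exact formulation may differ.

```latex
% Controlled dynamics: pretrained velocity plus a learned correction u
\dot{x}_t = v_{\mathrm{pre}}(x_t, t) + u(x_t, t), \qquad x_0 \sim p_0

% Objective: stay close to the prior (small u) while gaining terminal reward
\min_{u} \; \mathbb{E}\!\left[ \int_0^1 \frac{\lambda}{2}\,\lVert u(x_t, t) \rVert^2 \, dt \;-\; r(x_1) \right]

% Hamilton-Jacobi-Bellman equation for the cost-to-go V(x, t)
\partial_t V + \min_{u} \left\{ \nabla V \cdot \left( v_{\mathrm{pre}} + u \right) + \frac{\lambda}{2}\,\lVert u \rVert^2 \right\} = 0,
\qquad V(x, 1) = -r(x)

% The inner minimization is quadratic in u, which gives the optimal correction
u^{*}(x, t) = -\frac{1}{\lambda}\, \nabla V(x, t)
\quad\Longrightarrow\quad
v_{\mathrm{ft}} - v_{\mathrm{pre}} = -\frac{1}{\lambda}\, \nabla V
```

Under these assumptions, the fine-tuned field $v_{\mathrm{ft}}$ differs from the pretrained one by a scaled value gradient, and the terminal condition $V(x, 1) = -r(x)$ is where first-order reward information enters.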
