🤖 AI Summary
This work addresses the high computational cost of Classifier-Free Guidance (CFG) in flow matching, which typically requires two forward passes per sampling step. The authors propose P-Guide, a single-forward-pass inference method that achieves high-quality conditional generation by modulating only the initial latent state, thereby avoiding explicit velocity field extrapolation during sampling. Under a first-order approximation, the approach is shown to be equivalent to CFG. The authors consider both homoscedastic and heteroscedastic priors and find that jointly modeling the mean and variance enables adaptive loss attenuation and improved robustness to data uncertainty. Experiments show that the framework reduces inference latency by approximately 50% while keeping generation fidelity and prompt alignment competitive with standard dual-pass CFG.
📝 Abstract
Classifier-Free Guidance (CFG) is essential for high-fidelity conditional generation in flow matching, yet it imposes significant computational overhead by requiring dual forward passes at each sampling step. In this work, we address this bottleneck by introducing P-Guide, a framework that achieves high-quality guidance through a single inference pass by modulating only the initial latent state. We further show that, under a first-order approximation, P-Guide is equivalent to CFG in the sense that it steers generation from the prior space, without requiring explicit velocity field extrapolation during sampling. We consider both homoscedastic and heteroscedastic priors, and find that jointly modeling the mean and variance enables adaptive loss attenuation and improved robustness to data uncertainty. Extensive experiments demonstrate that P-Guide reduces inference latency by approximately 50% while maintaining fidelity and prompt alignment competitive with standard dual-pass CFG baselines.
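To make the cost argument concrete, the following is a minimal sketch, not the authors' implementation, of why standard CFG doubles the number of network evaluations in an Euler flow-matching sampler. The toy velocity field, the class `CountingVelocity`, and the functions `sample_cfg` and `sample_single_pass` are hypothetical names introduced here for illustration. How P-Guide actually modulates the initial latent is not described in the abstract, so the single-pass sampler simply accepts whatever starting latent `x0` it is given.

```python
import numpy as np


class CountingVelocity:
    """Toy stand-in for a learned velocity network (assumption, not the paper's model).

    It counts how many forward passes are requested.
    """

    def __init__(self):
        self.calls = 0

    def __call__(self, x, t, cond):
        self.calls += 1
        # Hypothetical toy field: point toward the condition vector,
        # or toward the origin when no condition is given.
        target = np.zeros_like(x) if cond is None else cond
        return target - x


def sample_cfg(model, x0, cond, w, steps):
    """Euler sampling with standard CFG: two forward passes per step."""
    x = x0.copy()
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v_cond = model(x, t, cond)    # conditional pass
        v_uncond = model(x, t, None)  # unconditional pass
        # Standard CFG extrapolation of the velocity field with guidance scale w.
        v = v_uncond + w * (v_cond - v_uncond)
        x = x + dt * v
    return x


def sample_single_pass(model, x0, cond, steps):
    """Euler sampling with one conditional forward pass per step.

    Any guidance would have to be carried by x0 itself; how x0 is
    chosen is outside the scope of this sketch.
    """
    x = x0.copy()
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * model(x, t, cond)
    return x


steps = 10
cond = np.ones(2)
x0 = np.zeros(2)

cfg_model = CountingVelocity()
sample_cfg(cfg_model, x0, cond, w=2.0, steps=steps)

single_model = CountingVelocity()
sample_single_pass(single_model, x0, cond, steps=steps)

print("CFG forward passes:", cfg_model.calls)
print("Single-pass forward passes:", single_model.calls)
```

With the same number of steps, the CFG sampler calls the network twice as often as the single-pass sampler. If the forward pass dominates sampling cost, which is an assumption about the deployment rather than a claim from the source, this is in line with the roughly 50% latency reduction reported above.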