🤖 AI Summary
This work addresses the inefficiency of generative flow matching models, whose sequential denoising process incurs high inference costs, while existing acceleration methods suffer from poor generalization and reliance on retraining. To overcome these limitations, we propose FastFlow, a plug-and-play, adaptive inference framework that, for the first time, introduces a multi-armed bandit mechanism into flow matching to dynamically determine the number of denoising steps to skip. By integrating finite-difference velocity estimation with trajectory extrapolation, FastFlow achieves task-agnostic acceleration without any additional computation or model retraining. Experiments across image generation, video generation, and editing tasks demonstrate that FastFlow delivers over 2.6× inference speedup while preserving generation quality.
📝 Abstract
Flow-matching models deliver state-of-the-art fidelity in image and video generation, but their inherently sequential denoising process makes inference slow. Existing acceleration methods such as distillation, trajectory truncation, and consistency approaches are static, require retraining, and often fail to generalize across tasks. We propose FastFlow, a plug-and-play adaptive inference framework that accelerates generation in flow-matching models. FastFlow identifies denoising steps that make only minor adjustments to the denoising path and approximates them without invoking the full neural network used for velocity prediction. The approximation uses finite-difference velocity estimates from prior predictions to extrapolate future states, advancing along the denoising path without additional model evaluations and thereby skipping computation at intermediate steps. We model the decision of how many steps can safely be skipped before the next full model evaluation as a multi-armed bandit problem; the bandit learns the optimal skip length to balance speed against quality. FastFlow integrates seamlessly with existing pipelines and generalizes across image generation, video generation, and editing tasks. Experiments demonstrate a speedup of over 2.6× while maintaining high-quality outputs. The source code for this work can be found at https://github.com/Div290/FastFlow.
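To make the two ingredients of the abstract concrete, here is a minimal sketch of (a) finite-difference trajectory extrapolation and (b) an epsilon-greedy multi-armed bandit over skip lengths. All names (`extrapolate`, `SkipBandit`), the scalar state, and the reward design are illustrative assumptions for exposition, not FastFlow's actual API or reward function.

```python
import random

def extrapolate(x, v_prev, v_curr, dt_prev, dt, k):
    """Advance k Euler steps along the denoising path without calling the
    model, using a finite-difference estimate of how the velocity drifts.

    x: current state; v_prev, v_curr: last two model velocity predictions;
    dt_prev: time gap between those predictions; dt: step size; k: steps
    to skip. Scalar state for simplicity (real states are tensors)."""
    dv = (v_curr - v_prev) / dt_prev  # finite-difference velocity change
    state, v = x, v_curr
    for _ in range(k):
        state = state + dt * v        # Euler step along the path
        v = v + dt * dv               # linearly extrapolated velocity
    return state

class SkipBandit:
    """Epsilon-greedy bandit: each arm k means 'skip k model calls via
    extrapolation'. Rewards should trade speedup against approximation
    error; the exact reward used by FastFlow may differ."""

    def __init__(self, arms=(0, 1, 2, 3), eps=0.1, seed=0):
        self.arms = arms
        self.eps = eps
        self.counts = {k: 0 for k in arms}
        self.values = {k: 0.0 for k in arms}  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        if self.rng.random() < self.eps:
            return self.rng.choice(self.arms)                 # explore
        return max(self.arms, key=lambda k: self.values[k])   # exploit

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a sampling loop, the bandit would pick a skip length after each full model evaluation, `extrapolate` would advance the state over the skipped steps, and the observed quality/speed trade-off would be fed back via `update`.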