🤖 AI Summary
This work addresses the inefficiency of generative flow matching models, whose sequential denoising process incurs high inference costs, while existing acceleration methods suffer from poor generalization and reliance on retraining. To overcome these limitations, we propose FastFlow, a plug-and-play, adaptive inference framework that, for the first time, introduces a multi-armed bandit mechanism into flow matching to dynamically determine the number of denoising steps to skip. By integrating finite-difference velocity estimation with trajectory extrapolation, FastFlow achieves task-agnostic acceleration without any additional computation or model retraining. Experiments across image generation, video generation, and editing tasks demonstrate that FastFlow delivers over 2.6× inference speedup while preserving generation quality.
📝 Abstract
Flow-matching models deliver state-of-the-art fidelity in image and video generation, but their inherently sequential denoising process makes inference slow. Existing acceleration methods such as distillation, trajectory truncation, and consistency approaches are static, require retraining, and often fail to generalize across tasks. We propose FastFlow, a plug-and-play adaptive inference framework that accelerates generation in flow-matching models. FastFlow identifies denoising steps that make only minor adjustments to the denoising path and approximates them without invoking the full neural network used for velocity prediction. The approximation uses finite-difference velocity estimates from prior predictions to extrapolate future states, advancing along the denoising path without additional model evaluations and thereby skipping computation at intermediate steps. We model the decision of how many steps can safely be skipped before the next full model evaluation as a multi-armed bandit problem; the bandit learns the optimal skip length to balance speed against quality. FastFlow integrates seamlessly with existing pipelines and generalizes across image generation, video generation, and editing tasks. Experiments demonstrate a speedup of over 2.6× while maintaining high-quality outputs. The source code for this work can be found at https://github.com/Div290/FastFlow.
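To make the two ingredients of the abstract concrete, here is a minimal sketch of (a) finite-difference trajectory extrapolation and (b) an epsilon-greedy multi-armed bandit over skip lengths. All names (`extrapolate`, `SkipBandit`), the scalar state, and the reward design are illustrative assumptions for exposition, not FastFlow's actual API or reward function.

```python
import random

def extrapolate(x, v_prev, v_curr, dt_prev, dt, k):
    """Advance k Euler steps along the denoising path without calling the
    model, using a finite-difference estimate of how the velocity drifts.

    x: current state; v_prev, v_curr: last two model velocity predictions;
    dt_prev: time gap between those predictions; dt: step size; k: steps
    to skip. Scalar state for simplicity (real states are tensors)."""
    dv = (v_curr - v_prev) / dt_prev  # finite-difference velocity change
    state, v = x, v_curr
    for _ in range(k):
        state = state + dt * v        # Euler step along the path
        v = v + dt * dv               # linearly extrapolated velocity
    return state

class SkipBandit:
    """Epsilon-greedy bandit: each arm k means 'skip k model calls via
    extrapolation'. Rewards should trade speedup against approximation
    error; the exact reward used by FastFlow may differ."""

    def __init__(self, arms=(0, 1, 2, 3), eps=0.1, seed=0):
        self.arms = arms
        self.eps = eps
        self.counts = {k: 0 for k in arms}
        self.values = {k: 0.0 for k in arms}  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        if self.rng.random() < self.eps:
            return self.rng.choice(self.arms)                 # explore
        return max(self.arms, key=lambda k: self.values[k])   # exploit

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a sampling loop, the bandit would pick a skip length after each full model evaluation, `extrapolate` would advance the state over the skipped steps, and the observed quality/speed trade-off would be fed back via `update`.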