AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing consistency-distilled video diffusion models: their quality degrades as more sampling steps are used at test time, so they cannot support arbitrary-step generation. To overcome this, the authors propose AnyFlow, a framework that optimizes the entire ODE sampling trajectory by extending the distillation objective from endpoint consistency ($z_{t}\rightarrow z_{0}$) to flow-map transition learning ($z_{t}\rightarrow z_{r}$) over arbitrary time intervals. AnyFlow further introduces Flow Map Backward Simulation, which decomposes a full Euler rollout into shortcut flow-map transitions to enable efficient on-policy distillation. The authors describe it as the first video diffusion distillation framework to support arbitrary sampling steps. Across bidirectional and causal architectures at scales from 1.3B to 14B parameters, AnyFlow is reported to match or surpass consistency-based counterparts in few-step generation while continuing to improve as the number of sampling steps increases.

📝 Abstract

Few-step video generation has been significantly advanced by consistency distillation. However, the performance of consistency-distilled models often degrades as more sampling steps are allocated at test time, limiting their effectiveness for any-step video diffusion. This limitation arises because consistency distillation replaces the original probability-flow ODE trajectory with a consistency-sampling trajectory, weakening the desirable test-time scaling behavior of ODE sampling. To address this limitation, we introduce AnyFlow, the first any-step video diffusion distillation framework based on flow maps. Instead of distilling a model for only a few fixed sampling steps, AnyFlow optimizes the full ODE sampling trajectory. To this end, we shift the distillation target from endpoint consistency mapping $(z_{t}\rightarrow z_{0})$ to flow-map transition learning $(z_{t}\rightarrow z_{r})$ over arbitrary time intervals. We further propose Flow Map Backward Simulation, which decomposes a full Euler rollout into shortcut flow-map transitions, enabling efficient on-policy distillation that reduces test-time errors (i.e., discretization error in few-step sampling and exposure bias in causal generation). Extensive experiments across both bidirectional and causal architectures, at scales ranging from 1.3B to 14B parameters, demonstrate that AnyFlow matches or surpasses consistency-based counterparts in the few-step regime, while its performance scales with the sampling-step budget.
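To make the difference between an Euler step and a flow-map transition $(z_{t}\rightarrow z_{r})$ concrete, here is a minimal toy sketch. It is not AnyFlow's implementation: it assumes a hypothetical 1-D linear ODE $dz/dt = A z$ whose exact flow map is known in closed form and stands in for the learned network, and every name in it (`A`, `velocity`, `flow_map`, `time_grid`, `euler_sample`, `flow_map_sample`) is invented for illustration. It shows that composing flow-map transitions over any time grid reaches the same endpoint, while an Euler rollout carries a discretization error that shrinks only as steps are added.

```python
import math

# Hypothetical drift coefficient of the toy ODE dz/dt = A * z (an assumption,
# chosen only so that the exact flow map has a closed form).
A = -1.5


def velocity(z, t):
    """Velocity field of the toy ODE; it stands in for a teacher model."""
    return A * z


def flow_map(z, t, r):
    """Exact flow map of the toy ODE: transports the state at time t to time r.

    In a distilled model this would be a learned network taking (z, t, r).
    """
    return z * math.exp(A * (r - t))


def time_grid(num_steps):
    """Uniform time grid from t = 1 (noise) down to t = 0 (data)."""
    return [1.0 - i / num_steps for i in range(num_steps + 1)]


def euler_sample(z1, num_steps):
    """Integrate the ODE with explicit Euler steps along the grid."""
    z = z1
    grid = time_grid(num_steps)
    for t, r in zip(grid[:-1], grid[1:]):
        z = z + (r - t) * velocity(z, t)
    return z


def flow_map_sample(z1, num_steps):
    """Compose flow-map transitions z_t -> z_r along the same grid."""
    z = z1
    grid = time_grid(num_steps)
    for t, r in zip(grid[:-1], grid[1:]):
        z = flow_map(z, t, r)
    return z


if __name__ == "__main__":
    exact = flow_map(1.0, 1.0, 0.0)  # one-jump solution from t = 1 to t = 0
    for n in (1, 4, 16, 64):
        euler_err = abs(euler_sample(1.0, n) - exact)
        fmap_err = abs(flow_map_sample(1.0, n) - exact)
        print(f"steps={n:3d}  euler_err={euler_err:.4f}  flow_map_err={fmap_err:.2e}")
```

In this toy setting the flow map is exact, so the step count does not change the endpoint. A learned flow map is only approximate, which is where a training objective defined over arbitrary intervals $(t, r)$, as the abstract describes, would come into play.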

Problem

Research questions and friction points this paper is trying to address.

any-step video generation

consistency distillation

video diffusion model

sampling scalability

flow map

Innovation

Methods, ideas, or system contributions that make the work stand out.

any-step video generation

flow map distillation

on-policy distillation