AI Summary
This work addresses the high computational cost of iterative velocity-field evaluation in rectified flow models, where existing caching methods rely on coarse approximations that accumulate error and degrade generation quality over long skip intervals. The authors propose TACache, a training-free framework that, for the first time, orthogonally decomposes velocity acceleration along the trajectory to separate magnitude-based and direction-based error sources. TACache follows a "skip-then-compensate" mechanism: offline, cumulative variation thresholds determine the skip schedule; online, the sample's historical directional information is used to reconstruct the velocity field at skipped steps without additional model evaluations. Experiments on BAGEL, FLUX.1-dev, and Wan2.1-1.3B show that TACache achieves up to 4.14× acceleration in image generation and 2.11× in video generation, consistently outperforming prior caching approaches on all reference-based fidelity metrics.
Abstract
Diffusion and rectified flow (RF) models generate high-fidelity images and videos, but their iterative velocity-field evaluations are computationally expensive. Existing caching methods accelerate sampling by skipping timesteps, yet their coarse approximations introduce accumulated errors over long skip intervals and degrade quality under aggressive acceleration. We propose TACache (Trajectory-Aware Cache), a training-free acceleration framework following a skip-then-compensate paradigm. TACache performs an orthogonal decomposition of discrete velocity acceleration along the RF trajectory into a parallel component and an orthogonal residual, isolating the magnitude and directional sources of per-step approximation error. The framework operates in two stages: offline, cumulative variation thresholds on the magnitude and direction indicators yield the skip schedule and bound how far each skip interval may extend; online, at each skipped step the offline statistics are combined with the sample's historical orthogonal direction to reconstruct the skipped velocity without additional model evaluations. Experiments on BAGEL, FLUX.1-dev, and Wan2.1-1.3B show that TACache achieves up to 4.14× speedup on text-to-image generation and 2.11× speedup on text-to-video generation, with consistent improvements over prior cache-based methods on all reference-based fidelity metrics. Code will be released soon.
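The decomposition and the threshold-based schedule described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which has not been released. It assumes that the discrete acceleration is the difference of two consecutive velocities, that "parallel" means parallel to the previous velocity, and that the schedule forces a full model evaluation whenever an accumulated per-step indicator exceeds a threshold. The function names `decompose_acceleration` and `skip_schedule`, along with their parameters, are hypothetical.

```python
import numpy as np


def decompose_acceleration(v_prev, v_curr, eps=1e-12):
    """Split the discrete acceleration a = v_curr - v_prev into a component
    parallel to v_prev and an orthogonal residual.

    Assumption: the previous velocity is the reference direction; the
    abstract does not state which vector defines "parallel".
    """
    v_prev = np.asarray(v_prev, dtype=np.float64).ravel()
    v_curr = np.asarray(v_curr, dtype=np.float64).ravel()
    a = v_curr - v_prev
    # Scalar projection coefficient of a onto v_prev (eps avoids division by zero).
    coeff = np.dot(a, v_prev) / (np.dot(v_prev, v_prev) + eps)
    a_par = coeff * v_prev   # changes the magnitude of the velocity
    a_orth = a - a_par       # changes the direction of the velocity
    return a_par, a_orth


def skip_schedule(indicators, threshold):
    """Return the step indices that receive a full model evaluation.

    Assumption: a per-step indicator (for example, an averaged norm of a_par
    or a_orth collected offline) is accumulated over skipped steps, and a
    full evaluation is scheduled once the running sum exceeds the threshold.
    Step 0 is always evaluated.
    """
    full_steps = [0]
    acc = 0.0
    for i in range(1, len(indicators)):
        acc += indicators[i]
        if acc > threshold:
            full_steps.append(i)
            acc = 0.0  # a fresh evaluation resets the accumulated variation
    return full_steps
```

For example, `decompose_acceleration([1.0, 0.0], [2.0, 1.0])` splits the acceleration `[1, 1]` into a parallel part close to `[1, 0]` and an orthogonal part close to `[0, 1]`. Keeping the two parts separate lets a magnitude indicator and a direction indicator be thresholded independently, which is the role the abstract assigns to the offline stage.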