PreciseCache: Precise Feature Caching for Efficient and High-fidelity Video Generation

📅 2026-03-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of deploying video generation models, which are hindered by high computational costs and slow inference speeds. Existing caching approaches often degrade generation quality because they cannot accurately identify which features are redundant. To overcome this limitation, the authors propose PreciseCache, a plug-and-play framework with two components: LFCache, which uses a Low-Frequency Difference (LFD) metric to detect step-level redundancy, and BlockCache, which detects block-level redundancy within the network. By skipping only truly redundant computations, PreciseCache preserves generation fidelity while accelerating inference. Experiments across multiple backbone architectures report an average speedup of 2.6× without noticeable quality loss.
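The summary does not say how BlockCache decides that a block is redundant. As a minimal sketch of block-level caching in general, the example below assumes a hypothetical criterion: a block is skipped when its input has changed little (in relative L2 norm) since the last time it was computed, in which case the cached residual (output minus input) is reused. The class name `BlockCacheSketch`, the threshold value, and the residual-reuse rule are all illustrative assumptions, not the authors' method.

```python
import numpy as np


class BlockCacheSketch:
    """Illustrative block-level cache (hypothetical; not the paper's BlockCache).

    Assumption: a block is treated as redundant when its input is close to the
    input seen at its last full evaluation. The cached residual is then reused.
    """

    def __init__(self, blocks, threshold=0.05):
        self.blocks = blocks
        self.threshold = threshold
        self.cached_inputs = [None] * len(blocks)
        self.cached_residuals = [None] * len(blocks)
        self.computed = 0  # number of block evaluations actually run

    def __call__(self, x):
        for i, block in enumerate(self.blocks):
            prev = self.cached_inputs[i]
            if prev is not None:
                # Relative change of this block's input since its last evaluation.
                change = np.linalg.norm(x - prev) / (np.linalg.norm(prev) + 1e-8)
                if change < self.threshold:
                    # Skip the block and reuse its cached residual.
                    x = x + self.cached_residuals[i]
                    continue
            out = block(x)
            self.computed += 1
            self.cached_inputs[i] = x
            self.cached_residuals[i] = out - x
            x = out
        return x


# Toy "network" of two blocks standing in for transformer blocks.
blocks = [lambda h: h * 1.1, lambda h: h + 1.0]
runner = BlockCacheSketch(blocks)
x = np.ones(4)
y_first = runner(x)           # both blocks computed
y_near = runner(x * 1.001)    # inputs barely changed: both blocks skipped
y_far = runner(x * 2.0)       # inputs changed a lot: both blocks recomputed
```

In this toy run the second call performs no block evaluations, so its output is an approximation built from cached residuals; the threshold controls how much approximation error is tolerated in exchange for the skipped work.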

📝 Abstract

High computational costs and slow inference hinder the practical application of video generation models. While prior works accelerate the generation process through feature caching, they often suffer from notable quality degradation. In this work, we reveal that this issue arises from their inability to distinguish truly redundant features, which leads to the unintended skipping of computations on important features. To address this, we propose PreciseCache, a plug-and-play framework that precisely detects and skips truly redundant computations, thereby accelerating inference without sacrificing quality. Specifically, PreciseCache contains two components: LFCache for step-wise caching and BlockCache for block-wise caching. For LFCache, we compute the Low-Frequency Difference (LFD) between the prediction features of the current step and those from the previous cached step. Empirically, we observe that LFD serves as an effective measure of step-wise redundancy, accurately detecting highly redundant steps whose computation can be skipped by reusing cached features. To further accelerate generation within each non-skipped step, we propose BlockCache, which precisely detects and skips redundant computations at the block level within the network. Extensive experiments on various backbones demonstrate the effectiveness of PreciseCache, which achieves an average speedup of 2.6× without noticeable quality loss. Source code will be released.
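The abstract names the Low-Frequency Difference but does not give its formula. The sketch below shows one plausible way such a metric could be computed, under stated assumptions: features are low-pass filtered with a centered box mask in the 2D Fourier domain, and the difference is the relative L2 distance between the filtered current and cached features. The function names, the `keep_ratio` parameter, and the skip threshold are hypothetical choices for illustration, not the authors' definition.

```python
import numpy as np


def low_pass(x, keep_ratio=0.25):
    """Keep only low spatial frequencies over the last two axes (assumed filter)."""
    spec = np.fft.fftshift(np.fft.fft2(x, axes=(-2, -1)), axes=(-2, -1))
    h, w = x.shape[-2:]
    kh = max(1, int(h * keep_ratio / 2))
    kw = max(1, int(w * keep_ratio / 2))
    ch, cw = h // 2, w // 2  # position of the zero frequency after fftshift
    mask = np.zeros((h, w), dtype=bool)
    mask[ch - kh: ch + kh + 1, cw - kw: cw + kw + 1] = True
    spec = np.where(mask, spec, 0)
    return np.fft.ifft2(np.fft.ifftshift(spec, axes=(-2, -1)), axes=(-2, -1)).real


def low_frequency_difference(current, cached, keep_ratio=0.25, eps=1e-8):
    """Relative L2 distance between low-pass versions of two feature maps."""
    lo_cur = low_pass(current, keep_ratio)
    lo_old = low_pass(cached, keep_ratio)
    return float(np.linalg.norm(lo_cur - lo_old) / (np.linalg.norm(lo_old) + eps))


def should_skip_step(current, cached, threshold=0.05):
    """Hypothetical rule: treat the step as redundant when the LFD is small."""
    return low_frequency_difference(current, cached) < threshold


# Toy features with shape (frames, height, width).
cached = np.ones((4, 16, 16))
rows, cols = np.indices((16, 16))
checker = np.where((rows + cols) % 2 == 0, 1.0, -1.0)  # purely high-frequency pattern

only_high_freq_change = cached + 0.5 * checker  # low-frequency content unchanged
low_freq_change = 2.0 * cached                  # low-frequency content doubled

skip_a = should_skip_step(only_high_freq_change, cached)
skip_b = should_skip_step(low_freq_change, cached)
```

Here the first comparison is flagged as skippable because the two inputs differ only in high-frequency detail, while the second is not because the low-frequency content itself has changed. This illustrates the idea of measuring redundancy on low frequencies only; how the actual method obtains the features to compare is not described in the abstract.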

Problem

Research questions and friction points this paper is trying to address.

video generation

feature caching

inference acceleration

quality degradation

redundant computation

Innovation

Methods, ideas, or system contributions that make the work stand out.

PreciseCache

feature caching

video generation acceleration