Training-free Latent Inter-Frame Pruning with Attention Recovery

📅 2026-03-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing video generation models struggle to meet real-time requirements due to high computational latency. This work proposes a training-free latent inter-frame pruning method that accelerates inference by detecting duplicated latent patches across frames and skipping their recomputation, while an Attention Recovery mechanism approximates the attention values of the pruned tokens to suppress the visual artifacts that naive pruning introduces. The approach raises video editing throughput to an average of 12.2 frames per second (FPS) on an NVIDIA A6000, 1.45× the baseline's 8.4 FPS, without compromising generation quality. Because it requires no retraining, it can be integrated into an existing model as is.
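
As a quick check on the reported figures, and assuming that average per-frame time is simply the reciprocal of throughput (an assumption made here, which batching or pipelining would break), the numbers work out as:

$$
\text{speedup} = \frac{12.2\ \text{FPS}}{8.4\ \text{FPS}} \approx 1.45, \qquad
t_{\text{base}} = \frac{1}{8.4\ \text{FPS}} \approx 119\ \text{ms}, \qquad
t_{\text{pruned}} = \frac{1}{12.2\ \text{FPS}} \approx 82\ \text{ms}
$$

Under that assumption, the average time per frame drops by roughly 31%.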

📝 Abstract
Current video generation models suffer from high computational latency, making real-time applications prohibitively costly. In this paper, we address this limitation by exploiting the temporal redundancy inherent in video latent patches. To this end, we propose the Latent Inter-frame Pruning with Attention Recovery (LIPAR) framework, which detects duplicated latent patches and skips recomputing them. Additionally, we introduce a novel Attention Recovery mechanism that approximates the attention values of pruned tokens, thereby removing the visual artifacts that arise when the pruning method is applied naively. Empirically, our method increases video editing throughput by $1.45\times$ on average, achieving 12.2 FPS on an NVIDIA A6000 compared to the baseline's 8.4 FPS. The proposed method does not compromise generation quality and can be seamlessly integrated with the model without additional training. Our approach effectively bridges the gap between traditional compression algorithms and modern generative pipelines.
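
The pruning idea described above can be illustrated with a minimal sketch. This is not the LIPAR implementation: the duplicate test (mean absolute difference against a threshold `tol`), the function names `prune_mask` and `step`, and the per-patch stand-in `expensive_fn` are all assumptions made for illustration. The Attention Recovery step is left out because the abstract does not specify how the approximation is computed.

```python
import numpy as np

def prune_mask(prev_latent, curr_latent, tol=1e-3):
    """Return a boolean mask that is True where a patch must be recomputed.

    prev_latent, curr_latent: arrays of shape (num_patches, channels).
    tol: hypothetical threshold on the per-patch mean absolute difference.
    """
    diff = np.abs(curr_latent - prev_latent).mean(axis=1)
    return diff > tol

def step(prev_latent, prev_output, curr_latent, expensive_fn, tol=1e-3):
    """Recompute only changed patches; reuse cached outputs for the rest."""
    keep = prune_mask(prev_latent, curr_latent, tol)
    out = prev_output.copy()  # cached results for duplicated patches
    if keep.any():
        # expensive_fn stands in for the model's per-patch computation
        out[keep] = expensive_fn(curr_latent[keep])
    return out, keep

# Usage: only patch 2 changes between frames, so only it is recomputed.
prev = np.zeros((4, 3))
curr = prev.copy()
curr[2] += 1.0
out, keep = step(prev, np.zeros((4, 3)), curr, lambda x: x * 2.0)
```

In this sketch `expensive_fn` treats patches independently. In a real attention layer the tokens interact, so dropping tokens changes the outputs of the remaining ones. That is the source of the artifacts the abstract attributes to naive pruning, and the gap that Attention Recovery is meant to close.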
Problem

Research questions and friction points this paper is trying to address.

video generation
computational latency
temporal redundancy
real-time applications
latent patches
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent Inter-frame Pruning
Attention Recovery
Training-free Acceleration
Temporal Redundancy
Video Generation