🤖 AI Summary
Diffusion-based video generation with DiT architectures is slow at inference, largely because attention is recomputed at every diffusion step despite substantial redundancy, and most existing acceleration methods require additional training. Method: This paper proposes Pyramid Attention Broadcast (PAB), a training-free method built on the observation that differences between attention outputs at adjacent diffusion steps follow a U-shaped pattern, with a long stable stretch in the middle of the process. PAB exploits this by broadcasting (reusing) cached attention outputs across subsequent steps in a pyramid style, assigning each attention type a broadcast range according to the variance of its outputs, and further introduces broadcast sequence parallelism for efficient distributed inference. Contribution/Results: PAB achieves up to 10.5× speedup across three DiT-based video models without any training, enabling real-time generation for videos up to 720p while preserving generation quality.
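As a rough illustration of the broadcast mechanism, here is a minimal sketch (assumed names such as `BroadcastAttention` and `broadcast_range`, not the authors' code): an attention module is wrapped so that it recomputes its output only every few steps and returns a cached copy at the steps in between.

```python
# Minimal sketch of training-free attention broadcast: compute attention at
# selected diffusion steps, cache the output, and reuse (broadcast) it for the
# next `broadcast_range` steps instead of recomputing. Names are illustrative.
import torch
import torch.nn as nn

class BroadcastAttention(nn.Module):
    def __init__(self, attn: nn.Module, broadcast_range: int):
        super().__init__()
        self.attn = attn                        # the wrapped attention module
        self.broadcast_range = broadcast_range  # steps to reuse a cached output
        self.cache = None
        self.steps_since_compute = 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Recompute when there is no cache yet or it has been reused enough.
        if self.cache is None or self.steps_since_compute >= self.broadcast_range:
            self.cache = self.attn(x)
            self.steps_since_compute = 0
        else:
            # Broadcast: skip the attention computation entirely this step.
            self.steps_since_compute += 1
        return self.cache
```

In practice PAB only skips computation during the stable middle portion of the diffusion process suggested by the U-shaped pattern; the ends, where attention changes quickly, are always computed.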
📝 Abstract
We present Pyramid Attention Broadcast (PAB), a real-time, high-quality, and training-free approach for DiT-based video generation. Our method is founded on the observation that attention differences between adjacent diffusion steps exhibit a U-shaped pattern, indicating significant redundancy. We mitigate this by broadcasting attention outputs to subsequent steps in a pyramid style, applying a different broadcast strategy to each attention type according to the variance of its outputs for the best efficiency. We further introduce broadcast sequence parallelism for more efficient distributed inference. PAB demonstrates up to a 10.5× speedup across three models compared to baselines, achieving real-time generation for videos up to 720p. We anticipate that our simple yet effective method will serve as a robust baseline and facilitate future research and applications in video generation.
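To make the pyramid-style, per-attention scheduling concrete, the following sketch shows one plausible form it could take; the broadcast ranges and the `[skip_start, skip_end)` window are placeholder values for illustration, not the paper's tuned settings.

```python
# Hedged sketch of pyramid-style scheduling: each attention type gets its own
# broadcast range based on how stable its outputs are assumed to be, and
# broadcasting is only applied inside the stable middle window of steps.
BROADCAST_RANGES = {
    "spatial": 2,   # assumed to vary most across steps -> refresh most often
    "temporal": 4,
    "cross": 6,     # assumed most stable -> reuse its output the longest
}

def should_compute(attn_type: str, step: int,
                   skip_start: int = 10, skip_end: int = 45) -> bool:
    """Decide whether to recompute this attention at this diffusion step."""
    if not (skip_start <= step < skip_end):
        return True  # ends of the U-shaped curve: always compute
    # Inside the stable window, recompute only every range-th step and
    # broadcast the cached output otherwise.
    return (step - skip_start) % BROADCAST_RANGES[attn_type] == 0
```

The pyramid comes from ordering the ranges by stability: the intuition is that attention types whose outputs vary least across steps (here assumed to be cross-attention) can tolerate the widest broadcast range, while the most variable type is refreshed most often.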