🤖 AI Summary
Existing diffusion-based video super-resolution (VSR) methods suffer from high latency and poor streamability because they rely on future frames and expensive multi-step denoising. To address this, we propose the first causal, streamable diffusion VSR framework. Our approach introduces three key components: (1) a causally constrained diffusion model that enables online inference using only past frames; (2) a four-step distilled denoiser integrated with an Auto-regressive Temporal Guidance (ARTG) module that enhances temporal consistency; and (3) a lightweight temporal-aware decoder incorporating a Temporal Processor Module (TPM), balancing efficiency and reconstruction quality. On an RTX 4090 GPU, our method processes a single 720p frame in just 0.328 seconds, cutting the initial latency of diffusion-based VSR from over 4600 seconds to 0.328 seconds, and achieves an over 130× latency reduction relative to the state-of-the-art online VSR method TMP, alongside a 0.095 improvement in LPIPS.
📝 Abstract
Diffusion-based video super-resolution (VSR) methods achieve strong perceptual quality but remain impractical for latency-sensitive settings due to their reliance on future frames and expensive multi-step denoising. We propose Stream-DiffVSR, a causally conditioned diffusion framework for efficient online VSR. Operating strictly on past frames, it combines a four-step distilled denoiser for fast inference, an Auto-regressive Temporal Guidance (ARTG) module that injects motion-aligned cues during latent denoising, and a lightweight temporal-aware decoder with a Temporal Processor Module (TPM) that enhances detail and temporal coherence. Stream-DiffVSR processes a 720p frame in 0.328 seconds on an RTX 4090 GPU and significantly outperforms prior diffusion-based methods. Compared with the online state-of-the-art TMP, it improves perceptual quality (LPIPS by 0.095) while reducing latency by over 130×. Stream-DiffVSR achieves the lowest latency reported for diffusion-based VSR, reducing initial delay from over 4600 seconds to 0.328 seconds, thereby making it the first diffusion VSR method suitable for low-latency online deployment. Project page: https://jamichss.github.io/stream-diffvsr-project-page/
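The causal streaming behavior described above can be sketched as a control-flow skeleton: each frame is processed in arrival order, the denoiser runs a fixed small number of steps, and the only temporal context is state carried forward from past frames. All function names below (`encode`, `artg`, `denoise_step`, `decode_tpm`) are hypothetical placeholders standing in for the paper's neural modules, not the authors' actual API; the numeric bodies exist only to make the sketch runnable.

```python
def encode(frame):
    # Placeholder encoder: map a low-res frame to a latent.
    return frame * 0.5

def artg(latent, prev_latent):
    # Placeholder ARTG: derive a motion-aligned cue from the previous
    # frame's latent (None for the first frame, i.e. no future frames used).
    return 0.0 if prev_latent is None else 0.1 * prev_latent

def denoise_step(latent, guidance):
    # Placeholder for one step of the four-step distilled denoiser.
    return latent + guidance

def decode_tpm(latent, prev_output):
    # Placeholder temporal-aware decoder with TPM: fuse the previous
    # output for temporal coherence.
    return latent if prev_output is None else 0.9 * latent + 0.1 * prev_output

def stream_vsr(frames, num_steps=4):
    """Process frames strictly in order, conditioning only on past state."""
    outputs, prev_latent = [], None
    for frame in frames:
        latent = encode(frame)
        for _ in range(num_steps):
            latent = denoise_step(latent, artg(latent, prev_latent))
        prev_latent = latent  # auto-regressive state: past frames only
        outputs.append(decode_tpm(latent, outputs[-1] if outputs else None))
    return outputs
```

Because no future frame is ever read, each output is available as soon as its input frame arrives, which is what makes the pipeline streamable: changing a later frame cannot alter any earlier output.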