TDM: Temporally-Consistent Diffusion Model for All-in-One Real-World Video Restoration

📅 2025-01-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing video restoration methods require task-specific models for distinct degradations (e.g., blur, noise, compression artifacts) and struggle to jointly preserve content fidelity and temporal consistency. This paper introduces the first diffusion-based, all-in-one framework for real-world video restoration, capable of jointly addressing multiple degradation types within a single model. The method uses a fine-tuned ControlNet to drive a pre-trained Stable Diffusion backbone, eliminating the need for task-specific architectures. Key innovations include: (1) a Task Prompt Guidance (TPG) training strategy enabling controllable, degradation-aware restoration; and (2) DDIM inversion coupled with Sliding-Window Cross-Frame Attention (SW-CFA) to jointly optimize spatial detail and temporal coherence. Evaluated across five real-world video restoration tasks, the approach consistently surpasses state-of-the-art methods, demonstrating strong generalization, high robustness to diverse degradations, and superior inter-frame consistency, establishing a unified, high-quality video restoration pipeline.

📝 Abstract
In this paper, we propose the first diffusion-based all-in-one video restoration method that utilizes the power of a pre-trained Stable Diffusion and a fine-tuned ControlNet. Our method can restore various types of video degradation with a single unified model, overcoming the limitation of standard methods that require specific models for each restoration task. Our contributions include an efficient training strategy with Task Prompt Guidance (TPG) for diverse restoration tasks, an inference strategy that combines Denoising Diffusion Implicit Models (DDIM) inversion with a novel Sliding Window Cross-Frame Attention (SW-CFA) mechanism for enhanced content preservation and temporal consistency, and a scalable pipeline that makes our method all-in-one to adapt to different video restoration tasks. Through extensive experiments on five video restoration tasks, we demonstrate the superiority of our method in generalization capability to real-world videos and temporal consistency preservation over existing state-of-the-art methods. Our method advances the video restoration task by providing a unified solution that enhances video quality across multiple applications.
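The abstract does not spell out the SW-CFA mechanism, so the following is an illustrative sketch, not the authors' implementation: the core idea of sliding-window cross-frame attention is that each frame's queries attend to keys and values pooled from a small temporal window of neighboring frames, which ties adjacent frames together and promotes temporal consistency. All names (`sw_cfa`, the `window` parameter, array shapes) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sw_cfa(q, k, v, window=1):
    """Sliding-Window Cross-Frame Attention (illustrative sketch).

    q, k, v: arrays of shape (T, N, d) -- T frames, N tokens per frame,
    d channels. Frame t attends to keys/values gathered from frames in
    [t - window, t + window], so information flows across nearby frames.
    With window=0 this reduces to ordinary per-frame self-attention.
    """
    T, N, d = q.shape
    out = np.empty_like(q)
    for t in range(T):
        lo, hi = max(0, t - window), min(T, t + window + 1)
        k_win = k[lo:hi].reshape(-1, d)        # (W*N, d): pooled keys
        v_win = v[lo:hi].reshape(-1, d)        # (W*N, d): pooled values
        scores = q[t] @ k_win.T / np.sqrt(d)   # (N, W*N) attention logits
        out[t] = softmax(scores) @ v_win       # convex combo of window values
    return out
```

Because the window slides with `t`, the cost grows linearly in the number of frames rather than quadratically, which is what makes cross-frame attention practical over long clips.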
Problem

Research questions and friction points this paper is trying to address.

Video Restoration
Consistency
Real-time Processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

TDM Supermodel
Video Restoration
ControlNet Enhancement