Zero-Shot Video Restoration and Enhancement with Assistance of Video Diffusion Models

📅 2026-01-29
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing zero-shot video restoration methods, which rely on image diffusion models, often suffer from severe temporal flickering. This work proposes the first general framework that leverages video diffusion models to enhance image-based approaches without requiring additional training. By integrating multi-level fusion of homologous and heterogeneous latent representations, a COT (Consistency-Oriented Trade-off) fusion ratio strategy, and a temporal-aware post-processing module, the method significantly improves temporal consistency. It is compatible with a wide range of diffusion-based image restoration techniques and consistently outperforms current state-of-the-art methods across multiple zero-shot video enhancement tasks, demonstrating both strong generality and effectiveness.

📝 Abstract
Although diffusion-based zero-shot image restoration and enhancement methods have achieved great success, applying them to video restoration or enhancement leads to severe temporal flickering. In this paper, we propose the first framework that utilizes the rapidly developing video diffusion model to assist the image-based method in maintaining more temporal consistency for zero-shot video restoration and enhancement. We propose homologous latents fusion, heterogeneous latents fusion, and a COT-based fusion ratio strategy to utilize both homologous and heterogeneous text-to-video diffusion models to complement the image method. Moreover, we propose temporal-strengthening post-processing to utilize the image-to-video diffusion model to further improve temporal consistency. Our method is training-free and can be applied to any diffusion-based image restoration and enhancement method. Experimental results demonstrate the superiority of the proposed method.
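The fusion idea described above can be illustrated with a minimal sketch: at each denoising step, the latent from the image-based restoration model is blended with the latent from a video diffusion model, with the blend weight following a consistency-oriented schedule. All names here (`fuse_latents`, `cot_ratio`) and the linear schedule are illustrative assumptions, not the authors' actual API or their COT strategy.

```python
# Hedged sketch of per-step latent fusion between an image diffusion
# model and a video diffusion model. Latents are modeled as nested
# lists of floats for simplicity; a real implementation would use
# framework tensors.

def fuse_latents(image_latent, video_latent, ratio):
    """Elementwise blend of two latents.

    ratio      -- weight on the video-diffusion latent
    1 - ratio  -- weight on the image-diffusion latent
    """
    return [
        [(1 - ratio) * a + ratio * b for a, b in zip(row_i, row_v)]
        for row_i, row_v in zip(image_latent, video_latent)
    ]

def cot_ratio(step, total_steps, max_ratio=0.6):
    """Toy stand-in for a consistency-oriented trade-off schedule:
    lean on the video model early (global temporal structure) and
    fade toward the image model late (per-frame fidelity)."""
    return max_ratio * (1 - step / total_steps)

# Usage: fuse 2x2 latents at the first of 10 steps (ratio = 0.6).
img = [[1.0, 1.0], [1.0, 1.0]]
vid = [[0.0, 0.0], [0.0, 0.0]]
fused = fuse_latents(img, vid, cot_ratio(0, 10))
print(fused)  # [[0.4, 0.4], [0.4, 0.4]]
```

The design choice sketched here, blending latents rather than pixels, is what lets the approach stay training-free: both models denoise in the same latent space, so a weighted combination remains a valid intermediate latent.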
Problem

Research questions and friction points this paper is trying to address.

zero-shot video restoration
temporal flickering
video enhancement
temporal consistency
diffusion models
Innovation

Methods, ideas, or system contributions that make the work stand out.

video diffusion model
zero-shot video restoration
temporal consistency
latents fusion
training-free framework
👥 Authors
Cong Cao — School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Huanjing Yue — School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Shangbin Xie — School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Xin Liu — Associate Professor, Lappeenranta University of Technology (Artificial Intelligence, Affective Computing, Social Signal Processing, Emotion AI, rPPG)
Jingyu Yang — School of Electrical and Information Engineering, Tianjin University, Tianjin, China