🤖 AI Summary
Problem: Existing diffusion-based methods for spatiotemporal inverse problems, such as video super-resolution and deblurring, are hindered by the high training cost and poor generalization of dedicated video diffusion models.
Method: We propose a novel fine-tuning-free paradigm that leverages only a pre-trained image diffusion model. Specifically, we map the video’s temporal dimension to the batch dimension and introduce a noise-synchronized batch-consistency sampling strategy to preserve inter-frame temporal coherence. Building upon this, we design an iterative reverse-diffusion framework integrating a Decomposed Diffusion Sampler (DDS), spatiotemporal batch optimization, and noise synchronization constraints.
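The noise-synchronization idea can be sketched as follows: during stochastic reverse sampling, a single noise tensor is drawn and shared across the frame (batch) axis so that all frames receive the same perturbation. This is a minimal illustrative sketch, not the paper's code; `batch_consistent_noise` and `ddim_step` are hypothetical names, and the update is a standard simplified DDIM step.

```python
import numpy as np

def batch_consistent_noise(num_frames, frame_shape, rng):
    """Draw ONE noise sample and broadcast it across the frame (batch)
    axis, so every frame receives an identical stochastic perturbation."""
    z = rng.standard_normal(frame_shape)  # single shared noise sample
    return np.broadcast_to(z, (num_frames, *frame_shape)).copy()

def ddim_step(x_t, x0_hat, alpha_t, alpha_prev, eta, shared_noise):
    """One simplified DDIM-style reverse step applied to the whole
    spatiotemporal batch, using the shared (synchronized) noise."""
    eps = (x_t - np.sqrt(alpha_t) * x0_hat) / np.sqrt(1.0 - alpha_t)
    sigma = (eta
             * np.sqrt((1.0 - alpha_prev) / (1.0 - alpha_t))
             * np.sqrt(1.0 - alpha_t / alpha_prev))
    dir_coeff = np.sqrt(np.maximum(1.0 - alpha_prev - sigma**2, 0.0))
    return np.sqrt(alpha_prev) * x0_hat + dir_coeff * eps + sigma * shared_noise
```

Because the stochastic term `sigma * shared_noise` is identical for every frame, frame-to-frame differences come only from the content itself, which is what encourages temporal coherence.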
Contribution/Results: Our approach eliminates the need for costly video diffusion model training while achieving state-of-the-art performance across multiple video inverse tasks. It significantly improves both reconstruction fidelity and spatiotemporal consistency, demonstrating strong generalization without task-specific adaptation.
📝 Abstract
Recently, diffusion model-based inverse problem solvers (DIS) have emerged as state-of-the-art approaches for addressing inverse problems such as image super-resolution, deblurring, and inpainting. However, their application to video inverse problems arising from spatio-temporal degradation remains largely unexplored due to the challenges in training video diffusion models. To address this issue, here we introduce an innovative video inverse solver that leverages only image diffusion models. Specifically, drawing inspiration from the success of the recent decomposed diffusion sampler (DDS), our method treats the time dimension of a video as the batch dimension of image diffusion models and solves spatio-temporal optimization problems within denoised spatio-temporal batches derived from each image diffusion model. Moreover, we introduce a batch-consistent diffusion sampling strategy that encourages consistency across batches by synchronizing the stochastic noise components in image diffusion models. Our approach synergistically combines batch-consistent sampling with simultaneous optimization of denoised spatio-temporal batches at each reverse diffusion step, resulting in a novel and efficient diffusion sampling strategy for video inverse problems. Experimental results demonstrate that our method effectively addresses various spatio-temporal degradations in video inverse problems, achieving state-of-the-art reconstructions. Project page: https://svi-diffusion.github.io/
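The spatio-temporal optimization over the denoised batch can be sketched as a few conjugate-gradient steps toward the data-consistency objective min_x ||A(x) - y||², initialized at the denoised batch, in the spirit of DDS. This is an illustrative sketch only; `dds_data_consistency` is a hypothetical name, and `A`/`AT` stand in for a generic degradation operator and its adjoint, which the paper instantiates per task.

```python
import numpy as np

def dds_data_consistency(x0_hat, A, AT, y, n_cg=5):
    """A few conjugate-gradient steps on the normal equations
    A^T A x = A^T y, initialized at the denoised spatiotemporal
    batch x0_hat (simplified DDS-style data-consistency update)."""
    x = x0_hat.copy()
    r = AT(y) - AT(A(x))      # residual of the normal equations
    p = r.copy()
    rs = np.vdot(r, r)
    for _ in range(n_cg):
        Ap = AT(A(p))
        alpha = rs / np.vdot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r)
        if np.sqrt(rs_new) < 1e-10:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

For example, with `A` as a temporal 2x downsampling (averaging adjacent frames) and `AT` its adjoint, each call pulls the whole frame batch jointly toward the measurements while the diffusion prior supplies the initialization.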