SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model

📅 2025-03-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the joint restoration of multiple degraded images of the same scene—e.g., suffering from motion blur and low resolution—by proposing the first multi-view diffusion model tailored for sparse image collections. Methodologically, it introduces multi-view geometric consistency into the diffusion framework for the first time, incorporating a cross-image feature alignment module and a 3D consistency regularization loss, while leveraging implicit scene priors to enable geometry-aware joint deblurring and super-resolution. Compared to single-image or video-based approaches, our model achieves superior reconstruction quality and inter-view consistency, attaining state-of-the-art performance across multiple benchmarks. Moreover, the generated outputs effectively support downstream 3D reconstruction and camera pose estimation, demonstrating the efficacy and generalization capability of multi-view diffusion modeling for ill-posed inverse problems.

Technology Category

Application Category

📝 Abstract
The computer vision community has developed numerous techniques for digitally restoring true scene information from single-view degraded photographs, an important yet extremely ill-posed task. In this work, we tackle image restoration from a different perspective by jointly denoising multiple photographs of the same scene. Our core hypothesis is that degraded images capturing a shared scene contain complementary information that, when combined, better constrains the restoration problem. To this end, we implement a powerful multi-view diffusion model that jointly generates uncorrupted views by extracting rich information from multi-view relationships. Our experiments show that our multi-view approach outperforms existing single-view image and even video-based methods on image deblurring and super-resolution tasks. Critically, our model is trained to output 3D consistent images, making it a promising tool for applications requiring robust multi-view integration, such as 3D reconstruction or pose estimation.
Problem

Research questions and friction points this paper is trying to address.

Restores true scene information from degraded photographs
Uses multi-view images for better image restoration
Outputs 3D consistent images for robust applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-view diffusion model for joint denoising
Extracts complementary information from degraded images
Outputs 3D consistent images for robust integration
🔎 Similar Papers
No similar papers found.