Efficient Zero-Shot Inpainting with Decoupled Diffusion Guidance

📅 2025-12-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing zero-shot image inpainting methods rely on surrogate likelihood functions that require vector-Jacobian products through the denoiser at each backward sampling step, leading to high memory consumption and slow inference. This paper introduces a decoupled likelihood proxy, deriving for the first time an analytically tractable Gaussian posterior transition distribution and thereby eliminating backpropagation and vector-Jacobian multiplication entirely. The method leverages a pre-trained diffusion model and integrates three key components: decoupled diffusion guidance, explicit Gaussian posterior sampling, and backpropagation-free score estimation. Without any fine-tuning, it achieves substantial improvements: up to 3.2× faster inference and up to 68% lower GPU memory usage. Crucially, it maintains strong observation consistency and high-fidelity visual reconstruction, matching the performance of supervised fine-tuning baselines.
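To see where the vector-Jacobian product enters, consider standard DPS-style guidance in generic DDPM notation (this is illustrative notation, not necessarily the paper's exact derivation). For a linear observation model $y = A x_0 + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma_y^2 I)$:

```latex
% DPS-style surrogate: the score of the likelihood proxy requires a
% vector--Jacobian product through the denoiser-based estimate
% \hat{x}_0(x_t) \approx \mathbb{E}[x_0 \mid x_t]:
\nabla_{x_t} \log p(y \mid x_t)
  \;\approx\; \nabla_{x_t} \log \mathcal{N}\!\big(y;\, A\,\hat{x}_0(x_t),\, \sigma_y^2 I\big)
  \;=\; \frac{1}{\sigma_y^2}\,
        \underbrace{\big(\partial_{x_t}\hat{x}_0(x_t)\big)^{\!\top}}_{\text{VJP through denoiser}}
        A^{\top}\big(y - A\,\hat{x}_0(x_t)\big).
% A decoupled surrogate instead treats \hat{x}_0(x_t) as fixed at each step,
% so the transition p(x_{t-1} \mid x_t, y) becomes an explicit Gaussian
% that can be sampled with forward passes only, no backward pass required.
```

The VJP term is what forces a backward pass through the denoiser at every reverse step; decoupling removes it, which is the source of the memory and speed gains quoted above.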

📝 Abstract
Diffusion models have emerged as powerful priors for image editing tasks such as inpainting and local modification, where the objective is to generate realistic content that remains consistent with observed regions. In particular, zero-shot approaches that leverage a pretrained diffusion model, without any retraining, have been shown to achieve highly effective reconstructions. However, state-of-the-art zero-shot methods typically rely on a sequence of surrogate likelihood functions, whose scores are used as proxies for the ideal score. This procedure requires vector-Jacobian products through the denoiser at every reverse step, introducing significant memory and runtime overhead. To address this issue, we propose a new likelihood surrogate that yields Gaussian posterior transitions that are simple and efficient to sample, sidestepping backpropagation through the denoiser network. Our extensive experiments show that our method achieves strong observation consistency compared with fine-tuned baselines and produces coherent, high-quality reconstructions, all while significantly reducing inference cost. Code is available at https://github.com/YazidJanati/ding.
Problem

Research questions and friction points this paper is trying to address.

Reduces memory and runtime overhead in zero-shot inpainting
Improves observation consistency without fine-tuning diffusion models
Enables efficient sampling by avoiding backpropagation through denoiser
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoupled diffusion guidance for zero-shot inpainting
Gaussian posterior transitions without denoiser backpropagation
Efficient sampling that reduces memory and runtime overhead
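The innovations above can be sketched in a toy, backprop-free inpainting step. This is a minimal NumPy illustration under our own assumptions (`dummy_denoiser` is a hypothetical stand-in for a pretrained denoiser, and the variance proxy is ours), not the paper's actual algorithm: the key point is that the observation enters through an explicit Gaussian blend, so only forward passes are needed.

```python
import numpy as np

rng = np.random.default_rng(0)

def dummy_denoiser(x_t, alpha_bar_t):
    # Stand-in for a pretrained denoiser's posterior-mean estimate
    # E[x_0 | x_t]; here just a damped identity for illustration.
    return 0.9 * x_t

def decoupled_inpaint_step(x_t, y, mask, alpha_bar_t, alpha_bar_prev,
                           sigma_y=0.05):
    """One backprop-free reverse step: x_{t-1} is drawn from an explicit
    Gaussian whose mean blends the denoiser estimate with the observation.
    No gradients or VJPs through the denoiser are required."""
    x0_hat = dummy_denoiser(x_t, alpha_bar_t)        # forward pass only
    # Precision-weighted combination of the prior mean x0_hat and the
    # observed values y on the masked (observed) pixels.
    prior_var = (1.0 - alpha_bar_t) / alpha_bar_t    # crude variance proxy
    w = prior_var / (prior_var + sigma_y**2)         # data weight in [0, 1]
    x0_post = np.where(mask, (1.0 - w) * x0_hat + w * y, x0_hat)
    # DDIM-style deterministic transition to x_{t-1}.
    eps_hat = (x_t - np.sqrt(alpha_bar_t) * x0_post) / np.sqrt(1.0 - alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0_post + np.sqrt(1.0 - alpha_bar_prev) * eps_hat

# Toy 8x8 "image": the left half is observed, the right half is inpainted.
x = rng.standard_normal((8, 8))
mask = np.zeros((8, 8), dtype=bool)
mask[:, :4] = True
y = np.where(mask, 0.5, 0.0)                         # observed pixel values
x_prev = decoupled_inpaint_step(x, y, mask, alpha_bar_t=0.5, alpha_bar_prev=0.8)
print(x_prev.shape)
```

Because every quantity in the step is computed in closed form from a single denoiser forward pass, no activations need to be stored for a backward pass, which is where the reported memory savings come from.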