AI Summary
Existing image restoration methods for Gaussian denoising rely solely on pixel-level fidelity, neglecting consistency across the spatial and frequency domains, which leads to distorted structural and textural details. To address this, we propose SaFaRI, a general-purpose image restoration framework that jointly models data fidelity in both the spatial and frequency domains within diffusion models. Its core innovation is the integration of a DCT-domain L1 constraint with the conventional spatial reconstruction loss, enabling dual-domain consistency regularization during the diffusion process. SaFaRI requires no fine-tuning and supports zero-shot denoising, inpainting, and super-resolution. Evaluated on ImageNet and FFHQ, it achieves state-of-the-art performance: LPIPS improves by 12.3% and FID decreases by 18.7% over prior methods, demonstrating significant gains in perceptual realism and high-frequency detail fidelity.
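The dual-domain fidelity described above can be sketched as a loss that sums a pixel-wise reconstruction term with an L1 penalty on the DCT coefficients. This is a minimal illustration, not the paper's implementation; the weighting `lam` and the use of an orthonormal 2D DCT (`scipy.fft.dctn`) are assumptions.

```python
import numpy as np
from scipy.fft import dctn  # type-II DCT, applied over all axes

def dual_domain_fidelity(x_hat, y, lam=0.1):
    """Hypothetical dual-domain data-fidelity term: spatial L2 plus a
    DCT-domain L1 constraint. `lam` is an assumed weighting, not from
    the paper."""
    spatial = np.sum((x_hat - y) ** 2)  # conventional pixel-wise loss
    freq = np.sum(np.abs(dctn(x_hat, norm="ortho") - dctn(y, norm="ortho")))
    return spatial + lam * freq
```

The L1 norm in the frequency domain promotes sparse coefficient differences, which tends to penalize broad spectral distortions while tolerating a few large high-frequency deviations.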
Abstract
Diffusion models have recently emerged as a promising framework for Image Restoration (IR), owing to their ability to produce high-quality reconstructions and their compatibility with established methods. Existing methods for solving noisy inverse problems in IR consider only pixel-wise data fidelity. In this paper, we propose SaFaRI, a spatial-and-frequency-aware diffusion model for IR with Gaussian noise. Our model encourages images to preserve data fidelity in both the spatial and frequency domains, resulting in enhanced reconstruction quality. We comprehensively evaluate the performance of our model on a variety of noisy inverse problems, including inpainting, denoising, and super-resolution. Our thorough evaluation demonstrates that SaFaRI achieves state-of-the-art performance on both the ImageNet and FFHQ datasets, outperforming existing zero-shot IR methods in terms of the LPIPS and FID metrics.
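For zero-shot inverse problems such as inpainting, enforcing data fidelity typically means nudging each intermediate diffusion estimate toward the observed measurements. The sketch below shows one such data-consistency step for a masked (inpainting) observation; the function name, step size, and the simple gradient update are illustrative assumptions, not SaFaRI's exact guidance rule.

```python
import numpy as np

def data_consistency_step(x, y, mask, step=0.5):
    """One hypothetical zero-shot inpainting guidance step: move the
    current diffusion estimate x toward the observed pixels y, where
    mask == 1 marks known pixels. `step` is an assumed step size."""
    grad = mask * (x - y)  # gradient of 0.5 * ||mask * (x - y)||^2 w.r.t. x
    return x - step * grad  # known pixels are pulled toward y; unknown ones are untouched
```

In a full sampler this update would alternate with the learned denoising step, so the prior fills in the masked region while the measurement term keeps observed pixels consistent.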