Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution

📅 2024-05-16
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
In single-image super-resolution (SISR), insufficient high-frequency detail generation often leads to artifacts and texture distortion. Existing DDPM-based diffusion models directly predict full-bandwidth high-frequency components and use only the HR ground truth as the target at every denoising step, which causes frequency-domain mismatch and hallucination. To address this, the paper proposes FDDiff, a frequency-domain-guided multiscale diffusion model. FDDiff introduces a wavelet packet-based frequency complement chain that decomposes high-frequency reconstruction into fine-grained steps of increasing bandwidth. It features a unified multiscale frequency refinement network that provides progressive frequency guidance during reverse diffusion. It also incorporates frequency-domain target scheduling and an end-to-end differentiable high-frequency prediction mechanism. Evaluated on standard benchmarks including Set5 and Set14, FDDiff outperforms prior generative SISR methods and produces higher-fidelity super-resolution results.

📝 Abstract
The performance of single image super-resolution depends heavily on how high-frequency details are generated and added to low-resolution images. Recently, diffusion-based models have shown great potential for generating high-quality images in super-resolution tasks. However, existing models have difficulty directly predicting wide-bandwidth high-frequency information because they use only the high-resolution ground truth as the target at every sampling timestep. To tackle this problem and achieve higher-quality super-resolution, we propose a novel Frequency Domain-guided multiscale Diffusion model (FDDiff), which decomposes the process of complementing high-frequency information into finer-grained steps. In particular, a wavelet packet-based frequency complement chain is developed to provide multiscale intermediate targets with increasing bandwidth for the reverse diffusion process. FDDiff then guides the reverse diffusion process to progressively complement the missing high-frequency details over timesteps. Moreover, we design a multiscale frequency refinement network that predicts the required high-frequency components at multiple scales within one unified network. Comprehensive evaluations on popular benchmarks demonstrate that FDDiff outperforms prior generative methods, producing higher-fidelity super-resolution results.
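To make the idea of intermediate targets with increasing bandwidth concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it swaps the wavelet packet decomposition for a single-level orthonormal Haar transform, and the function names (`haar_dwt2`, `haar_idwt2`, `complement_chain`) and the order in which detail bands are restored are assumptions made only for illustration.

```python
import numpy as np


def haar_dwt2(x):
    """One-level 2D orthonormal Haar transform (height and width must be even)."""
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0       # low-frequency approximation
    det_cols = (a - b + c - d) / 2.0  # differences between neighbouring columns
    det_rows = (a + b - c - d) / 2.0  # differences between neighbouring rows
    det_diag = (a - b - c + d) / 2.0  # diagonal differences
    return low, det_cols, det_rows, det_diag


def haar_idwt2(low, det_cols, det_rows, det_diag):
    """Inverse of haar_dwt2."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=low.dtype)
    x[0::2, 0::2] = (low + det_cols + det_rows + det_diag) / 2.0
    x[0::2, 1::2] = (low - det_cols + det_rows - det_diag) / 2.0
    x[1::2, 0::2] = (low + det_cols - det_rows - det_diag) / 2.0
    x[1::2, 1::2] = (low - det_cols - det_rows + det_diag) / 2.0
    return x


def complement_chain(hr):
    """Return targets of increasing bandwidth; the last one equals `hr`.

    Assumption: detail bands are restored in a fixed, hand-picked order.
    The paper's chain is built from wavelet packets and may differ.
    """
    low, det_cols, det_rows, det_diag = haar_dwt2(hr)
    zero = np.zeros_like(low)
    return [
        haar_idwt2(low, zero, zero, zero),          # low-frequency band only
        haar_idwt2(low, det_cols, det_rows, zero),  # plus row/column details
        haar_idwt2(low, det_cols, det_rows, det_diag),  # full bandwidth
    ]
```

Calling `complement_chain` on a 2D array with even side lengths returns three images of the same size. Because the transform is orthonormal, each step restores more coefficients and so moves the target closer to the ground truth in the L2 sense.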
Problem

Research questions and friction points this paper is trying to address.

Refining high-frequency details in super-resolution using multiscale diffusion
Reducing hallucination artifacts in diffusion-based super-resolution models
Progressively complementing missing frequency components across multiple scales
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wavelet packet pyramid provides multiscale frequency targets
Guided reverse diffusion progressively complements high-frequency details
Multiscale frequency refinement network predicts components in unified structure
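One way to picture how reverse diffusion could be paired with a chain of frequency targets is a lookup from timestep to target index. The summary does not say how FDDiff schedules its targets, so the uniform split below, the function name `target_index`, and its parameters are all hypothetical. The sketch only shows the general shape of such a schedule.

```python
def target_index(t, num_timesteps, num_targets):
    """Map a reverse-diffusion timestep to an intermediate-target index.

    Hypothetical uniform schedule (an assumption, not the paper's rule):
    t = num_timesteps - 1 is the noisiest step and maps to index 0, the
    narrowest-bandwidth target; t = 0 maps to index num_targets - 1, the
    full-bandwidth ground truth.
    """
    if not 0 <= t < num_timesteps:
        raise ValueError("t must lie in [0, num_timesteps)")
    if not 1 <= num_targets <= num_timesteps:
        raise ValueError("need 1 <= num_targets <= num_timesteps")
    steps_done = num_timesteps - 1 - t  # denoising steps already taken
    # Integer arithmetic splits the timesteps into equal-sized segments.
    return steps_done * num_targets // num_timesteps
```

With 1000 timesteps and 4 targets, each target would cover a contiguous block of 250 timesteps. The index never decreases as sampling proceeds from high `t` to low `t`.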
Xingjian Wang
Zhejiang University, Hangzhou, China
Li Chai
Zhejiang University, Hangzhou, China
Jiming Chen
Zhejiang University, Hangzhou, China