Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution

📅 2025-12-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets three limitations of one-step diffusion-based super-resolution: low fidelity, insufficient region-discriminative activation of generative priors, and misalignment between text prompts and their semantic regions. It proposes CODSR, a controllable one-step diffusion network for image super-resolution. CODSR introduces a feature modulation module guided by low-quality (LQ) inputs, which conditions the diffusion process on the original uncompressed LQ information rather than only its compressed encoding; a region-adaptive generative prior activation method that adds perceptual richness without sacrificing local structural fidelity; and a text-matching guidance strategy that makes fuller use of text-prompt conditioning. While retaining the efficiency of one-step inference, CODSR is reported to achieve superior perceptual quality and competitive results on reference-based metrics (e.g., PSNR and LPIPS) compared with state-of-the-art methods.
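
For readers unfamiliar with the metrics named in the summary, PSNR is the standard pixel-level measure of how closely a restored image matches its reference. The sketch below uses the textbook definition, 10·log10(MAX² / MSE); the function name `psnr` and the flat-list image representation are illustrative choices, not anything taken from the paper.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are passed as flat lists of pixel intensities (an illustrative
    simplification; real pipelines operate on arrays).
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the restored image is closer to the reference pixel by pixel, which is why PSNR tends to trade off against the perceptual richness that generative priors add.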

📝 Abstract

Recent diffusion-based one-step methods have shown remarkable progress in the field of image super-resolution, yet they remain constrained by three critical limitations: (1) inferior fidelity performance caused by the information loss from compression encoding of low-quality (LQ) inputs; (2) insufficient region-discriminative activation of generative priors; (3) misalignment between text prompts and their corresponding semantic regions. To address these limitations, we propose CODSR, a controllable one-step diffusion network for image super-resolution. First, we propose an LQ-guided feature modulation module that leverages original uncompressed information from LQ inputs to provide high-fidelity conditioning for the diffusion process. We then develop a region-adaptive generative prior activation method to effectively enhance perceptual richness without sacrificing local structural fidelity. Finally, we employ a text-matching guidance strategy to fully harness the conditioning potential of text prompts. Extensive experiments demonstrate that CODSR achieves superior perceptual quality and competitive fidelity compared with state-of-the-art methods with efficient one-step inference.
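
The abstract does not specify how the LQ-guided feature modulation module is built. As a rough illustration of what "modulating" features with a conditioning signal can mean, here is a minimal sketch of generic per-channel scale-and-shift (FiLM-style) modulation. It is an assumption for illustration only, not CODSR's actual design; `modulate`, `lq_scale`, and `lq_shift` are hypothetical names, and the network that would predict the scale and shift from the uncompressed LQ image is omitted.

```python
def modulate(features, lq_scale, lq_shift):
    """Apply per-channel scale-and-shift modulation to a feature map.

    features: list of channels, each a flat list of activations.
    lq_scale, lq_shift: one value per channel, ASSUMED to be predicted
    from the uncompressed LQ image by some small network (not shown).
    """
    if not (len(features) == len(lq_scale) == len(lq_shift)):
        raise ValueError("one scale and one shift are needed per channel")
    # (1 + scale) keeps the features unchanged when scale and shift are zero.
    return [
        [(1.0 + g) * x + b for x in channel]
        for channel, g, b in zip(features, lq_scale, lq_shift)
    ]

latent = [[1.0, 2.0], [3.0, 4.0]]
out = modulate(latent, lq_scale=[0.0, 0.5], lq_shift=[0.0, -1.0])
print(out)  # [[1.0, 2.0], [3.5, 5.0]]
```

The first channel passes through untouched because its scale and shift are zero, which shows how a conditioning branch of this kind can leave features alone wherever it has nothing to add.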

Problem

Research questions and friction points this paper is trying to address.

Fidelity loss caused by compression encoding of low-quality (LQ) inputs

Insufficient region-discriminative activation of generative priors

Misalignment between text prompts and their corresponding semantic regions

Innovation

Methods, ideas, or system contributions that make the work stand out.

LQ-guided feature modulation for high-fidelity conditioning

Region-adaptive generative prior activation for perceptual richness

Text-matching guidance strategy to align prompts with semantics
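
The second item above implies that generative detail should be applied with different strength in different regions. The source does not describe the mechanism, so the following is only a toy sketch of that general idea: a per-pixel convex blend between a fidelity-oriented estimate and a generative estimate. The function `blend_by_region` and the `texture_weight` map are hypothetical, and how such weights would be obtained is deliberately left out.

```python
def blend_by_region(fidelity_px, generative_px, texture_weight):
    """Per-pixel convex blend of two candidate restorations.

    texture_weight[i] in [0, 1]: 0 keeps the fidelity-oriented value,
    1 keeps the generative value. How the weights are computed is an
    ASSUMPTION left outside this sketch.
    """
    if not (len(fidelity_px) == len(generative_px) == len(texture_weight)):
        raise ValueError("all inputs must have the same length")
    blended = []
    for f, g, w in zip(fidelity_px, generative_px, texture_weight):
        w = min(1.0, max(0.0, w))  # clamp so the blend stays convex
        blended.append((1.0 - w) * f + w * g)
    return blended

result = blend_by_region([10.0, 10.0, 10.0], [20.0, 20.0, 20.0], [0.0, 0.5, 2.0])
print(result)  # [10.0, 15.0, 20.0]
```

A weight near 0 preserves structure in smooth or text-like regions, while a weight near 1 lets the generative estimate dominate in textured regions. The out-of-range weight 2.0 is clamped to 1.0.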
