🤖 AI Summary
This paper targets three limitations of one-step diffusion-based super-resolution: reduced fidelity caused by information lost when low-quality (LQ) inputs are compressed during encoding, generative priors that are not activated in a region-discriminative way, and misalignment between text prompts and the semantic regions they describe. It proposes CODSR, a controllable one-step diffusion network for image super-resolution. CODSR combines three components: an LQ-guided feature modulation module that conditions the diffusion process on the original, uncompressed LQ information; a region-adaptive generative prior activation method that increases perceptual richness without sacrificing local structural fidelity; and a text-matching guidance strategy that makes fuller use of text prompts as conditioning. According to the authors' experiments, CODSR achieves superior perceptual quality and competitive fidelity relative to state-of-the-art methods while keeping the efficiency of one-step inference.
📝 Abstract
Recent diffusion-based one-step methods have made remarkable progress in image super-resolution, yet they remain constrained by three critical limitations: (1) inferior fidelity caused by the information lost when low-quality (LQ) inputs are compressed during encoding; (2) insufficient region-discriminative activation of generative priors; and (3) misalignment between text prompts and their corresponding semantic regions. To address these limitations, we propose CODSR, a controllable one-step diffusion network for image super-resolution. First, we propose an LQ-guided feature modulation module that leverages the original, uncompressed information in the LQ input to provide high-fidelity conditioning for the diffusion process. Second, we develop a region-adaptive generative prior activation method that enhances perceptual richness without sacrificing local structural fidelity. Finally, we employ a text-matching guidance strategy to fully harness the conditioning potential of text prompts. Extensive experiments demonstrate that CODSR achieves superior perceptual quality and competitive fidelity compared with state-of-the-art methods while retaining efficient one-step inference.
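The abstract does not specify how the LQ-guided feature modulation is computed. As a rough illustration of the general idea only, the sketch below shows a common form of conditioning, per-pixel affine (FiLM/SFT-style) modulation, in which a scale and a shift derived from the LQ input adjust a feature map. Everything here is an assumption made for illustration: the function names, the fixed linear map standing in for a learned conditioning network, and the choice to give the LQ input and the features the same spatial size. It is not the CODSR implementation.

```python
from typing import List, Tuple

Grid = List[List[float]]  # a single-channel 2D map, row-major


def predict_scale_shift(lq: Grid, w_scale: float, w_shift: float) -> Tuple[Grid, Grid]:
    """Hypothetical stand-in for a learned conditioning network.

    Maps each LQ pixel to a (scale, shift) pair with a fixed linear rule.
    A real module would use learned layers; this only fixes the shapes.
    """
    scale = [[w_scale * v for v in row] for row in lq]
    shift = [[w_shift * v for v in row] for row in lq]
    return scale, shift


def modulate(features: Grid, scale: Grid, shift: Grid) -> Grid:
    """Elementwise affine modulation: out = features * (1 + scale) + shift."""
    return [
        [f * (1.0 + s) + b for f, s, b in zip(f_row, s_row, b_row)]
        for f_row, s_row, b_row in zip(features, scale, shift)
    ]


# Toy 2x2 example (assumed values, not data from the paper).
lq = [[0.0, 0.5], [1.0, 0.25]]
features = [[1.0, 2.0], [3.0, 4.0]]

scale, shift = predict_scale_shift(lq, w_scale=0.5, w_shift=2.0)
out = modulate(features, scale, shift)
print(out)
```

With zero scale and zero shift the features pass through unchanged, so the conditioning signal can only add information on top of the existing features. That property is one reason this form of modulation is a common choice for injecting side information.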