🤖 AI Summary
Pre-trained diffusion models applied to real-world image super-resolution (Real-ISR) suffer from high computational cost, many inference steps, and insufficient detail recovery. To address this, the paper proposes TSD-SR, a one-step distillation framework built on a pre-trained text-to-image diffusion model. Key contributions include: (1) Target Score Distillation, which jointly exploits diffusion priors and real-image references to produce more realistic restorations; and (2) a Distribution-Aware Sampling Module, which makes detail-oriented gradients more readily accessible and thereby improves the recovery of fine details. Extensive experiments show that TSD-SR achieves the best results on most metrics and the fastest inference (e.g., 40 times faster than SeeSR) among Real-ISR methods based on pre-trained diffusion priors.
📝 Abstract
Pre-trained text-to-image diffusion models are increasingly applied to real-world image super-resolution (Real-ISR) tasks. Because diffusion models rely on iterative refinement, most existing approaches are computationally expensive. Although methods such as SinSR and OSEDiff have emerged to reduce the number of inference steps via distillation, their performance in image restoration and detail recovery remains unsatisfactory. To address this, we propose TSD-SR, a novel distillation framework specifically designed for real-world image super-resolution, aiming to construct an efficient and effective one-step model. First, we introduce Target Score Distillation, which leverages the priors of diffusion models and real image references to achieve more realistic image restoration. Second, we propose a Distribution-Aware Sampling Module that makes detail-oriented gradients more readily accessible, addressing the challenge of recovering fine details. Extensive experiments demonstrate that TSD-SR delivers superior restoration results (the best performance on most metrics) and the fastest inference speed (e.g., 40 times faster than SeeSR) compared with previous Real-ISR approaches based on pre-trained diffusion priors.