🤖 AI Summary
In real-world image super-resolution (Real-ISR), conventional one-step variational score distillation (VSD) struggles to leverage the multi-step generative priors encoded in pre-trained stable diffusion models, because a fixed timestep cannot match the priors the model exhibits at different noise levels. To address this, the authors propose the Time-Aware one-step Diffusion Network for Real-ISR (TADSR), which adaptively aligns latent representations across diverse noise timesteps via a time-aware VAE encoder and dynamic latent-space alignment. TADSR further introduces a Time-Aware VSD loss that bridges the student's and teacher's timesteps, enabling effective integration of multi-step generative priors within a single-step inference pipeline. Crucially, it supports explicit, continuous trade-off control between fidelity and perceptual realism by varying the timestep condition. Extensive experiments demonstrate state-of-the-art performance on multiple Real-ISR benchmarks, achieving high-quality super-resolved outputs in just one diffusion step.
📝 Abstract
Diffusion-based real-world image super-resolution (Real-ISR) methods have demonstrated impressive performance. To achieve efficient Real-ISR, many works employ Variational Score Distillation (VSD) to distill a pre-trained stable diffusion (SD) model into a one-step SR model with a fixed timestep. However, the SD model exhibits different generative priors at different noise-injection timesteps, so a fixed timestep makes it difficult for these methods to fully leverage the generative priors in SD, leading to suboptimal performance. To address this, we propose a Time-Aware one-step Diffusion Network for Real-ISR (TADSR). We first introduce a Time-Aware VAE Encoder, which projects the same image into different latent features conditioned on the timestep. Through joint dynamic variation of timesteps and latent features, the student model can better align with the input distribution of the pre-trained SD model, thereby making more effective use of SD's generative capabilities. To better activate the generative prior of SD at different timesteps, we propose a Time-Aware VSD loss that bridges the timesteps of the student model and those of the teacher model, producing more consistent generative-prior guidance conditioned on the timestep. Additionally, by utilizing the generative prior of SD at different timesteps, our method naturally achieves a controllable trade-off between fidelity and realism simply by changing the timestep condition. Experimental results demonstrate that our method achieves both state-of-the-art performance and controllable SR results with only a single step.
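To make the timestep-conditioned encoding concrete, here is a minimal, purely illustrative sketch (not the paper's code; the function name, the cosine schedule, and all parameters are assumptions). The idea it demonstrates: an encoder conditioned on timestep t produces latents whose noise level matches what the pre-trained SD teacher expects at step t, so a small t keeps the latent close to the input (favoring fidelity) while a large t leaves more room for the generative prior (favoring realism).

```python
import numpy as np

rng = np.random.default_rng(0)

def time_aware_encode(latent, t, T=1000):
    """Toy stand-in for a time-aware encoder: blend the clean latent with
    Gaussian noise according to an (illustrative) cosine diffusion schedule,
    so the result resembles the teacher's expected input at timestep t."""
    alpha_bar = np.cos(0.5 * np.pi * t / T) ** 2  # cumulative signal fraction
    noise = rng.standard_normal(latent.shape)
    return np.sqrt(alpha_bar) * latent + np.sqrt(1.0 - alpha_bar) * noise

z = rng.standard_normal((4, 4))          # a stand-in "clean" latent
z_small_t = time_aware_encode(z, t=50)   # near-clean: fidelity-leaning
z_large_t = time_aware_encode(z, t=950)  # near-noise: realism-leaning

# Smaller t keeps the latent closer to the original input.
assert np.linalg.norm(z_small_t - z) < np.linalg.norm(z_large_t - z)
```

Sweeping t between these extremes is what gives the abstract's continuous fidelity-realism control: the same input image yields a family of latents, each aligned with a different point on the teacher's noise schedule.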