One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation

📅 2025-02-04
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing single-step diffusion-based Real-ISR methods are constrained by teacher model capacity, often introducing high-frequency artifacts and struggling to balance speed and quality. To address this, we propose Flow Trajectory Distillation (FTD), the first trajectory-level knowledge distillation framework tailored for flow matching models. We design a TV-LPIPS perceptual loss and an Attention Diversification Loss (ADL) regularization term to jointly suppress artifacts and enhance texture fidelity. Leveraging FLUX.1-dev, we construct a lightweight Transformer architecture integrating attention diversity constraints and a composite optimization objective. Evaluated on multiple real-world degradation benchmarks, our method achieves state-of-the-art performance with single-step sampling, attaining over 10× faster inference than prior single-step diffusion approaches. It delivers unified advances in visual quality, computational efficiency, and deployment friendliness.

📝 Abstract
Diffusion models (DMs) have significantly advanced real-world image super-resolution (Real-ISR), but the computational cost of multi-step diffusion models limits their application. One-step diffusion models generate high-quality images in a single sampling step, greatly reducing computational overhead and inference latency. However, most existing one-step diffusion methods are constrained by the performance of the teacher model: when the teacher performs poorly, the results contain image artifacts. To address this limitation, we propose FluxSR, a novel one-step diffusion Real-ISR technique based on flow matching models. We use the state-of-the-art diffusion model FLUX.1-dev as both the teacher model and the base model. First, we introduce Flow Trajectory Distillation (FTD) to distill a multi-step flow matching model into a one-step Real-ISR model. Second, to improve image realism and address high-frequency artifacts in generated images, we propose TV-LPIPS as a perceptual loss and introduce Attention Diversification Loss (ADL) as a regularization term that reduces token similarity in the transformer, thereby eliminating high-frequency artifacts. Comprehensive experiments demonstrate that our method outperforms existing one-step diffusion-based Real-ISR methods. The code and model will be released at https://github.com/JianzeLi-114/FluxSR.
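To make the multi-step versus one-step distinction concrete, the following toy sketch integrates a flow matching ODE with Euler steps. It is an illustration only, not the FTD procedure: the velocity fields, the time convention (t running from 0 to 1), and the names `euler_sample`, `teacher_velocity`, and `straight_velocity` are all assumptions made for this example. It shows why a curved teacher trajectory needs many steps, while a straight trajectory fitted to the teacher's endpoints can be followed in one.

```python
from typing import Callable, List

Vector = List[float]
Velocity = Callable[[Vector, float], Vector]


def euler_sample(x_start: Vector, velocity: Velocity, num_steps: int) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with equal Euler steps."""
    x = list(x_start)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical "teacher": a time-dependent field whose exact flow moves
# x_start by OFFSET, but only if integrated with many small steps.
OFFSET = [1.0, -2.0]


def teacher_velocity(x: Vector, t: float) -> Vector:
    return [2.0 * t * o for o in OFFSET]


def straight_velocity(x_start: Vector, x_end: Vector) -> Velocity:
    """Constant velocity along the straight line joining two endpoints."""
    v = [e - s for s, e in zip(x_start, x_end)]
    return lambda x, t: v


x0 = [0.0, 1.0]
teacher_multi = euler_sample(x0, teacher_velocity, num_steps=50)
teacher_single = euler_sample(x0, teacher_velocity, num_steps=1)

# Stand-in for distillation: fit a straight path to the teacher's endpoints.
student = straight_velocity(x0, teacher_multi)
student_single = euler_sample(x0, student, num_steps=1)

print("teacher, 50 steps:", teacher_multi)
print("teacher,  1 step :", teacher_single)
print("student,  1 step :", student_single)
```

With a single step the toy teacher does not move at all, because its velocity is zero at t=0, whereas the straightened student reaches the teacher's 50-step endpoint in one function evaluation.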
Problem

Research questions and friction points this paper is trying to address.

One-step diffusion models for image super-resolution
Reducing computational cost and inference latency
Improving image realism and reducing artifacts
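High-frequency artifacts of the kind listed above can be quantified with total variation (TV), the quantity that TV-LPIPS appears to be named after. The sketch below computes a plain anisotropic TV on a small grayscale image stored as nested lists. How the paper combines TV with LPIPS is not stated in this summary, so only the TV part is shown, and the function name and test images are assumptions for illustration.

```python
from typing import List

Image = List[List[float]]


def total_variation(img: Image) -> float:
    """Anisotropic total variation: the sum of absolute differences
    between vertically and horizontally adjacent pixels."""
    height, width = len(img), len(img[0])
    tv = 0.0
    for r in range(height):
        for c in range(width):
            if r + 1 < height:
                tv += abs(img[r + 1][c] - img[r][c])
            if c + 1 < width:
                tv += abs(img[r][c + 1] - img[r][c])
    return tv


# A flat patch versus a checkerboard, a crude stand-in for a periodic artifact.
smooth = [[0.5] * 4 for _ in range(4)]
checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]

print("smooth TV      :", total_variation(smooth))
print("checkerboard TV:", total_variation(checkerboard))
```

The flat patch scores zero while the checkerboard scores high, which is why a TV-based term is sensitive to grid-like high-frequency patterns.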
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step diffusion via Flow Trajectory Distillation
Introduces TV-LPIPS as a perceptual loss
Uses Attention Diversification Loss for artifact reduction
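The abstract describes ADL as a regularizer that reduces token similarity in the transformer. One simple way to measure token similarity is the mean pairwise cosine similarity between token feature vectors, sketched below. This is an assumed formulation for illustration, not necessarily the loss used in the paper, and `token_similarity_penalty` is a hypothetical name.

```python
import math
from typing import List

Token = List[float]


def cosine_similarity(a: Token, b: Token) -> float:
    """Cosine of the angle between two vectors, with a small epsilon for stability."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def token_similarity_penalty(tokens: List[Token]) -> float:
    """Mean cosine similarity over all distinct token pairs.
    Lower values mean the tokens are more diverse."""
    n = len(tokens)
    if n < 2:
        return 0.0
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += cosine_similarity(tokens[i], tokens[j])
            pairs += 1
    return total / pairs


# Tokens that all point in the same direction versus mutually orthogonal tokens.
collapsed = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print("collapsed tokens:", token_similarity_penalty(collapsed))
print("diverse tokens  :", token_similarity_penalty(diverse))
```

Minimizing a penalty of this shape pushes token features apart. In a real training setup it would be computed on transformer features with a tensor library and added to the main loss with a weighting factor.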