🤖 AI Summary
Existing single-step diffusion-based Real-ISR methods are constrained by teacher model capacity, often introducing high-frequency artifacts and struggling to balance speed and quality. To address this, we propose Flow Trajectory Distillation (FTD), the first trajectory-level knowledge distillation framework tailored for flow matching models. We design a TV-LPIPS perceptual loss and an Attention Diversification Loss (ADL) regularization term to jointly suppress artifacts and enhance texture fidelity. Leveraging FLUX.1-dev, we construct a lightweight Transformer architecture integrating attention diversity constraints and a composite optimization objective. Evaluated on multiple real-world degradation benchmarks, our method achieves state-of-the-art performance with single-step sampling, attaining over 10× faster inference than prior single-step diffusion approaches. It delivers unified advances in visual quality, computational efficiency, and ease of deployment.
📝 Abstract
Diffusion models (DMs) have significantly advanced the development of real-world image super-resolution (Real-ISR), but the computational cost of multi-step diffusion models limits their application. One-step diffusion models generate high-quality images in a single sampling step, greatly reducing computational overhead and inference latency. However, most existing one-step diffusion methods are constrained by the performance of the teacher model, where poor teacher performance results in image artifacts. To address this limitation, we propose FluxSR, a novel one-step diffusion Real-ISR technique based on flow matching models. We use the state-of-the-art diffusion model FLUX.1-dev as both the teacher model and the base model. First, we introduce Flow Trajectory Distillation (FTD) to distill a multi-step flow matching model into a one-step Real-ISR model. Second, to improve image realism and address high-frequency artifact issues in generated images, we propose TV-LPIPS as a perceptual loss and introduce Attention Diversification Loss (ADL) as a regularization term to reduce token similarity in the transformer, thereby eliminating high-frequency artifacts. Comprehensive experiments demonstrate that our method outperforms existing one-step diffusion-based Real-ISR methods. The code and model will be released at https://github.com/JianzeLi-114/FluxSR.
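To make the two loss components concrete, here is a minimal NumPy sketch of the ideas behind them, under assumptions not stated in the abstract: the TV-LPIPS loss is taken to combine a standard total-variation term with an LPIPS perceptual term (the LPIPS network itself is omitted here), and ADL is sketched as penalizing mean pairwise cosine similarity among transformer tokens. The exact formulations used by FluxSR may differ.

```python
import numpy as np

def tv_term(img):
    # Total-variation component of the (assumed) TV-LPIPS loss:
    # mean absolute difference between neighboring pixels along the
    # last two axes (e.g. an HxW or CxHxW array). Penalizes the kind
    # of high-frequency artifacts the paper targets.
    dh = np.abs(np.diff(img, axis=-2)).mean()
    dw = np.abs(np.diff(img, axis=-1)).mean()
    return dh + dw

def attention_diversification_loss(tokens):
    # Hypothetical ADL sketch: mean absolute off-diagonal cosine
    # similarity among token vectors of shape (num_tokens, dim).
    # A lower value means the tokens point in more diverse
    # directions, i.e. less redundancy among them.
    t = tokens / np.linalg.norm(tokens, axis=-1, keepdims=True)
    sim = t @ t.T                      # (N, N) cosine similarities
    n = sim.shape[0]
    off_diag = np.abs(sim - np.eye(n)) # zero out the diagonal
    return off_diag.sum() / (n * (n - 1))
```

A flat image gives a zero TV term, while identical tokens drive the ADL sketch to its maximum of 1 and mutually orthogonal tokens drive it to 0, matching the intended "reduce token similarity" behavior.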