OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs

📅 2024-12-12
🏛️ arXiv.org
📈 Citations: 7
Influential: 0
🤖 AI Summary
Existing diffusion- and flow-based super-resolution models rely either on computationally expensive multi-step sampling or on distilled approaches with a fixed fidelity–realism trade-off, lacking continuous controllability. This paper proposes OFTSR, the first flow-based one-step super-resolution framework enabling single-step inference with continuous, interactive adjustment of the fidelity–realism trade-off. Its core innovation is an ODE-trajectory-aligned single-step distillation paradigm, in which the student model's single forward pass produces a sample that matches an intermediate state on the teacher model's sampling trajectory. This is achieved through conditional flow matching, ODE-trajectory-constrained distillation, and teacher–student flow consistency optimization. OFTSR achieves state-of-the-art performance for one-step super-resolution on FFHQ, DIV2K, and ImageNet, accelerating inference by over 100× compared to iterative baselines, while supporting real-time, slider-based fidelity–realism control.

📝 Abstract
Recent advances in diffusion and flow-based generative models have demonstrated remarkable success in image restoration tasks, achieving superior perceptual quality compared to traditional deep learning approaches. However, these methods either require numerous sampling steps to generate high-quality images, resulting in significant computational overhead, or rely on model distillation, which usually imposes a fixed fidelity-realism trade-off and thus lacks flexibility. In this paper, we introduce OFTSR, a novel flow-based framework for one-step image super-resolution that can produce outputs with tunable levels of fidelity and realism. Our approach first trains a conditional flow-based super-resolution model to serve as a teacher model. We then distill this teacher model by applying a specialized constraint. Specifically, we force the predictions from our one-step student model for the same input to lie on the same sampling ODE trajectory of the teacher model. This alignment ensures that the student model's single-step predictions from initial states match the teacher's predictions from a closer intermediate state. Through extensive experiments on challenging datasets including FFHQ ($256\times256$), DIV2K, and ImageNet ($256\times256$), we demonstrate that OFTSR achieves state-of-the-art performance for one-step image super-resolution, while having the ability to flexibly tune the fidelity-realism trade-off. Code and pre-trained models are available at https://github.com/yuanzhi-zhu/OFTSR and https://huggingface.co/Yuanzhi/OFTSR, respectively.
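The teacher model described in the abstract is trained with conditional flow matching: the low-resolution image conditions a velocity field that transports noise to the high-resolution target along a straight-line path. As a minimal sketch (not the paper's implementation), the loss for one training pair can be written as follows; `velocity_model`, `x_lr`, and `x_hr` are illustrative names, and the toy model stands in for the conditional network:

```python
import numpy as np

def cfm_loss(velocity_model, x_lr, x_hr, rng):
    """Conditional flow matching loss for one (LR, HR) pair.

    Linearly interpolates between a noise sample x0 and the HR
    target x1 = x_hr, then regresses the model's velocity onto the
    constant target velocity (x1 - x0) of the straight-line path.
    """
    x0 = rng.standard_normal(x_hr.shape)      # noise endpoint of the path
    t = rng.uniform()                         # random time in [0, 1]
    x_t = (1.0 - t) * x0 + t * x_hr           # point on the interpolation path
    v_target = x_hr - x0                      # straight-line target velocity
    v_pred = velocity_model(x_t, t, x_lr)     # velocity conditioned on LR image
    return float(np.mean((v_pred - v_target) ** 2))

rng = np.random.default_rng(0)
# Toy stand-in "model"; in practice this is a conditional U-Net or similar.
toy_model = lambda x_t, t, x_lr: np.zeros_like(x_t)
x_lr = rng.standard_normal((8, 8))
x_hr = rng.standard_normal((32, 32))
loss = cfm_loss(toy_model, x_lr, x_hr, rng)
```

Minimizing this objective over many pairs yields a velocity field whose probability-flow ODE maps noise to super-resolved images; that ODE is what the student is later distilled against.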
Problem

Research questions and friction points this paper is trying to address.

One-step image super-resolution with tunable fidelity-realism trade-offs
Eliminates multiple sampling steps to reduce computational overhead
Overcomes fixed trade-off limitations in existing distillation methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step flow-based super-resolution framework
Teacher-student distillation with ODE trajectory alignment
Tunable fidelity-realism trade-off capability
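The trajectory-alignment constraint behind these contributions can be sketched numerically: the teacher's ODE is integrated from noise to an intermediate time, and the teacher's one-step denoised estimate from that intermediate state becomes the target for the student's single-step prediction from the initial noise. This is an illustrative reconstruction, not the authors' code; the Euler integrator, the function names, and the constant-velocity toy flow are all assumptions made for the sketch. The intermediate time plays the role of the tunable knob: moving it trades fidelity against realism.

```python
import numpy as np

def euler_ode(velocity_model, x0, x_lr, t0, t1, n_steps):
    """Integrate a probability-flow ODE with fixed-step Euler."""
    x, t = x0.copy(), t0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        x = x + dt * velocity_model(x, t, x_lr)
        t += dt
    return x

def distill_target(teacher, x0, x_lr, t_mid, n_steps=8):
    """Teacher-side target for the student's one-step prediction.

    The teacher ODE is integrated from t=0 to an intermediate time
    t_mid; its one-step denoised estimate from there,
    x_mid + (1 - t_mid) * v(x_mid, t_mid), is the alignment target.
    Varying t_mid tunes the fidelity-realism trade-off.
    """
    x_mid = euler_ode(teacher, x0, x_lr, 0.0, t_mid, n_steps)
    return x_mid + (1.0 - t_mid) * teacher(x_mid, t_mid, x_lr)

def distill_loss(student, teacher, x0, x_lr, t_mid):
    """MSE between the student's one-step output and the teacher target."""
    target = distill_target(teacher, x0, x_lr, t_mid)
    pred = x0 + student(x0, 0.0, x_lr)   # one Euler step across [0, 1]
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(1)
x0 = rng.standard_normal((4, 4))
x_lr = np.zeros((2, 2))
c = np.ones((4, 4))
teacher = lambda x, t, lr: c   # constant-velocity toy flow (straight trajectory)
student = lambda x, t, lr: c   # student that already matches the teacher
loss = distill_loss(student, teacher, x0, x_lr, t_mid=0.5)
# loss is 0 here: both models follow the same straight-line trajectory
```

In the toy case the loss vanishes because a constant-velocity teacher and an identical student land on the same point; with a learned teacher, minimizing this loss over noise samples and `t_mid` values is what gives the distilled model its slider-like control at inference time.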