AI Summary
Traditional deterministic methods for multi-frame low-resolution (LR) image super-resolution (SR) often suffer from texture blurring and reconstruction artifacts. To address this, we propose an efficient burst SR framework based on diffusion models. Our approach integrates burst sequence modeling, knowledge distillation, and a stochastic sampler built on a high-order ordinary differential equation (ODE) solver to enable single-step, high-fidelity diffusion-based SR. Specifically, a teacher-student architecture distills the long multi-step diffusion process into a compact student model, drastically reducing the number of inference steps. Experiments demonstrate that our method requires only 1.6% of the runtime of its baseline while outperforming state-of-the-art methods in PSNR, LPIPS, and user studies. It preserves fine-grained structural details and significantly improves generation efficiency, establishing a new paradigm for real-time multi-frame SR.
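The distillation idea above can be illustrated with a minimal sketch: a multi-step "teacher" sampler is run to its final output, and a one-step "student" is fitted to reproduce that output directly from the initial noise. Everything here is a toy stand-in, not the paper's actual networks: the teacher denoiser, the scalar one-parameter student, and the noise schedule are all hypothetical choices made so the example stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher_denoise(x, sigma):
    # Hypothetical teacher denoiser: a simple shrinkage operator standing in
    # for a trained diffusion network conditioned on the LR burst.
    return x / (1.0 + sigma)

def teacher_sample(x_T, sigmas):
    # Multi-step teacher trajectory: apply the denoiser at each noise level.
    x = x_T
    for sigma in sigmas:
        x = teacher_denoise(x, sigma)
    return x

def student_sample(x_T, w):
    # One-step student: a single learned map from initial noise to the output.
    # Here it has one scalar parameter w so it can be fitted in closed form.
    return w * x_T

# Distillation: fit the student to match the teacher's final output.
sigmas = [1.0, 0.5, 0.25, 0.1]          # assumed noise schedule
x_T = rng.standard_normal(1000)          # initial noise samples
target = teacher_sample(x_T, sigmas)     # teacher's multi-step result
w = float(x_T @ target / (x_T @ x_T))    # least-squares fit of the student
loss = float(np.mean((student_sample(x_T, w) - target) ** 2))
```

Because the toy teacher is linear, the one-step student matches it exactly (the distillation loss collapses to zero); with real networks the student only approximates the teacher, which is why the paper reports a small quality gap traded for a roughly 60x speedup.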
Abstract
While burst Low-Resolution (LR) images are useful for improving the Super-Resolution (SR) image compared to a single LR image, prior burst SR methods are trained in a deterministic manner, which produces blurry SR images. Since such blurry images are perceptually degraded, we aim to reconstruct sharp and high-fidelity SR images with a diffusion model. Our method improves the efficiency of the diffusion model with a stochastic sampler based on a high-order ODE solver, as well as one-step diffusion via knowledge distillation. Our experimental results demonstrate that our method reduces the runtime to 1.6% of its baseline while maintaining SR quality as measured by image distortion and perceptual metrics.
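A high-order ODE sampler of the kind mentioned above can be sketched with a second-order Heun (predictor-corrector) step on the probability-flow ODE. This is a generic illustration, not the paper's sampler: the toy denoiser, the sigma schedule, and the purely deterministic integration (the stochastic noise-injection component is omitted for brevity) are all assumptions.

```python
import numpy as np

def denoiser(x, sigma):
    # Hypothetical pretrained denoiser D(x; sigma); a toy shrinkage stand-in
    # for the SR diffusion network.
    return x / (1.0 + sigma ** 2)

def heun_step(x, sigma, sigma_next):
    # Probability-flow ODE: dx/dsigma = (x - D(x; sigma)) / sigma.
    d = (x - denoiser(x, sigma)) / sigma               # first-order slope
    x_euler = x + (sigma_next - sigma) * d             # Euler predictor
    if sigma_next > 0:
        # Second-order (Heun) correction: average the slopes at both ends.
        d_next = (x_euler - denoiser(x_euler, sigma_next)) / sigma_next
        return x + (sigma_next - sigma) * 0.5 * (d + d_next)
    return x_euler                                     # final step to sigma=0

# Integrate from high noise down to zero over a few steps.
rng = np.random.default_rng(1)
sigmas = [10.0, 5.0, 2.0, 0.5, 0.0]                    # assumed schedule
x = sigmas[0] * rng.standard_normal(16)                # start from pure noise
for s, s_next in zip(sigmas[:-1], sigmas[1:]):
    x = heun_step(x, s, s_next)
```

The second-order correction lets the sampler take far fewer, larger steps than a first-order Euler integrator at comparable accuracy, which is one of the two efficiency levers (alongside distillation) described above.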