Blind Video Super-Resolution based on Implicit Kernels

📅 2025-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing blind video super-resolution (BVSR) methods often assume spatially invariant blur kernels and neglect spatiotemporal degradation variations, limiting reconstruction fidelity. To address this, we propose a novel framework that explicitly models non-uniform spatiotemporal degradation. Our key contributions are: (1) the first use of implicit neural representations to construct a multi-scale kernel dictionary, enabling dynamic, frame-wise modeling of spatially varying blur kernels; and (2) a recurrent Transformer architecture that jointly performs inter-frame correction and feature alignment via adaptive filtering. This enables more accurate blind degradation estimation and reconstruction optimization under unknown degradation conditions. Extensive experiments demonstrate state-of-the-art performance on three benchmark datasets—REDS, Vimeo-90K, and BSDS—outperforming prior methods including FMA-Net, with up to 0.59 dB PSNR gain. The source code is publicly available.

📝 Abstract
Blind video super-resolution (BVSR) is a low-level vision task which aims to generate high-resolution videos from low-resolution counterparts in unknown degradation scenarios. Existing approaches typically predict blur kernels that are spatially invariant in each video frame or even the entire video. These methods do not consider potential spatio-temporal varying degradations in videos, resulting in suboptimal BVSR performance. In this context, we propose a novel BVSR model based on Implicit Kernels, BVSR-IK, which constructs a multi-scale kernel dictionary parameterized by implicit neural representations. It also employs a newly designed recurrent Transformer to predict the coefficient weights for accurate filtering in both frame correction and feature alignment. Experimental results have demonstrated the effectiveness of the proposed BVSR-IK, when compared with four state-of-the-art BVSR models on three commonly used datasets, with BVSR-IK outperforming the second best approach, FMA-Net, by up to 0.59 dB in PSNR. Source code will be available at https://github.com.
Problem

Research questions and friction points this paper is trying to address.

Enhance low-resolution videos in unknown degradation scenarios.
Address spatio-temporal varying degradations in video frames.
Improve performance over existing blind video super-resolution methods.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit neural representations for kernel dictionary
Recurrent Transformer for coefficient prediction
Multi-scale kernel dictionary for BVSR
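The core mechanism behind these contributions — a kernel dictionary whose atoms come from an implicit (coordinate-based) network, combined per pixel by predicted coefficient weights into spatially varying blur kernels — can be sketched as follows. This is a toy numpy illustration, not the authors' implementation: the coordinate-MLP weights are random and untrained, the coefficients are random softmax placeholders standing in for the paper's recurrent Transformer, and all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5          # kernel size (K x K)
D = 8          # number of dictionary atoms
H, W = 4, 4    # toy spatial grid of a frame

# --- Implicit kernel dictionary (toy stand-in) ---
# Each atom is produced by a tiny coordinate MLP mapping
# (y, x, atom_id) -> kernel value; in BVSR-IK this role is played by
# learned implicit neural representations.
W1 = rng.normal(size=(3, 16))
b1 = rng.normal(size=16)
W2 = rng.normal(size=(16, 1))

def atom(d):
    ys, xs = np.meshgrid(np.linspace(-1, 1, K), np.linspace(-1, 1, K),
                         indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel(),
                       np.full(K * K, d / D)], axis=1)
    h = np.tanh(coords @ W1 + b1)
    k = (h @ W2).reshape(K, K)
    k = np.exp(k)           # enforce positivity
    return k / k.sum()      # normalize: each atom sums to 1

dictionary = np.stack([atom(d) for d in range(D)])   # (D, K, K)

# --- Per-pixel coefficient weights (in the paper, predicted by a
# recurrent Transformer; here random logits + softmax) ---
logits = rng.normal(size=(H, W, D))
coef = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# Spatially varying kernel at each pixel = weighted sum of atoms.
kernels = np.einsum("hwd,dkj->hwkj", coef, dictionary)  # (H, W, K, K)

# A convex combination of normalized atoms is still normalized, so each
# per-pixel kernel remains a valid blur kernel.
print(kernels.shape)
print(np.allclose(kernels.sum(axis=(2, 3)), 1.0))
```

Because the dictionary is shared across the frame while only the lightweight coefficients vary per pixel, this factorization keeps spatially varying kernel estimation tractable compared with predicting a full K×K kernel at every location.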
Qiang Zhu
University of Electronic Science and Technology of China
Yuxuan Jiang
University of Bristol
Shuyuan Zhu
Associate Professor, University of Electronic Science and Technology of China
Signal Processing, Image/Video Compression
Fan Zhang
University of Bristol
David R. Bull
University of Bristol
Bing Zeng
University of Electronic Science and Technology of China
Image and video processing