GS-STVSR: Ultra-Efficient Continuous Spatio-Temporal Video Super-Resolution via 2D Gaussian Splatting

📅 2026-04-20

🤖 AI Summary
This work addresses the computational inefficiency that dense pixel-wise grid queries cause in continuous space-time video super-resolution. The authors propose an efficient framework based on 2D Gaussian splatting that supports arbitrary spatial and temporal upscaling factors. By modeling motion continuously to drive the spatiotemporal evolution of Gaussian kernels, the method avoids dense grid queries. Key components include a lightweight intermediate fitting strategy, an optical flow-guided motion module, a covariance resampling alignment module, and an adaptive offset window for large-scale motion. The approach achieves state-of-the-art quality on the Vid4, GoPro, and Adobe240 benchmarks, with nearly constant inference time at conventional temporal scales (×2 to ×8) and over 3× speedup at the extreme scale of ×32.

📝 Abstract
Continuous Spatio-Temporal Video Super-Resolution (C-STVSR) aims to simultaneously enhance the spatial resolution and frame rate of videos by arbitrary scale factors, offering greater flexibility than fixed-scale methods that are constrained by predefined upsampling ratios. In recent years, methods based on Implicit Neural Representations (INR) have made significant progress in C-STVSR by learning continuous mappings from spatio-temporal coordinates to pixel values. However, these methods fundamentally rely on dense pixel-wise grid queries, causing computational cost to scale linearly with the number of interpolated frames and severely limiting inference efficiency. We propose GS-STVSR, an ultra-efficient C-STVSR framework based on 2D Gaussian Splatting (2D-GS) that drives the spatiotemporal evolution of Gaussian kernels through continuous motion modeling, bypassing dense grid queries entirely. We exploit the strong temporal stability of covariance parameters for lightweight intermediate fitting, design an optical flow-guided motion module to derive Gaussian position and color at arbitrary time steps, introduce a covariance resampling alignment module to prevent covariance drift, and propose an adaptive offset window for large-scale motion. Extensive experiments on Vid4, GoPro, and Adobe240 show that GS-STVSR achieves state-of-the-art quality across all benchmarks. Moreover, its inference time remains nearly constant at conventional temporal scales (×2 to ×8), and it delivers over 3× speedup at the extreme scale of ×32, demonstrating strong practical applicability.
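As a rough illustration of why a Gaussian representation can be rendered at any output resolution and time step, the toy sketch below splats 2D Gaussians whose centres move linearly in time while their covariances stay fixed. It is not the GS-STVSR implementation: the linear motion model, the dictionary keys, and the function name `render_gaussians` are assumptions made for this example, and the optical flow-guided motion module, covariance resampling alignment, and adaptive offset window are not modelled.

```python
import math


def render_gaussians(gaussians, t, height, width):
    """Splat 2D Gaussians onto a height x width grid at time t in [0, 1].

    Each Gaussian is a dict with hypothetical keys (assumptions for this sketch):
      "mean":     (x, y) centre at t = 0, in normalized [0, 1] coordinates
      "velocity": (vx, vy) displacement over t in [0, 1] (linear motion assumed)
      "cov":      (sxx, sxy, syy) covariance entries, held fixed over time
      "color":    scalar intensity
    """
    image = [[0.0] * width for _ in range(height)]
    for g in gaussians:
        # Move the kernel centre along its assumed linear motion path.
        mx = g["mean"][0] + t * g["velocity"][0]
        my = g["mean"][1] + t * g["velocity"][1]
        # Invert the 2x2 covariance matrix once per kernel.
        sxx, sxy, syy = g["cov"]
        det = sxx * syy - sxy * sxy
        ixx, ixy, iyy = syy / det, -sxy / det, sxx / det
        for row in range(height):
            y = (row + 0.5) / height  # pixel centre in normalized coordinates
            for col in range(width):
                x = (col + 0.5) / width
                dx, dy = x - mx, y - my
                # Squared Mahalanobis distance from the pixel to the kernel centre.
                q = ixx * dx * dx + 2.0 * ixy * dx * dy + iyy * dy * dy
                image[row][col] += g["color"] * math.exp(-0.5 * q)
    return image


# One kernel that drifts from x = 0.25 to x = 0.75 over the time interval.
kernels = [{"mean": (0.25, 0.5), "velocity": (0.5, 0.0),
            "cov": (0.01, 0.0, 0.01), "color": 1.0}]
low = render_gaussians(kernels, t=0.0, height=8, width=8)
high = render_gaussians(kernels, t=0.5, height=32, width=32)
```

The same kernel list is rendered at 8×8 for t = 0 and at 32×32 for t = 0.5; only the output grid and the time value change. This toy version evaluates every pixel for every kernel, whereas practical splatting rasterizers typically restrict each kernel to a local footprint.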
Problem

Research questions and friction points this paper is trying to address.

Continuous Spatio-Temporal Video Super-Resolution
Implicit Neural Representations
Computational Efficiency
Arbitrary-Scale Upsampling
Dense Grid Queries
Innovation

Methods, ideas, or system contributions that make the work stand out.

2D Gaussian Splatting
Continuous Spatio-Temporal Video Super-Resolution
Implicit Neural Representations
Optical Flow-Guided Motion
Covariance Alignment
Mingyu Shi
University of Science and Technology of China
Xin Di
University of Science and Technology of China
Computer vision, Low-level vision, Super-resolution
Long Peng
China Electric Power Research Institute
LCC-HVDC and VSC-HVDC Transmission Technologies
Boxiang Cao
University of Science and Technology of China
Anran Wu
University of Science and Technology of China
Zhanfeng Feng
University of Science and Technology of China
Jiaming Guo
Institute of Computing Technology, Chinese Academy of Sciences
Artificial intelligence, Reinforcement Learning
Renjing Pei
Huawei Noah’s Ark Lab
Xueyang Fu
University of Science and Technology of China
Yang Cao
University of Science and Technology of China
computer vision, image processing
Zhengjun Zha
University of Science and Technology of China