FCA2: Frame Compression-Aware Autoencoder for Modular and Fast Compressed Video Super-Resolution

📅 2025-06-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper targets key bottlenecks in compressed video super-resolution (CVSR) for high-frame-rate videos: small inter-frame variations, slow inference, complex training, and reliance on auxiliary information. It proposes a compression-driven dimensionality reduction paradigm built on a modular frame compression-aware autoencoder that combines hyperspectral image prior modeling, compression-domain feature disentanglement, and a lightweight spatiotemporal encoder, and that can be trained end to end without auxiliary information. The method reduces inference latency, models temporal information robustly, and plugs into existing VSR frameworks. Experiments show that it matches or surpasses state-of-the-art performance while generalizing well across compression settings and deployment scenarios. Overall, it offers a CVSR approach that balances computational efficiency, reconstruction fidelity, and robustness.

📝 Abstract
State-of-the-art (SOTA) compressed video super-resolution (CVSR) models face persistent challenges, including prolonged inference time, complex training pipelines, and reliance on auxiliary information. As video frame rates continue to increase, the diminishing inter-frame differences further expose the limitations of traditional frame-to-frame information exploitation methods, which are inadequate for addressing current video super-resolution (VSR) demands. To overcome these challenges, we propose an efficient and scalable solution inspired by the structural and statistical similarities between hyperspectral images (HSI) and video data. Our approach introduces a compression-driven dimensionality reduction strategy that reduces computational complexity, accelerates inference, and enhances the extraction of temporal information across frames. The proposed modular architecture is designed for seamless integration with existing VSR frameworks, ensuring strong adaptability and transferability across diverse applications. Experimental results demonstrate that our method achieves performance on par with, or surpassing, the current SOTA models, while significantly reducing inference time. By addressing key bottlenecks in CVSR, our work offers a practical and efficient pathway for advancing VSR technology. Our code will be publicly available at https://github.com/handsomewzy/FCA2.
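The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea behind dimensionality reduction along the frame axis, in the way spectral bands of a hyperspectral image are often reduced. It is not the authors' method: the fixed averaging encoder and repeating decoder stand in for a learned autoencoder, and the names `encode`, `decode`, `max_abs_error`, and `group` are hypothetical.

```python
# Hypothetical sketch: fold T frames into K < T latent frames with a fixed
# linear map along the frame axis, then unfold them again.
# A learned autoencoder would replace both fixed maps.

def encode(frames, group):
    """Average each run of `group` consecutive frames into one latent frame."""
    assert len(frames) % group == 0, "frame count must be divisible by group"
    latents = []
    for start in range(0, len(frames), group):
        chunk = frames[start:start + group]
        height, width = len(chunk[0]), len(chunk[0][0])
        latents.append([
            [sum(f[y][x] for f in chunk) / group for x in range(width)]
            for y in range(height)
        ])
    return latents


def decode(latents, group):
    """Repeat each latent frame `group` times to restore the frame count."""
    return [[row[:] for row in z] for z in latents for _ in range(group)]


def max_abs_error(a, b):
    """Largest per-pixel absolute difference between two frame stacks."""
    return max(
        abs(pa - pb)
        for fa, fb in zip(a, b)
        for ra, rb in zip(fa, fb)
        for pa, pb in zip(ra, rb)
    )


# Four 2x2 frames whose pixels drift by 0.01 per frame (small inter-frame change).
base = [[0.2, 0.4], [0.6, 0.8]]
frames = [[[p + 0.01 * t for p in row] for row in base] for t in range(4)]

latents = encode(frames, group=2)      # 4 frames -> 2 latent frames
restored = decode(latents, group=2)    # 2 latent frames -> 4 frames
error = max_abs_error(frames, restored)
```

With a drift of 0.01 per frame, the worst-case reconstruction error is about half of that drift, which illustrates why small inter-frame differences make the frame axis a reasonable candidate for compression before any super-resolution network runs.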
Problem

Research questions and friction points this paper is trying to address.

Reduce inference time in compressed video super-resolution
Simplify complex training pipelines for CVSR models
Enhance temporal information extraction across video frames
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compression-driven dimensionality reduction strategy
Modular architecture for integration
Enhanced temporal information extraction
Zhaoyang Wang
University of North Carolina at Chapel Hill
NLP, LLM Alignment, LLM Reasoning
Jie Li
State Key Laboratory of Integrated Services Networks, School of Electronic Engineering, Xidian University, Xi’an, Shaanxi 710071, China
Wen Lu
State Key Laboratory of Integrated Services Networks, School of Electronic Engineering, Xidian University, Xi’an, Shaanxi 710071, China
Lihuo He
Professor, Xidian University
Image/Video Quality Assessment, Visual Perception
Maoguo Gong
Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi’an 710071, China, and also affiliated with the College of Mathematical Science, Inner Mongolia Normal University, Hohhot 010028, China
Xinbo Gao
School of Electronic Engineering, Xidian University, Xi’an 710071, China, and also with the Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China