Spatio-Temporal Distortion Aware Omnidirectional Video Super-Resolution

📅 2024-10-15
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address spatial distortion induced by spherical projection and temporal flickering across frames in omnidirectional video (ODV) super-resolution, this paper proposes a Spatio-Temporal Distortion-Aware Network (STDAN). Methodologically, the authors design a distortion modulation module to explicitly model spherical geometric distortions; introduce a multi-frame consistency reconstruction mechanism to achieve accurate spatio-temporal alignment and feature fusion; and formulate a latitude- and saliency-adaptive loss function to prioritize texture recovery in regions of human visual attention. Extensive experiments on the newly constructed ODV-SR benchmark demonstrate that STDAN significantly outperforms state-of-the-art methods in both quantitative metrics (PSNR/SSIM) and visual consistency. Notably, it achieves superior detail reconstruction in high-latitude and visually salient regions. This work establishes a new paradigm for high-fidelity panoramic video reconstruction, advancing immersive VR/AR experiences.

📝 Abstract
Omnidirectional video (ODV) can provide an immersive experience and is widely utilized in the fields of virtual reality and augmented reality. However, restricted capturing devices and transmission bandwidth lead to low-resolution ODVs. Video super-resolution (VSR) methods have been proposed to enhance the resolution of videos, but ODV projection distortions are not well addressed when such methods are applied directly. To achieve better super-resolution reconstruction quality, we propose a novel Spatio-Temporal Distortion Aware Network (STDAN) oriented to ODV characteristics. Specifically, a spatio-temporal distortion modulation module is introduced to mitigate spatial ODV projection distortions and exploit temporal correlation via intra- and inter-frame alignment. Next, we design a multi-frame reconstruction and fusion mechanism to refine the consistency of reconstructed ODV frames. Furthermore, we incorporate latitude-saliency adaptive maps into the loss function to concentrate on important viewport regions with higher texture complexity and human viewing interest. In addition, we collect a new ODV-SR dataset covering various scenarios. Extensive experimental results demonstrate that the proposed STDAN achieves superior super-resolution performance on ODVs and outperforms state-of-the-art methods.
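The page does not give the paper's exact distortion model, but the spatial distortion it targets is the well-known stretching of equirectangular projection (ERP) frames toward the poles, commonly compensated by the cos-latitude weighting used in WS-PSNR. A minimal sketch of such a per-pixel weight map:

```python
import numpy as np

def erp_latitude_weights(height: int, width: int) -> np.ndarray:
    """Per-pixel weights for an equirectangular (ERP) frame.

    Rows near the equator get weight close to 1; rows near the poles
    approach 0, compensating for the horizontal stretching that ERP
    introduces at high latitudes (the cos-latitude weighting of WS-PSNR).
    """
    rows = np.arange(height)
    lat = (rows + 0.5 - height / 2) * np.pi / height  # latitude of each row
    w = np.cos(lat)                                   # stretch compensation
    return np.tile(w[:, None], (1, width))            # constant across a row

weights = erp_latitude_weights(4, 8)
```

Whether STDAN uses exactly this map or a learned variant inside its distortion modulation module is not stated here; the sketch only illustrates the geometry being compensated.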
Problem

Research questions and friction points this paper is trying to address.

Addresses low-resolution omnidirectional video (ODV) issues
Mitigates spatial projection distortions and temporal flickering
Enhances ODV super-resolution with practical viewing strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatially continuous distortion modulation module
Interlaced multi-frame reconstruction mechanism
Latitude-saliency adaptive weights training
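The latitude-saliency adaptive training weights described above could plausibly be realized as a weighted pixel loss. The combination rule below and the `saliency` input are assumptions for illustration, not the paper's definition:

```python
import numpy as np

def latitude_saliency_loss(sr: np.ndarray, hr: np.ndarray,
                           saliency: np.ndarray) -> float:
    """Weighted L1 loss over a single-channel ERP frame.

    Per-pixel weights combine cos-latitude ERP weights with a
    precomputed saliency map (values in [0, 1]), so that equatorial
    and visually salient regions dominate the objective. STDAN's
    actual combination is unspecified on this page; this sketch
    simply boosts latitude weights by the saliency map.
    """
    h, _ = sr.shape
    rows = np.arange(h)
    lat_w = np.cos((rows + 0.5 - h / 2) * np.pi / h)[:, None]  # (h, 1)
    weight = lat_w * (1.0 + saliency)  # hypothetical combination
    return float(np.sum(weight * np.abs(sr - hr)) / np.sum(weight))
```

Under this weighting, an error of equal magnitude costs more near the equator (or in a salient region) than near a pole, matching the stated goal of prioritizing regions of viewer interest.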
Hongyu An
School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China

Xinfeng Zhang
Fuxi AI Lab, NetEase Inc.
Vision-Language Models · Multimodal

Li Zhang
ByteDance Inc., San Diego, CA 92121 USA

Ruiqin Xiong
Peking University
video coding · image and video processing