SpatioTemporal Difference Network for Video Depth Super-Resolution

📅 2025-08-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the long-tailed distribution of spatially non-smooth and temporally dynamic regions in video depth super-resolution, this paper proposes the Spatio-Temporal Difference Network (STDNet). STDNet features a dual-branch architecture: a spatial difference branch dynamically aligns RGB and depth features within each frame for intra-frame fine-grained calibration, while a temporal difference branch models inter-frame motion dynamics and preferentially propagates temporal-discrepancy information to mitigate temporal long-tail effects. By jointly optimizing over multi-frame RGB and depth data, STDNet improves spatial detail reconstruction while preserving temporal consistency. Extensive experiments on multiple benchmark datasets show that STDNet outperforms state-of-the-art methods in both PSNR and SSIM, with particularly notable gains in long-tail scenarios such as object boundaries and motion-prone regions, where spatial fidelity and temporal coherence are hardest to maintain.
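The intra-frame RGB-D calibration idea can be illustrated with a minimal sketch. Note that the function name, the sigmoid gate, and the fusion rule below are illustrative assumptions, not the paper's actual learned spatial difference mechanism, which dynamically aligns features rather than applying a fixed formula.

```python
import numpy as np

def spatial_difference_gate(rgb_feat, depth_feat):
    """Illustrative intra-frame RGB-D aggregation via a spatial-difference gate.

    Hypothetical sketch: STDNet's real branch learns this alignment.
    Inputs are (C, H, W) feature maps of the same shape.
    """
    # Spatial difference representation: large where RGB and depth
    # features disagree (e.g. at object boundaries).
    diff = np.abs(rgb_feat - depth_feat)
    gate = 1.0 / (1.0 + np.exp(-diff))  # sigmoid gate in [0.5, 1)
    # Calibrate depth features with gated RGB guidance.
    return depth_feat + gate * (rgb_feat - depth_feat)

rgb = np.random.rand(8, 16, 16).astype(np.float32)
dep = np.random.rand(8, 16, 16).astype(np.float32)
out = spatial_difference_gate(rgb, dep)
print(out.shape)  # (8, 16, 16)
```

The gate opens wider exactly in the non-smooth regions the paper identifies as long-tailed, so RGB guidance is injected most strongly there.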

📝 Abstract
Depth super-resolution has achieved impressive performance, and the incorporation of multi-frame information further enhances reconstruction quality. Nevertheless, statistical analyses reveal that video depth super-resolution remains affected by pronounced long-tailed distributions, with the long-tailed effects primarily manifesting in spatial non-smooth regions and temporal variation zones. To address these challenges, we propose a novel SpatioTemporal Difference Network (STDNet) comprising two core branches: a spatial difference branch and a temporal difference branch. In the spatial difference branch, we introduce a spatial difference mechanism to mitigate the long-tailed issues in spatial non-smooth regions. This mechanism dynamically aligns RGB features with learned spatial difference representations, enabling intra-frame RGB-D aggregation for depth calibration. In the temporal difference branch, we further design a temporal difference strategy that preferentially propagates temporal variation information from adjacent RGB and depth frames to the current depth frame, leveraging temporal difference representations to achieve precise motion compensation in temporal long-tailed areas. Extensive experimental results across multiple datasets demonstrate the effectiveness of our STDNet, outperforming existing approaches.
Problem

Research questions and friction points this paper is trying to address.

Addresses long-tailed distributions in video depth super-resolution
Improves depth quality in spatial non-smooth regions
Enhances temporal motion compensation in variation zones
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatial difference mechanism for non-smooth regions
Temporal difference strategy for motion compensation
RGB-D aggregation for depth calibration
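The temporal difference strategy can likewise be sketched in a few lines. The weighting rule and function name here are assumptions for illustration only; the paper's branch learns motion compensation from temporal difference representations rather than using this fixed blend.

```python
import numpy as np

def temporal_difference_propagate(depth_prev, depth_cur, rgb_prev, rgb_cur):
    """Illustrative inter-frame propagation weighted by a temporal-difference map.

    Hypothetical sketch, not the paper's operator. Inputs are (H, W) maps.
    """
    # Temporal difference from the RGB stream: large values mark
    # motion-prone (temporally long-tailed) regions.
    t_diff = np.abs(rgb_cur - rgb_prev)
    weight = t_diff / (t_diff.max() + 1e-8)  # normalize to [0, 1]
    # Static regions (weight ~ 0) reuse the previous depth; dynamic
    # regions (weight ~ 1) favor the current estimate.
    return weight * depth_cur + (1.0 - weight) * depth_prev

rgb_prev = np.zeros((4, 4))
rgb_cur = np.zeros((4, 4))
rgb_cur[0, 0] = 1.0  # motion at a single pixel
fused = temporal_difference_propagate(np.ones((4, 4)), 2 * np.ones((4, 4)),
                                      rgb_prev, rgb_cur)
```

In this toy example the moving pixel takes the current depth while static pixels keep the previous one, which is the "preferential propagation of temporal variation information" the bullets above describe.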
👥 Authors
Zhengxue Wang, Nanjing University of Science and Technology (Depth/RGB image restoration)
Yuan Wu, PCA Lab, Nanjing University of Science and Technology
Xiang Li, Nankai University
Zhiqiang Yan, National University of Singapore (3D computer vision, depth perception, occupancy prediction)
Jian Yang, PCA Lab, Nanjing University of Science and Technology