AdaGaR: Adaptive Gabor Representation for Dynamic Scene Reconstruction

📅 2026-01-02

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This work addresses monocular dynamic 3D scene reconstruction, where existing explicit methods face a trade-off between preserving high-frequency detail and maintaining temporal coherence: single Gaussian primitives act as low-pass filters, standard Gabor functions introduce energy instability, and missing temporal continuity constraints cause interpolation artifacts. To overcome these limitations, the authors propose AdaGaR, a unified framework built on an adaptive Gabor representation that extends Gaussians with learnable frequency weights and an energy compensation mechanism. Temporal smoothness is enforced through cubic Hermite spline interpolation combined with temporal curvature regularization. An adaptive initialization strategy that combines depth estimation, point tracking, and foreground masks stabilizes the point cloud distribution early in training. On TAP-Vid DAVIS, AdaGaR reports state-of-the-art results (PSNR 35.49, SSIM 0.9433, LPIPS 0.0723) and generalizes across frame interpolation, depth consistency, video editing, and stereo view synthesis.
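To make the idea of a frequency-weighted Gabor primitive concrete, here is a minimal 1D sketch in Python with NumPy. It is not the paper's formulation: the function name `adaptive_gabor`, the averaging over cosine carriers, and the compensation term `1 - mean(weights)` are all assumptions chosen for illustration. The sketch only shows the general pattern of a Gaussian envelope modulated by weighted cosines, with an offset that makes the primitive fall back to a plain Gaussian when every frequency weight is zero.

```python
import numpy as np

def adaptive_gabor(x, sigma, freqs, weights):
    """Gaussian envelope modulated by weighted cosine carriers (illustrative).

    x       : sample positions, scalar or 1D array
    sigma   : standard deviation of the Gaussian envelope
    freqs   : carrier frequencies, shape (N,)
    weights : per-frequency weights in [0, 1], shape (N,) (learnable in a real model)
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    freqs = np.asarray(freqs, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Plain Gaussian envelope: this alone behaves as a low-pass primitive.
    envelope = np.exp(-0.5 * (x / sigma) ** 2)

    # Mean of the weighted cosine carriers, one value per sample position.
    carrier = (weights[None, :] * np.cos(freqs[None, :] * x[:, None])).mean(axis=1)

    # Assumed compensation term: keeps the peak value at x = 0 equal to 1
    # and reduces the primitive to a pure Gaussian when all weights are 0.
    compensation = 1.0 - weights.mean()

    return envelope * (compensation + carrier)
```

With all weights at zero, `adaptive_gabor(x, 1.0, [2.0, 4.0], [0.0, 0.0])` matches `np.exp(-0.5 * x**2)`; raising a weight adds oscillation inside the same envelope without changing the value at the center.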

📝 Abstract

Reconstructing dynamic 3D scenes from monocular videos requires simultaneously capturing high-frequency appearance details and temporally continuous motion. Existing methods using single Gaussian primitives are limited by their low-pass filtering nature, while standard Gabor functions introduce energy instability. Moreover, the lack of temporal continuity constraints often leads to motion artifacts during interpolation. We propose AdaGaR, a unified framework addressing both frequency adaptivity and temporal continuity in explicit dynamic scene modeling. We introduce Adaptive Gabor Representation, extending Gaussians through learnable frequency weights and adaptive energy compensation to balance detail capture and stability. For temporal continuity, we employ Cubic Hermite Splines with Temporal Curvature Regularization to ensure smooth motion evolution. An Adaptive Initialization mechanism combining depth estimation, point tracking, and foreground masks establishes stable point cloud distributions in early training. Experiments on TAP-Vid DAVIS demonstrate state-of-the-art performance (PSNR 35.49, SSIM 0.9433, LPIPS 0.0723) and strong generalization across frame interpolation, depth consistency, video editing, and stereo view synthesis. Project page: https://jiewenchan.github.io/AdaGaR/
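The temporal component can be illustrated with a short Python sketch of standard cubic Hermite interpolation between keyframes, plus a simple curvature penalty. The Hermite basis functions are the textbook ones; everything else is an assumption for illustration: the function names `hermite_interpolate` and `curvature_penalty` are hypothetical, tangents are passed in explicitly, and the penalty is a mean squared second finite difference over uniformly sampled positions, which may differ from the regularizer the paper actually uses.

```python
import numpy as np

def hermite_interpolate(times, values, tangents, t):
    """Evaluate a cubic Hermite spline at time t.

    times    : increasing keyframe times, shape (K,)
    values   : keyframe positions, shape (K,) or (K, D)
    tangents : keyframe velocities, same shape as values
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    tangents = np.asarray(tangents, dtype=float)

    # Locate the segment [times[k], times[k + 1]] that contains t.
    k = int(np.clip(np.searchsorted(times, t, side="right") - 1, 0, len(times) - 2))
    dt = times[k + 1] - times[k]
    s = (t - times[k]) / dt  # normalized position inside the segment

    # Standard cubic Hermite basis functions.
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2

    return (h00 * values[k] + h10 * dt * tangents[k]
            + h01 * values[k + 1] + h11 * dt * tangents[k + 1])

def curvature_penalty(samples):
    """Mean squared second finite difference of uniformly sampled positions."""
    samples = np.asarray(samples, dtype=float)
    second_diff = samples[2:] - 2 * samples[1:-1] + samples[:-2]
    return float(np.mean(second_diff ** 2))
```

The spline passes through every keyframe and has a continuous first derivative, so in-between frames vary smoothly. A trajectory with constant velocity gives a penalty of approximately zero, while abrupt changes in direction increase it, which is the behavior a curvature regularizer is meant to discourage.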

Problem

Research questions and friction points this paper is trying to address.

dynamic scene reconstruction

monocular video

temporal continuity

high-frequency details

motion artifacts

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Gabor Representation

Temporal Continuity

Cubic Hermite Splines