🤖 AI Summary
To address core challenges in dynamic scene reconstruction from monocular RGB video (distortion of thin structures, depth inconsistency, floating artifacts, and motion-geometry incoherence), this paper proposes an object-aware Gaussian rasterization framework. Methodologically, it introduces three novel components: (1) a mask-guided object-level depth loss; (2) skeleton-based sampling with mask-driven re-identification; and (3) virtual-view depth supervision coupled with scaffold-projection modeling, which explicitly enforces consistency between 3D motion nodes and 2D trajectories while suppressing floating artifacts. The technical pipeline integrates video segmentation, epipolar-error-map optimization, and multi-source geometric supervision. On standard benchmarks, the approach consistently outperforms state-of-the-art methods, with significant gains in geometric accuracy, motion coherence, and texture fidelity. According to the authors, it is the first method to enable fully automatic, high-quality dynamic scene reconstruction from casually captured monocular videos.
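As a rough illustration of the first component, below is a minimal PyTorch-style sketch of what a mask-guided object-level depth loss could look like. The function name, the per-object scale-and-shift alignment, and the tensor layouts are assumptions for illustration, not the paper's actual formulation.

```python
import torch

def object_depth_loss(pred_depth, prior_depth, masks, eps=1e-6):
    """Per-object depth loss (hypothetical): compare rendered depth to the
    video-depth prior inside each segmentation mask, so thin structures
    carry as much weight as large background regions.

    pred_depth:  (H, W) depth rendered from the Gaussians
    prior_depth: (H, W) consistent-video-depth prior
    masks:       (K, H, W) boolean object masks from video segmentation
    """
    total, used = pred_depth.new_zeros(()), 0
    for m in masks:
        if m.sum() < 2:  # skip empty or degenerate masks
            continue
        d_pred, d_prior = pred_depth[m], prior_depth[m]
        # Monocular depth is ambiguous up to scale and shift, so align the
        # prior to the rendering inside each object before comparing
        # (an assumed alignment scheme, not necessarily the authors').
        s = (d_pred.std() + eps) / (d_prior.std() + eps)
        t = d_pred.mean() - s * d_prior.mean()
        total = total + (d_pred - (s * d_prior + t)).abs().mean()
        used += 1
    return total / max(used, 1)
```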
📝 Abstract
We introduce a fully automatic pipeline for dynamic scene reconstruction from casually captured monocular RGB videos. Rather than designing a new scene representation, we enhance the priors that drive Dynamic Gaussian Splatting. Video segmentation combined with epipolar-error maps yields object-level masks that closely follow thin structures; these masks (i) guide an object-level depth loss that sharpens the consistent video depth, and (ii) support skeleton-based sampling plus mask-guided re-identification to produce reliable, comprehensive 2-D tracks. Two additional objectives embed the refined priors in the reconstruction stage: a virtual-view depth loss removes floaters, and a scaffold-projection loss ties motion nodes to the tracks, preserving fine geometry and coherent motion. The resulting system surpasses previous monocular dynamic scene reconstruction methods and delivers visibly superior renderings.
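To make the scaffold-projection objective concrete, here is a hedged sketch that projects 3-D motion nodes through a standard pinhole camera and penalizes their distance to the matched 2-D tracks. The interface (`nodes_3d`, `tracks_2d`, `K`, `w2c`, `visible`) and the visibility handling are illustrative assumptions; the abstract does not specify these details.

```python
import torch

def scaffold_projection_loss(nodes_3d, tracks_2d, K, w2c, visible):
    """Hypothetical scaffold-projection loss: tie projected motion nodes
    to their 2-D tracks at one frame.

    nodes_3d:  (N, 3) motion-node positions at frame t (world space)
    tracks_2d: (N, 2) matched 2-D track positions at frame t (pixels)
    K:         (3, 3) camera intrinsics
    w2c:       (4, 4) world-to-camera extrinsics
    visible:   (N,) boolean track-visibility flags for frame t
    """
    # World -> camera -> pixel, using a standard pinhole model.
    pts_h = torch.cat([nodes_3d, torch.ones_like(nodes_3d[:, :1])], dim=1)
    cam = (w2c @ pts_h.T).T[:, :3]
    proj = (K @ cam.T).T
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)
    # Only supervise nodes whose 2-D track is visible in this frame.
    err = torch.norm(uv - tracks_2d, dim=-1)
    return err[visible].mean()
```

Summed over frames, a term of this shape would couple the 3-D motion scaffold to the refined 2-D tracks, which is the consistency the abstract describes.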