MOSAIC-GS: Monocular Scene Reconstruction via Advanced Initialization for Complex Dynamic Environments

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses monocular dynamic scene reconstruction, where the absence of multi-view constraints makes it difficult to achieve both high geometric accuracy and temporal consistency. The authors propose an explicit initialization strategy that combines multiple geometric cues (depth, optical flow, motion segmentation, and point tracking) with rigidity-based motion constraints to estimate preliminary 3D scene dynamics and separate the static background from dynamic objects. Recovering motion before photometric optimization reduces reliance on inferring it from visual appearance alone. Each dynamic Gaussian's trajectory is modeled with a parameter-efficient, time-dependent Poly-Fourier curve, which supports non-rigid deformation. The method achieves reconstruction quality comparable to state-of-the-art approaches on standard monocular dynamic scene benchmarks, with faster optimization and real-time rendering.
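
The summary does not say how the rigid motion constraints are applied. As a rough illustration only, the sketch below fits a single rigid transform between two sets of tracked 3D points (for example, tracked pixels lifted to 3D with depth) using the standard Kabsch algorithm, and reports per-point residuals that could flag non-rigid motion. The function names `fit_rigid_transform` and `rigidity_residual` are hypothetical, and the choice of Kabsch is an assumption, not the authors' stated method.

```python
import numpy as np


def fit_rigid_transform(src, dst):
    """Least-squares rigid fit (Kabsch): find R, t with dst ~= src @ R.T + t.

    src, dst: (N, 3) arrays of corresponding 3D points.
    Assumption: this stands in for the paper's unspecified rigidity constraint.
    """
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    cov = (src - src_mean).T @ (dst - dst_mean)
    u, _, vt = np.linalg.svd(cov)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    translation = dst_mean - rotation @ src_mean
    return rotation, translation


def rigidity_residual(src, dst, rotation, translation):
    """Per-point distance between the rigidly moved src points and dst."""
    return np.linalg.norm(src @ rotation.T + translation - dst, axis=1)
```

Points with a large residual under the fitted transform would be candidates for non-rigid treatment; any threshold for that decision would have to be chosen separately and is not given in the source.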

📝 Abstract
We present MOSAIC-GS, a novel, fully explicit, and computationally efficient approach for high-fidelity dynamic scene reconstruction from monocular videos using Gaussian Splatting. Monocular reconstruction is inherently ill-posed due to the lack of sufficient multiview constraints, making accurate recovery of object geometry and temporal coherence particularly challenging. To address this, we leverage multiple geometric cues, such as depth, optical flow, dynamic object segmentation, and point tracking. Combined with rigidity-based motion constraints, these cues allow us to estimate preliminary 3D scene dynamics during an initialization stage. Recovering scene dynamics prior to the photometric optimization reduces reliance on motion inference from visual appearance alone, which is often ambiguous in monocular settings. To enable compact representations, fast training, and real-time rendering while supporting non-rigid deformations, the scene is decomposed into static and dynamic components. Each Gaussian in the dynamic part of the scene is assigned a trajectory represented as a time-dependent Poly-Fourier curve for parameter-efficient motion encoding. We demonstrate that MOSAIC-GS achieves substantially faster optimization and rendering compared to existing methods, while maintaining reconstruction quality on par with state-of-the-art approaches across standard monocular dynamic scene benchmarks.
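
The abstract does not give the exact form of the Poly-Fourier curve. A minimal sketch, assuming each coordinate of a Gaussian's position is a low-degree polynomial in normalized time plus a few sine and cosine terms, might look like the following. The basis layout, the defaults `poly_degree=2` and `num_freqs=2`, and the function names are all assumptions made for illustration; the least-squares fit stands in for an initialization from tracked 3D points and is not taken from the paper.

```python
import numpy as np


def poly_fourier_basis(t, poly_degree=2, num_freqs=2):
    """Basis matrix of shape (T, B) for times t, assumed normalized to [0, 1].

    Columns: 1, t, ..., t^poly_degree, then sin(2*pi*l*t), cos(2*pi*l*t)
    for l = 1..num_freqs, so B = poly_degree + 1 + 2 * num_freqs.
    """
    t = np.atleast_1d(np.asarray(t, dtype=np.float64))
    columns = [t ** k for k in range(poly_degree + 1)]
    for freq in range(1, num_freqs + 1):
        columns.append(np.sin(2.0 * np.pi * freq * t))
        columns.append(np.cos(2.0 * np.pi * freq * t))
    return np.stack(columns, axis=1)


def eval_trajectory(coeffs, t, poly_degree=2, num_freqs=2):
    """Evaluate 3D positions (T, 3) from coefficients of shape (B, 3)."""
    return poly_fourier_basis(t, poly_degree, num_freqs) @ coeffs


def fit_trajectory(times, positions, poly_degree=2, num_freqs=2):
    """Least-squares fit of curve coefficients to a tracked 3D point path.

    times: (T,) normalized timestamps; positions: (T, 3) tracked positions.
    """
    basis = poly_fourier_basis(times, poly_degree, num_freqs)
    coeffs, *_ = np.linalg.lstsq(basis, positions, rcond=None)
    return coeffs
```

With these defaults a trajectory is stored as 7 coefficients per coordinate regardless of how many frames the video has, which is the sense in which such a curve is parameter-efficient compared with storing one position per frame.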
Problem

Research questions and friction points this paper is trying to address.

monocular reconstruction
dynamic scene
3D reconstruction
temporal coherence
ill-posed problem
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Splatting
Monocular Dynamic Reconstruction
Advanced Initialization
Poly-Fourier Trajectory
Explicit Scene Representation