ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos

📅 2025-09-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing dynamic 3D reconstruction methods fall short on online operation, global map consistency, appearance detail, and support for both RGB and RGB-D input. SLAM systems typically discard dynamic regions or require RGB-D data, offline approaches scale poorly to long sequences, and feed-forward Transformer-based models lack global consistency and appearance detail. This paper proposes an online dynamic 3D reconstruction framework that separates static and dynamic scene content inside a SLAM system and renders with Gaussian splatting. A motion masking strategy keeps pose tracking robust, and a progressively adapted Motion Scaffolds graph reconstructs the dynamic parts. The reported results are novel-view renderings competitive with offline methods and tracking accuracy on par with state-of-the-art dynamic SLAM approaches.

📝 Abstract

Achieving truly practical dynamic 3D reconstruction requires online operation, global pose and map consistency, detailed appearance modeling, and the flexibility to handle both RGB and RGB-D inputs. However, existing SLAM methods typically either just remove the dynamic parts or require RGB-D input, offline methods do not scale to long video sequences, and current transformer-based feedforward methods lack global consistency and appearance details. To this end, we achieve online dynamic scene reconstruction by disentangling the static and dynamic parts within a SLAM system. The poses are tracked robustly with a novel motion masking strategy, and dynamic parts are reconstructed by leveraging a progressive adaptation of a Motion Scaffolds graph. Our method yields novel view renderings competitive with offline methods and achieves tracking on par with state-of-the-art dynamic SLAM methods.
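
The general idea behind masking motion during tracking can be illustrated with a small sketch. It is not the paper's implementation: the function name `masked_photometric_error`, the plain mean-absolute-error residual, and the toy 2x2 images are all assumptions made for illustration. The point it shows is that pixels flagged as dynamic are left out of the error a pose tracker would minimize, so moving objects do not bias the camera estimate.

```python
from typing import Sequence


def masked_photometric_error(
    rendered: Sequence[Sequence[float]],
    observed: Sequence[Sequence[float]],
    dynamic_mask: Sequence[Sequence[bool]],
) -> float:
    """Mean absolute intensity error over pixels NOT flagged as dynamic.

    Hypothetical helper: the residual form and mask layout are assumptions,
    not taken from the paper.
    """
    total = 0.0
    count = 0
    for r_row, o_row, m_row in zip(rendered, observed, dynamic_mask):
        for r, o, is_dynamic in zip(r_row, o_row, m_row):
            if is_dynamic:
                continue  # skip moving pixels so they cannot bias the pose
            total += abs(r - o)
            count += 1
    # With every pixel masked there is no static evidence; report zero error.
    return total / count if count else 0.0


# Toy data: one pixel differs because an object moved through it.
rendered = [[0.5, 0.5], [0.5, 0.5]]
observed = [[0.5, 0.9], [0.5, 0.5]]
motion_mask = [[False, True], [False, False]]
no_mask = [[False, False], [False, False]]

masked_error = masked_photometric_error(rendered, observed, motion_mask)
unmasked_error = masked_photometric_error(rendered, observed, no_mask)
```

In this toy case the masked error ignores the moving pixel entirely, while the unmasked error is inflated by it; a tracker minimizing the unmasked error would be pulled toward explaining object motion as camera motion.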

Problem

Research questions and friction points this paper is trying to address.

Achieving practical online dynamic 3D reconstruction from monocular videos

Maintaining global pose consistency and detailed appearance modeling

Handling both static and dynamic scene components simultaneously

Innovation

Methods, ideas, or system contributions that make the work stand out.

Motion masking strategy for robust pose tracking

Progressive adaptation of Motion Scaffolds graph

Disentangling static and dynamic parts in SLAM
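
As a rough illustration of what growing a motion graph progressively could look like in an online setting, the sketch below adds a node only when a newly observed dynamic point is not already covered by an existing node. Everything here is an assumption for illustration: the function `grow_nodes`, the fixed coverage `radius`, and the use of single 3D positions as nodes are hypothetical and are not the paper's Motion Scaffolds adaptation.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def grow_nodes(nodes: List[Point], new_points: List[Point], radius: float) -> List[Point]:
    """Return `nodes` extended with each new point farther than `radius` from all nodes.

    Hypothetical coverage rule: a point within `radius` of any existing node
    (including ones added earlier in the same call) is considered covered.
    """
    grown = list(nodes)  # copy so the caller's list is not modified
    for point in new_points:
        if all(math.dist(point, node) > radius for node in grown):
            grown.append(point)
    return grown


# Nodes from earlier frames, then dynamic points seen in a new frame.
existing = [(0.0, 0.0, 0.0)]
observed = [(0.1, 0.0, 0.0), (1.0, 0.0, 0.0), (1.05, 0.0, 0.0)]
updated = grow_nodes(existing, observed, radius=0.5)
```

Under this rule the graph grows only where new motion appears, which keeps its size tied to the extent of the dynamic content rather than to the length of the video.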

🔎 Similar Papers

Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes