Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction

📅 2025-04-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional SLAM methods, relying on the static-scene assumption, struggle to achieve both accurate camera pose estimation and complete 3D reconstruction in highly dynamic videos: existing approaches either discard dynamic regions or model motion separately, leading to reconstruction incompleteness and motion inconsistency. This paper proposes BA-Track, the first framework that jointly optimizes a learned 3D point tracker with classical bundle adjustment (BA). A motion decomposition mechanism explicitly decouples camera ego-motion from dynamic object motion, enabling BA to safely operate on all scene points. Furthermore, a lightweight depth-consistency post-processing module, guided by a scale map, ensures pose accuracy, dense reconstruction completeness, and metric-scale consistency within a unified pipeline. Evaluated on challenging dynamic datasets, BA-Track reduces average rotation and translation errors by 38% and 41%, respectively, achieving high-fidelity, temporally coherent, and scale-consistent joint reconstruction of static and dynamic elements.

Technology Category

Application Category

📝 Abstract
Traditional SLAM systems, which rely on bundle adjustment, struggle with highly dynamic scenes commonly found in casual videos. Such videos entangle the motion of dynamic elements, undermining the assumption of static environments required by traditional systems. Existing techniques either filter out dynamic elements or model their motion independently. However, the former often results in incomplete reconstructions, whereas the latter can lead to inconsistent motion estimates. Taking a novel approach, this work leverages a 3D point tracker to separate the camera-induced motion from the observed motion of dynamic objects. By considering only the camera-induced component, bundle adjustment can operate reliably on all scene elements as a result. We further ensure depth consistency across video frames with lightweight post-processing based on scale maps. Our framework combines the core of traditional SLAM -- bundle adjustment -- with a robust learning-based 3D tracker front-end. Integrating motion decomposition, bundle adjustment and depth refinement, our unified framework, BA-Track, accurately tracks the camera motion and produces temporally coherent and scale-consistent dense reconstructions, accommodating both static and dynamic elements. Our experiments on challenging datasets reveal significant improvements in camera pose estimation and 3D reconstruction accuracy.
Problem

Research questions and friction points this paper is trying to address.

Handling dynamic elements in SLAM systems
Separating camera and object motion accurately
Ensuring consistent depth across video frames
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses 3D point tracker for motion separation
Integrates bundle adjustment with learning-based tracker
Ensures depth consistency via lightweight scale maps
🔎 Similar Papers
2024-08-29arXiv.orgCitations: 30