Dynamic Visual SLAM using a General 3D Prior

📅 2025-12-07

🤖 AI Summary

Dynamic objects in natural environments degrade monocular SLAM by causing pose estimation drift and map reconstruction artifacts. To address this, the authors propose a robust monocular SLAM framework that combines feed-forward deep reconstruction with geometric patch-based online bundle adjustment. The approach uses a lightweight feed-forward network to segment and remove dynamic regions in real time, and uses the network's predicted depth to mitigate monocular scale ambiguity. A depth-geometric patch alignment mechanism explicitly enforces static structural consistency during optimization. The framework keeps the simplicity of a monocular system while suppressing the influence of dynamic objects on both trajectory estimation and mapping. Experiments on multiple dynamic-scene datasets report average pose accuracy improvements of 23% to 41% over state-of-the-art methods, along with better reconstruction completeness and system stability.

📝 Abstract

Reliable incremental estimation of camera poses and 3D reconstruction is key to enabling applications such as robotics, interactive visualization, and augmented reality. However, this task is particularly challenging in dynamic natural environments, where scene dynamics can severely degrade camera pose estimation accuracy. In this work, we propose a novel monocular visual SLAM system that robustly estimates camera poses in dynamic scenes. To this end, we leverage the complementary strengths of geometric patch-based online bundle adjustment and recent feed-forward reconstruction models. Specifically, we propose using a feed-forward reconstruction model to precisely filter out dynamic regions, while also using its depth prediction to enhance the robustness of the patch-based visual SLAM. By aligning the depth prediction with the patches estimated by bundle adjustment, we robustly handle the scale ambiguities inherent in applying the feed-forward reconstruction model batch-wise.
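The abstract does not spell out how the depth prediction is aligned with the bundle-adjustment patches. As a rough illustration only, one common way to resolve a per-batch scale ambiguity is to take the median ratio between the sparse patch depths and the predicted depths sampled at the same pixels. The function names, the median-ratio estimator, and the sample values below are all assumptions made for this sketch, not the paper's method.

```python
from statistics import median


def estimate_scale(pred_depths, patch_depths, min_depth=1e-6):
    """Return a robust scalar s such that s * pred_depth ~ patch_depth.

    pred_depths:  depths from the feed-forward model at the patch pixels
    patch_depths: depths of the same patches as estimated by bundle adjustment
    The median of per-patch ratios is an assumed estimator; it tolerates a
    minority of outliers (for example, patches on moving objects).
    """
    ratios = [
        ba / pred
        for pred, ba in zip(pred_depths, patch_depths)
        if pred > min_depth and ba > min_depth  # skip invalid depths
    ]
    if not ratios:
        raise ValueError("no valid depth pairs to align")
    return median(ratios)


def align_depths(pred_depths, scale):
    """Rescale predicted depths into the scale of the SLAM map."""
    return [scale * d for d in pred_depths]


# Hypothetical values: the last patch is an outlier.
pred = [1.0, 2.0, 4.0, 3.0]
ba = [2.0, 4.0, 8.0, 30.0]
scale = estimate_scale(pred, ba)
aligned = align_depths(pred, scale)
```

In this toy input the outlier ratio does not shift the estimate, which is the reason for preferring a median over a mean. A real system would likely re-estimate the scale for every batch processed by the feed-forward model.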

Problem

Research questions and friction points this paper is trying to address.

Robust camera pose estimation in dynamic scenes

Filtering dynamic regions using feed-forward reconstruction

Handling scale ambiguities in monocular visual SLAM

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a feed-forward reconstruction model to filter dynamic regions in monocular SLAM

Aligns the model's depth prediction with patches estimated by bundle adjustment

Combines geometric bundle adjustment with a feed-forward reconstruction model for robustness
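The dynamic-region filtering idea can be sketched as follows, assuming the feed-forward model yields a per-pixel boolean mask marking dynamic content and that each patch is represented by an integer pixel center. Both the mask format and the helper name `filter_static_patches` are hypothetical choices for illustration; the source does not describe the actual interface.

```python
def filter_static_patches(patch_centers, dynamic_mask):
    """Keep only patches whose center lies on a pixel marked static.

    patch_centers: list of (x, y) integer pixel coordinates
    dynamic_mask:  2D list indexed as mask[y][x]; True means dynamic
    Patches that fall outside the image are dropped as well.
    """
    height = len(dynamic_mask)
    width = len(dynamic_mask[0]) if height else 0
    kept = []
    for x, y in patch_centers:
        inside = 0 <= x < width and 0 <= y < height
        if inside and not dynamic_mask[y][x]:
            kept.append((x, y))
    return kept


# Hypothetical 2x3 mask: the right side of the image is dynamic.
mask = [
    [False, False, True],
    [False, True, True],
]
centers = [(0, 0), (2, 0), (1, 1), (0, 1), (5, 5)]
static_centers = filter_static_patches(centers, mask)
```

Only the surviving patches would then be passed to bundle adjustment, so that moving objects do not contribute residuals to the pose estimate.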
