🤖 AI Summary
This work proposes Dream-SLAM, a monocular active SLAM framework that addresses three limitations of existing approaches: dependence on the performance of the underlying SLAM module, short-sighted exploration policies, and inadequate handling of dynamic scenes. Dream-SLAM generates cross-spatio-temporal images and predicts semantically plausible structure to “imagine” the geometry and semantics of unobserved regions. The generated images are fused with real observations to improve pose estimation accuracy and 3D scene consistency, while the predicted structure is combined with the observed structure for long-horizon path planning, enabling farsighted exploration even in dynamic environments. Experiments on public and self-collected datasets show that Dream-SLAM outperforms state-of-the-art methods in localization accuracy, mapping quality, and exploration efficiency.
📝 Abstract
Beyond the core tasks of simultaneous localization and mapping (SLAM), active SLAM involves generating robot actions that enable effective and efficient exploration of unknown environments. However, existing active SLAM pipelines are limited by three main factors. First, they inherit the limitations of their underlying SLAM modules. Second, their motion planning strategies are typically shortsighted and lack long-term vision. Third, most approaches struggle to handle dynamic scenes. To address these limitations, we propose a novel monocular active SLAM method, Dream-SLAM, which is based on dreaming cross-spatio-temporal images and semantically plausible structures of partially observed dynamic environments. The generated cross-spatio-temporal images are fused with real observations to mitigate noise and data incompleteness, leading to more accurate camera pose estimation and a more coherent 3D scene representation. Furthermore, we integrate dreamed and observed scene structures to enable long-horizon planning, producing farsighted trajectories that promote efficient and thorough exploration. Extensive experiments on both public and self-collected datasets demonstrate that Dream-SLAM outperforms state-of-the-art methods in localization accuracy, mapping quality, and exploration efficiency. Source code will be publicly available upon paper acceptance.