Dream-SLAM: Dreaming the Unseen for Active SLAM in Dynamic Environments

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes Dream-SLAM, a novel active SLAM framework that addresses key limitations of existing approaches—namely, dependence on underlying SLAM performance, short-sighted exploration policies, and inadequate modeling of dynamic scenes. Dream-SLAM uniquely integrates cross-spatio-temporal image generation and semantic structure prediction to "imagine" plausible geometry and semantics in unobserved regions, which are then fused with real observations to improve pose estimation accuracy and 3D scene consistency. By combining generative modeling, monocular visual SLAM, and long-horizon path planning, the method enables robust, long-term exploration even in dynamic environments. Extensive experiments on both public and self-collected datasets demonstrate that Dream-SLAM significantly outperforms state-of-the-art methods in localization accuracy, mapping quality, and exploration efficiency.

📝 Abstract
In addition to the core tasks of simultaneous localization and mapping (SLAM), active SLAM involves generating robot actions that enable effective and efficient exploration of unknown environments. However, existing active SLAM pipelines are limited by three main factors. First, they inherit the restrictions of the underlying SLAM modules they may be using. Second, their motion planning strategies are typically shortsighted and lack long-term vision. Third, most approaches struggle to handle dynamic scenes. To address these limitations, we propose a novel monocular active SLAM method, Dream-SLAM, which is based on dreaming cross-spatio-temporal images and semantically plausible structures of partially observed dynamic environments. The generated cross-spatio-temporal images are fused with real observations to mitigate noise and data incompleteness, leading to more accurate camera pose estimation and a more coherent 3D scene representation. Furthermore, we integrate dreamed and observed scene structures to enable long-horizon planning, producing farsighted trajectories that promote efficient and thorough exploration. Extensive experiments on both public and self-collected datasets demonstrate that Dream-SLAM outperforms state-of-the-art methods in localization accuracy, mapping quality, and exploration efficiency. Source code will be publicly available upon paper acceptance.
Problem

Research questions and friction points this paper is trying to address.

Active SLAM
dynamic environments
motion planning
monocular SLAM
scene representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active SLAM
Cross-spatio-temporal dreaming
Dynamic environments
Long-horizon planning
Monocular vision
Xiangqi Meng
Thrust of Robotics and Autonomous Systems, Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Pengxu Hou
Thrust of Robotics and Autonomous Systems, Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Zhenjun Zhao
Universidad de Zaragoza
Computer Vision, 3D Vision, Robotics
Javier Civera
I3A, Universidad de Zaragoza, Spain
Computer Vision, Robotics, SLAM, Visual SLAM
Daniel Cremers
Technical University of Munich
Computer Vision, Machine Learning, Optimization, Robotics
Hesheng Wang
School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai 200240, China
Haoang Li
Assistant Professor, Hong Kong University of Science and Technology (Guangzhou)
Robotics, 3D Computer Vision