GRS-SLAM3R: Real-Time Dense SLAM with Gated Recurrent State

📅 2025-09-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing end-to-end DUSt3R-style methods estimate local point clouds from image pairs alone. Without spatial memory or global consistency modeling, they cannot support incremental, globally consistent metric reconstruction. Method: The paper proposes an end-to-end dense SLAM framework based on gated recurrent states, described as the first of its kind: a latent state serves as spatial memory, and a Transformer-driven gated update module evolves that state sequentially. Combined with submap partitioning, local relative geometric constraints, and global registration, this enforces cross-frame geometric consistency. The method needs no scene priors or camera calibration and produces globally consistent, metric-scale dense point clouds directly from RGB sequences in real time. Contribution/Results: The approach reports higher reconstruction accuracy than state-of-the-art methods on multiple standard benchmarks while maintaining real-time performance.

📝 Abstract
DUSt3R-based end-to-end scene reconstruction has recently shown promising results in dense visual SLAM. However, most existing methods only use image pairs to estimate pointmaps, overlooking spatial memory and global consistency. To this end, we introduce GRS-SLAM3R, an end-to-end SLAM framework for dense scene reconstruction and pose estimation from RGB images without any prior knowledge of the scene or camera parameters. Unlike existing DUSt3R-based frameworks, which operate on all image pairs and predict per-pair pointmaps in local coordinate frames, our method accepts sequential input and incrementally estimates metric-scale point clouds in a global coordinate frame. To keep spatial correlation consistent across frames, we use a latent state as spatial memory and design a transformer-based gated update module that resets and updates this memory, continuously aggregating and tracking relevant 3D information across frames. Furthermore, we partition the scene into submaps, apply local alignment within each submap, and register all submaps into a common world frame using relative constraints, producing a globally consistent map. Experiments on various datasets show that our framework achieves superior reconstruction accuracy while maintaining real-time performance.
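The gated update of the spatial memory can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a transformer-based module, whereas the code below swaps in plain GRU-style linear gates so that it stays short and runnable. The class name `GatedStateUpdate`, the token shapes, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


class GatedStateUpdate:
    """GRU-style gated update of a latent state (hypothetical stand-in).

    Assumption: linear gates replace the transformer blocks that the
    abstract mentions; the weights are random, not trained.
    """

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(2 * dim)
        # Each gate sees the concatenation [state, frame_tokens].
        self.w_reset = rng.normal(0.0, scale, (2 * dim, dim))
        self.w_update = rng.normal(0.0, scale, (2 * dim, dim))
        self.w_cand = rng.normal(0.0, scale, (2 * dim, dim))

    def __call__(self, state, frame_tokens):
        joint = np.concatenate([state, frame_tokens], axis=-1)
        reset = sigmoid(joint @ self.w_reset)    # how much old memory to expose
        update = sigmoid(joint @ self.w_update)  # how much to overwrite
        cand_in = np.concatenate([reset * state, frame_tokens], axis=-1)
        candidate = np.tanh(cand_in @ self.w_cand)
        # Blend the previous memory with the candidate memory.
        return (1.0 - update) * state + update * candidate


# Hypothetical usage: 16 memory tokens of width 32, five incoming frames.
cell = GatedStateUpdate(dim=32)
state = np.zeros((16, 32))
frame_rng = np.random.default_rng(1)
for _ in range(5):
    frame_tokens = frame_rng.normal(size=(16, 32))
    state = cell(state, frame_tokens)
```

In this sketch the reset gate controls how much of the old memory feeds the candidate and the update gate blends old and new content, which mirrors the "reset and update" wording of the abstract without claiming to reproduce its architecture.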
Problem

Research questions and friction points this paper is trying to address.

Achieving globally consistent dense 3D reconstruction from sequential RGB images
Improving spatial memory and consistency in end-to-end SLAM systems
Maintaining real-time performance while enhancing reconstruction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential input processing for global metric-scale point clouds
Transformer-based gated module for spatial memory management
Submap partitioning with local alignment for global consistency
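Registering submaps into a common frame can be sketched in the same hedged spirit. The summary does not say how the relative constraints are solved, so the code below uses a standard technique as a stand-in: Kabsch least-squares rigid alignment on points shared by two submaps, followed by composition of relative transforms. The function names `rigid_align`, `compose`, and `rotation_z`, and the synthetic data, are assumptions for illustration and not the paper's method.

```python
import numpy as np


def rigid_align(src, dst):
    """Least-squares rotation R and translation t with dst ≈ src @ R.T + t.

    Standard Kabsch algorithm on corresponding (N, 3) point sets.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    cov = (src - src_c).T @ (dst - dst_c)
    u, _, vt = np.linalg.svd(cov)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    trans = dst_c - rot @ src_c
    return rot, trans


def compose(rot_ab, trans_ab, rot_bc, trans_bc):
    """Chain frame c -> b and frame b -> a into frame c -> a."""
    return rot_ab @ rot_bc, rot_ab @ trans_bc + trans_ab


def rotation_z(angle):
    """Rotation matrix about the z axis (helper for the synthetic demo)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Synthetic demo: submap 0 defines the world frame; submap 1 observes the
# same overlap points in its own local frame.
rng = np.random.default_rng(0)
overlap_world = rng.normal(size=(100, 3))
rot_true = rotation_z(0.3)
trans_true = np.array([0.5, -1.0, 2.0])
overlap_local = (overlap_world - trans_true) @ rot_true  # x_local = R^T (x_world - t)

# Relative constraint between the two submaps, estimated from the overlap.
rot_est, trans_est = rigid_align(overlap_local, overlap_world)

# Pose of submap 1 in the world frame (submap 0 sits at the identity).
rot_w1, trans_w1 = compose(np.eye(3), np.zeros(3), rot_est, trans_est)
```

With more submaps, the same `compose` call would chain each new relative transform onto the previous world pose. A full system would typically refine these poses jointly rather than rely on pure chaining, since chaining accumulates drift.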
Authors
Guole Shen
Institute of Medical Robotics, School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai 200240, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240
Tianchen Deng
Shanghai Jiao Tong University
Robotics, Computer Vision
Yanbo Wang
Institute of Medical Robotics, School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai 200240, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240
Yongtao Chen
Institute of Medical Robotics, School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai 200240, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240
Yilin Shen
AI Research Scientist
LLM, Multimodal AI, Agent, On-device AI
Jiuming Liu
Shanghai Jiao Tong University
Computer vision, Robotics, Machine learning, Autonomous driving, AIGC
Jingchuan Wang
Institute of Medical Robotics, School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai 200240, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240