🤖 AI Summary
Existing 3D Gaussian Splatting (3DGS) SLAM methods mostly target indoor environments and rely on active depth sensors. This work presents BGS-SLAM, described as the first binocular, RGB-only 3DGS SLAM system for large-scale outdoor urban scenes. It integrates 3D Gaussian modeling into a stereo-vision SLAM framework, uses a pre-trained stereo matching network to generate depth priors, and jointly optimizes camera poses, Gaussian parameters, and map geometry through differentiable Gaussian rendering and incremental map refinement. A multi-term loss that enforces both geometric consistency and photometric fidelity guides the optimization. Evaluated on multiple outdoor datasets, the system achieves better tracking accuracy and mapping performance than other 3DGS-based SLAM methods.
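To make the idea of a loss that balances photometric fidelity against geometric consistency concrete, here is a minimal sketch. It is not the loss used by the system: the function name `combined_loss`, the choice of plain L1 terms, and the weight `lambda_depth` are all assumptions made for illustration. The depth prior stands in for the output of a stereo matching network.

```python
import numpy as np

def combined_loss(rendered_rgb, target_rgb, rendered_depth, prior_depth,
                  lambda_depth=0.5):
    """Weighted sum of a photometric L1 term and a depth L1 term.

    lambda_depth is a hypothetical weight, not a value from the paper.
    """
    # Photometric term: mean absolute colour error over all pixels.
    photometric = np.abs(rendered_rgb - target_rgb).mean()

    # Geometric term: use only pixels with a valid (finite, positive) prior.
    valid = np.isfinite(prior_depth) & (prior_depth > 0)
    if valid.any():
        geometric = np.abs(rendered_depth[valid] - prior_depth[valid]).mean()
    else:
        geometric = 0.0

    return photometric + lambda_depth * geometric
```

In a real pipeline both terms would be computed on tensors from a differentiable renderer so that gradients reach the camera poses and Gaussian parameters; NumPy is used here only to show the arithmetic.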
📝 Abstract
3D Gaussian Splatting (3DGS) has recently gained popularity in SLAM applications due to its fast rendering and high-fidelity representation. However, existing 3DGS-SLAM systems have predominantly focused on indoor environments and relied on active depth sensors, leaving a gap for large-scale outdoor applications. We present BGS-SLAM, the first binocular 3D Gaussian Splatting SLAM system designed for outdoor scenarios. Our approach uses only RGB stereo pairs without requiring LiDAR or active sensors. BGS-SLAM leverages depth estimates from pre-trained deep stereo networks to guide 3D Gaussian optimization with a multi-loss strategy enhancing both geometric consistency and visual quality. Experiments on multiple datasets demonstrate that BGS-SLAM achieves superior tracking accuracy and mapping performance compared to other 3DGS-based solutions in complex outdoor environments.
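Stereo networks typically predict disparity, which has to be converted to metric depth before it can guide the Gaussians. The sketch below shows the standard conversion for a rectified stereo pair, depth = focal length × baseline / disparity. The function name, the `min_disparity` threshold, and the camera values in the usage note are illustrative assumptions, not details taken from BGS-SLAM.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, min_disparity=1e-3):
    """Convert a disparity map (pixels) to depth (metres) for a rectified pair.

    Pixels whose disparity is at or below min_disparity (a hypothetical
    threshold) are marked invalid with NaN instead of producing huge depths.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > min_disparity
    # Standard pinhole stereo relation: Z = f * B / d.
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth
```

For example, with an assumed focal length of 700 px and a baseline of 0.5 m, a disparity of 35 px corresponds to a depth of 10 m. Because depth is inversely proportional to disparity, small disparity errors on distant points translate into large depth errors, which is one reason outdoor scenes are harder than indoor ones for stereo-derived depth.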