🤖 AI Summary
To address inefficient mapping and degraded novel-view synthesis quality in monocular SLAM—caused by accumulated drift and redundant point clouds in indoor scenes—this paper proposes a subgraph-based Gaussian fusion SLAM framework. Methodologically, it constructs locally geometrically consistent subgraphs via lightweight modules and unifies their integration using a globally differentiable Gaussian representation; drift is suppressed and tracking robustness enhanced through cross-view geometric consistency optimization and bidirectional pose-map feedback. Contributions include the first joint optimization of local reconstruction accuracy and global geometric consistency. Evaluated on real-world datasets, the method achieves over 12% improvement in tracking accuracy, yields finer-grained 3D reconstructions, reduces memory consumption by 37%, and enables high-fidelity real-time rendering. This provides an efficient, compact, and geometrically reliable solution for dense RGB mapping and neural rendering.
📝 Abstract
Recent advances in dense 3D reconstruction enable the accurate capture of local geometry; however, integrating them into SLAM is challenging due to drift and redundant point maps, which limit efficiency and downstream tasks, such as novel view synthesis. To address these issues, we propose SING3R-SLAM, a globally consistent and compact Gaussian-based dense RGB SLAM framework. The key idea is to combine locally consistent 3D reconstructions with a unified global Gaussian representation that jointly refines scene geometry and camera poses, enabling efficient and versatile 3D mapping for multiple downstream applications. SING3R-SLAM first builds locally consistent submaps through our lightweight tracking and reconstruction module, and then progressively aligns and fuses them into a global Gaussian map that enforces cross-view geometric consistency. This global map, in turn, provides feedback to correct local drift and enhance the robustness of tracking. Extensive experiments demonstrate that SING3R-SLAM achieves state-of-the-art tracking, 3D reconstruction, and novel view rendering, resulting in over 12% improvement in tracking and producing finer, more detailed geometry, all while maintaining a compact and memory-efficient global representation on real-world datasets.