MGSO: Monocular Real-time Photometric SLAM with Efficient 3D Gaussian Splatting

📅 2024-09-19
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
Existing monocular 3D Gaussian Splatting (3DGS)-based SLAM methods struggle to simultaneously achieve real-time performance, high mapping accuracy, and memory efficiency on resource-constrained devices, primarily due to the lack of robust 3D Gaussian initialization and joint optimization under monocular conditions. This paper proposes the first photometric SLAM-guided monocular real-time dense SLAM framework. We introduce a novel structured point cloud-assisted Gaussian initialization mechanism driven by photometric consistency constraints, significantly reducing redundant Gaussians. The framework integrates differentiable 3DGS rendering, monocular visual odometry, and online map optimization, requiring only RGB input. Evaluated on Replica, TUM-RGBD, and EuRoC benchmarks, it surpasses state-of-the-art methods, achieving >30 FPS inference speed, 40% lower memory footprint, and stable deployment on commodity laptops.

๐Ÿ“ Abstract
Real-time SLAM with dense 3D mapping is computationally challenging, especially on resource-limited devices. The recent development of 3D Gaussian Splatting (3DGS) offers a promising approach for real-time dense 3D reconstruction. However, existing 3DGS-based SLAM systems struggle to balance hardware simplicity, speed, and map quality. Most systems excel in one or two of the aforementioned aspects but rarely achieve all. A key issue is the difficulty of initializing 3D Gaussians while concurrently conducting SLAM. To address these challenges, we present Monocular GSO (MGSO), a novel real-time SLAM system that integrates photometric SLAM with 3DGS. Photometric SLAM provides dense structured point clouds for 3DGS initialization, accelerating optimization and producing more efficient maps with fewer Gaussians. As a result, experiments show that our system generates reconstructions with a balance of quality, memory efficiency, and speed that outperforms the state-of-the-art. Furthermore, our system achieves all results using RGB inputs. We evaluate on the Replica, TUM-RGBD, and EuRoC datasets against current live dense reconstruction systems. Not only do we surpass contemporary systems, but experiments also show that we maintain our performance on laptop hardware, making it a practical solution for robotics, AR, and other real-time applications.
Problem

Research questions and friction points this paper is trying to address.

Balancing hardware simplicity, speed, and map quality in 3DGS-based SLAM
Initializing 3D Gaussians efficiently during concurrent SLAM operations
Achieving real-time dense 3D reconstruction with monocular RGB inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates photometric SLAM with 3D Gaussian Splatting
Uses dense structured point clouds for initialization
Achieves balance of quality, memory, and speed
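The point-cloud-driven initialization described above can be sketched as follows. This is an illustrative reconstruction assuming a typical 3DGS parameterization (means, per-axis scales, quaternion rotations, opacities), not MGSO's actual implementation; the function name, the k-nearest-neighbour scale heuristic, and the initial opacity value are all assumptions.

```python
import numpy as np

def init_gaussians_from_points(points, colors, k=3):
    """Initialize per-point 3D Gaussian parameters from a dense
    structured point cloud (illustrative sketch, not MGSO's code).

    points: (n, 3) array of SLAM point positions.
    colors: (n, 3) array of RGB values for those points.
    """
    n = len(points)
    # Means: place one Gaussian at each SLAM point, so the dense
    # point cloud directly seeds the map instead of random points.
    means = points.copy()
    # Scales: isotropic, set from the mean distance to the k nearest
    # neighbours so Gaussians roughly tile the local point density.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-distance
    knn = np.sqrt(np.sort(d2, axis=1)[:, :k]).mean(axis=1)
    scales = np.repeat(knn[:, None], 3, axis=1)
    # Rotations: identity quaternions (w, x, y, z), refined later.
    rotations = np.tile([1.0, 0.0, 0.0, 0.0], (n, 1))
    # Opacity: a moderate starting value for the optimizer to adjust.
    opacities = np.full((n, 1), 0.5)
    return means, scales, rotations, colors, opacities
```

Because the means and scales already match the observed surface geometry, the subsequent differentiable-rendering optimization starts close to a good solution, which is the mechanism the paper credits for faster convergence and fewer Gaussians.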
Yan Song Hu
Vision and Image Processing Lab at the Faculty of System Design Engineering, at the University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada
Nicolas Abboud
Vision and Image Processing Lab at the Faculty of System Design Engineering, at the University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada
Vision and Robotics Lab, Maroun Semaan Faculty of Engineering and Architecture, American University of Beirut, 1107 2020, Riad El Solh, Beirut, Lebanon
Muhammad Qasim Ali
Vision and Image Processing Lab at the Faculty of System Design Engineering, at the University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada
Adam Srebrnjak Yang
Vision and Image Processing Lab at the Faculty of System Design Engineering, at the University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada
Imad H. Elhajj
Vision and Robotics Lab, Maroun Semaan Faculty of Engineering and Architecture, American University of Beirut, 1107 2020, Riad El Solh, Beirut, Lebanon
Daniel Asmar
Vision and Robotics Lab, Maroun Semaan Faculty of Engineering and Architecture, American University of Beirut, 1107 2020, Riad El Solh, Beirut, Lebanon
Yuhao Chen
Vision and Image Processing Lab at the Faculty of System Design Engineering, at the University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada
John S. Zelek
Professor, Systems Design Engineering, University of Waterloo
computer vision, robotics