🤖 AI Summary
This work addresses the high computational cost of loop closing in visual SLAM, which hinders real-time performance. It presents FastLoop, a GPU-accelerated loop closing module built on ORB-SLAM3 that combines task-level and data-level parallelism with a CUDA-based, GPU-accelerated pose graph optimization. The approach improves computational efficiency while maintaining the accuracy of the original system. On desktop and embedded platforms, respectively, the loop closing module achieves average speedups of 1.4× and 1.3× on the EuRoC dataset and 3.0× and 2.4× on the TUM-VI dataset.
📝 Abstract
Visual SLAM systems combine visual tracking with global loop closure to maintain a consistent map and accurate localization. Loop closure is computationally expensive because it requires searching the whole map for matches. This paper presents FastLoop, a GPU-accelerated loop closing module that reduces this computational cost. We identify key performance bottlenecks in the loop closing pipeline of visual SLAM and address them through parallel optimizations on the GPU. Specifically, we use task-level and data-level parallelism and integrate a GPU-accelerated pose graph optimization. Our implementation is built on top of ORB-SLAM3 and leverages CUDA for GPU programming. Experimental results show that, on desktop and embedded platforms, respectively, FastLoop achieves average speedups of 1.4x and 1.3x on the EuRoC dataset and 3.0x and 2.4x on the TUM-VI dataset for the loop closing module, while maintaining the accuracy of the original system.
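To make the data-level parallelism mentioned in the abstract more concrete, the following is a minimal sequential C++ sketch of one workload that commonly appears in loop closing: brute-force matching of 256-bit binary descriptors (the size ORB uses) by Hamming distance. It is not FastLoop's code and does not claim to reflect which stages the authors actually offload; the names `Descriptor`, `hammingDistance`, and `matchBruteForce` are hypothetical. The point is only that each query descriptor is processed independently, which is the property a GPU implementation could exploit by assigning one query to one thread.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical type: a 256-bit binary descriptor stored as four 64-bit words.
using Descriptor = std::array<std::uint64_t, 4>;

// Hamming distance: the number of bits that differ between two descriptors.
inline int hammingDistance(const Descriptor& a, const Descriptor& b) {
    int dist = 0;
    for (std::size_t w = 0; w < a.size(); ++w) {
        dist += static_cast<int>(std::bitset<64>(a[w] ^ b[w]).count());
    }
    return dist;
}

// For every query descriptor, return the index of the closest candidate,
// or -1 if there are no candidates.
// Each outer iteration only reads shared inputs and writes best[q], so the
// iterations are independent of one another. This loop runs sequentially
// here; the independence is what would allow one GPU thread per query.
inline std::vector<int> matchBruteForce(const std::vector<Descriptor>& queries,
                                        const std::vector<Descriptor>& candidates) {
    std::vector<int> best(queries.size(), -1);
    for (std::size_t q = 0; q < queries.size(); ++q) {
        int bestDist = std::numeric_limits<int>::max();
        for (std::size_t c = 0; c < candidates.size(); ++c) {
            const int d = hammingDistance(queries[q], candidates[c]);
            if (d < bestDist) {
                bestDist = d;
                best[q] = static_cast<int>(c);
            }
        }
    }
    return best;
}
```

In a CUDA version of this sketch, the outer loop would typically become the kernel's thread index and the bit count would use a hardware population-count intrinsic. Task-level parallelism, by contrast, would run independent pipeline stages concurrently rather than splitting one stage across data items.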