🤖 AI Summary
In perceptually aliased environments such as narrow pipelines, conventional loop closure detection degrades because of vector quantization distortion, sparse features, and repetitive textures, while existing solutions incur prohibitive computational overhead. To address these challenges, this paper proposes Bag-of-Word-Groups (BoWG), a novel loop closure detection framework. Its core contributions are: (1) an online visual word-group dictionary that explicitly models spatial co-occurrence patterns; (2) adaptive temporal consistency constraints and feature distribution analysis that enhance discriminability; and (3) a probabilistic transition model coupled with a lightweight post-verification mechanism for efficient similarity evaluation. Evaluated on the Bicocca25b dataset and a newly collected pipeline dataset, BoWG outperforms classical Bag-of-Words and state-of-the-art learning-based methods, achieving a 12.6% improvement in recall while maintaining an average per-frame processing time of only 16 ms, which demonstrates its accuracy, robustness, and real-time capability.
📝 Abstract
Loop closure is critical in Simultaneous Localization and Mapping (SLAM) systems to reduce accumulated drift and ensure global mapping consistency. However, conventional methods struggle in perceptually aliased environments, such as narrow pipes, due to vector quantization, feature sparsity, and repetitive textures, while existing solutions often incur high computational costs. This paper presents Bag-of-Word-Groups (BoWG), a novel loop closure detection method that achieves superior precision-recall performance, robustness, and computational efficiency. The core innovation is the introduction of word groups, which capture the spatial co-occurrence and proximity of visual words and are used to construct an online dictionary. Additionally, drawing inspiration from probabilistic transition models, we incorporate temporal consistency directly into the similarity computation with an adaptive scheme, substantially improving precision-recall performance. The method is further strengthened by a feature distribution analysis module and dedicated post-verification mechanisms. To evaluate the effectiveness of our method, we conduct experiments on both public datasets and a confined-pipe dataset that we constructed. Results demonstrate that BoWG surpasses state-of-the-art methods, including both traditional and learning-based approaches, in terms of precision-recall and computational efficiency. Our approach also exhibits excellent scalability, achieving an average processing time of 16 ms per image across the 17,565 images of the Bicocca25b dataset.
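To make the word-group idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each keypoint has already been quantized to a visual word ID, that a group is simply an unordered pair of distinct words whose keypoints lie within a fixed pixel radius, and that two images are compared by the Jaccard overlap of their group sets. The function names `word_groups` and `group_similarity`, the pairwise grouping rule, the `radius` parameter, and the Jaccard score are all assumptions chosen for illustration; the abstract does not specify how groups are formed or scored.

```python
from itertools import combinations
from math import hypot


def word_groups(keypoints, radius):
    """Collect pairs of distinct visual words that occur close together.

    keypoints: iterable of (word_id, x, y) tuples (hypothetical input format).
    radius: maximum pixel distance for two words to count as co-occurring.
    Returns a set of frozensets, each holding two word IDs.
    """
    groups = set()
    for (w1, x1, y1), (w2, x2, y2) in combinations(keypoints, 2):
        # Only spatially close keypoints with different words form a group.
        if w1 != w2 and hypot(x1 - x2, y1 - y2) <= radius:
            groups.add(frozenset((w1, w2)))
    return groups


def group_similarity(groups_a, groups_b):
    """Jaccard overlap of two group sets (an assumed, simplified score)."""
    union = groups_a | groups_b
    if not union:
        return 0.0
    return len(groups_a & groups_b) / len(union)


# Two toy frames: the same three words appear in both, but their
# spatial layout differs, so the group sets only partly overlap.
frame_a = [(1, 0, 0), (2, 3, 4), (3, 100, 100)]
frame_b = [(1, 10, 10), (2, 12, 10), (3, 13, 10)]

groups_a = word_groups(frame_a, radius=5)
groups_b = word_groups(frame_b, radius=5)
score = group_similarity(groups_a, groups_b)
```

A plain bag-of-words histogram would treat the two toy frames as identical, since both contain words 1, 2, and 3 once each. The group sets differ because the spatial arrangement differs, which is the kind of extra discriminability that word groups are meant to provide in repetitive scenes.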