🤖 AI Summary
To address pose drift, false loop closures between visually similar floors, and low computational efficiency in multi-floor SLAM, this paper proposes a semantic-enhanced SLAM framework built on a four-layer situational graph (keyframes, walls, rooms, floors). A floor detection module identifies stairways and assigns floor-level semantic relations to the lower layers, which enables a floor-based loop closure strategy that rejects false-positive matches across floors. A hierarchical graph optimization strategy, consisting of local, floor-global, and room-local optimization, is designed to improve accuracy while bounding computational cost. Experiments in real multi-floor environments report state-of-the-art results, including in settings where several baseline methods fail to run efficiently.
📝 Abstract
Existing localization and mapping approaches do not exploit the inherent semantic-relational information in the environment to manage and optimize robot poses and map elements faster and more efficiently, which often leads to pose and map inaccuracies and computational inefficiencies in large-scale environments. 3D scene graph representations, which organize the environment hierarchically, can be exploited to improve the management and optimization of the underlying robot poses and map. In this direction, we present Situational Graphs 2.0, which leverages the hierarchical structure of indoor scenes for efficient data management and optimization. Our algorithm begins by constructing a situational graph that organizes the environment into four layers: Keyframes, Walls, Rooms, and Floors. Our first novelty lies in the front-end, which includes a floor detection module capable of identifying stairways and assigning floor-level semantic relations to the underlying layers. This floor-level semantic information enables a floor-based loop closure strategy that rejects false-positive loop closures in visually similar areas on different floors. Our second novelty is in exploiting the hierarchy for improved optimization. It consists of: (1) local optimization, which optimizes a window of recent keyframes and their connected components, (2) floor-global optimization, which focuses only on keyframes and their connections within the current floor during loop closures, and (3) room-local optimization, which marginalizes redundant keyframes that share observations within a room. We validate our algorithm extensively in different real multi-floor environments. Our approach demonstrates state-of-the-art results in large-scale multi-floor environments, creating hierarchical maps while bounding the computational complexity in settings where several baseline works fail to execute efficiently.
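
The floor-based loop closure and floor-global optimization ideas can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Keyframe` class, its fields, and the functions `filter_loop_candidates` and `floor_global_subset` are hypothetical names introduced here, and the sketch assumes each keyframe already carries a floor label assigned by some floor detection step. It shows only the gating logic: candidates on another floor are discarded before any geometric verification, and a floor-global pass restricts the problem to keyframes on the current floor.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Keyframe:
    """Hypothetical keyframe record; floor_id is assumed to come from floor detection."""
    kf_id: int
    floor_id: int
    position: Tuple[float, float, float]


def filter_loop_candidates(query: Keyframe, candidates: List[Keyframe]) -> List[Keyframe]:
    """Keep only loop closure candidates on the same floor as the query keyframe."""
    return [c for c in candidates if c.floor_id == query.floor_id]


def floor_global_subset(keyframes: List[Keyframe], current_floor: int) -> List[Keyframe]:
    """Select the keyframes that a floor-global optimization pass would consider."""
    return [kf for kf in keyframes if kf.floor_id == current_floor]


# Two floors with near-identical layouts: keyframes 0 and 2 look alike
# in the horizontal plane but sit on different floors.
keyframes = [
    Keyframe(0, 0, (0.0, 0.0, 0.0)),
    Keyframe(1, 0, (5.0, 0.0, 0.0)),
    Keyframe(2, 1, (0.1, 0.0, 3.0)),
    Keyframe(3, 1, (5.0, 0.1, 3.0)),
]
query = Keyframe(4, 1, (0.0, 0.0, 3.0))

accepted_ids = [kf.kf_id for kf in filter_loop_candidates(query, keyframes)]
floor_ids = [kf.kf_id for kf in floor_global_subset(keyframes, query.floor_id)]
print(accepted_ids, floor_ids)
```

In this toy setup the query keyframe on floor 1 is never matched against keyframes on floor 0, however similar they appear, and the floor-global subset grows with the size of one floor rather than the whole building.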