🤖 AI Summary
To address the degraded robustness and accuracy of SLAM in underwater environments caused by severe visual degradation, this paper proposes the first tightly coupled stereo camera–IMU–imaging sonar trimodal underwater SLAM system. The method integrates sonar image feature matching, IMU preintegration, and nonlinear optimization via iSAM2/GTSAM. Key contributions include: (1) a visual-degradation-aware adaptive dimensionality reduction mechanism that dynamically switches to sonar–inertial 3-DoF localization; (2) the use of sonar-derived pose estimates as strong priors to enhance IMU propagation stability; and (3) sonar-driven cold-start initialization and online sonar–visual extrinsic calibration. Evaluated on synthetic, pool, and near-shore real-world datasets, the system significantly outperforms state-of-the-art visual-inertial SLAM methods, achieving centimeter-level pose accuracy and high mapping success rates even in low-texture and turbid underwater conditions.
📝 Abstract
Visual degradation in underwater environments poses unique and significant challenges that distinguish underwater SLAM from popular vision-based SLAM on the ground. In this paper, we propose RUSSO, a robust underwater SLAM system that fuses a stereo camera, an inertial measurement unit (IMU), and an imaging sonar to achieve robust and accurate 6 degrees of freedom (DoF) pose estimation in challenging underwater environments. During visual degradation, the system is reduced to a sonar-inertial system estimating 3-DoF poses. The sonar pose estimate serves as a strong prior for IMU propagation, making the propagated poses more reliable. Additionally, we propose a SLAM initialization method that leverages the imaging sonar to counteract the lack of visual features during the initialization stage. We extensively validate RUSSO through experiments in simulator, pool, and sea scenarios. The results demonstrate that RUSSO achieves better robustness and localization accuracy than state-of-the-art visual-inertial SLAM systems, especially in visually challenging scenarios. To the best of our knowledge, this is the first work to fuse a stereo camera, an IMU, and an imaging sonar to realize underwater SLAM that is robust against visual degradation.
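The fallback from 6-DoF estimation to sonar-inertial 3-DoF estimation can be illustrated with a minimal sketch. This is not RUSSO's implementation: the class `DegradationSwitch`, the feature-count thresholds, the hysteresis between them, and the choice of (x, y, yaw) as the three retained states are all assumptions made for illustration, since the abstract does not specify how degradation is detected or which states are kept.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    FULL_6DOF = "stereo + IMU + sonar"
    SONAR_INERTIAL_3DOF = "IMU + sonar"


@dataclass
class DegradationSwitch:
    # Hypothetical thresholds, not taken from the paper.
    min_features: int = 30      # below this, vision is treated as degraded
    recover_features: int = 50  # higher bar to return to 6-DoF (hysteresis)
    mode: Mode = Mode.FULL_6DOF

    def update(self, tracked_features: int) -> Mode:
        """Pick the estimation mode from the number of tracked visual features."""
        if self.mode is Mode.FULL_6DOF:
            if tracked_features < self.min_features:
                self.mode = Mode.SONAR_INERTIAL_3DOF
        elif tracked_features >= self.recover_features:
            self.mode = Mode.FULL_6DOF
        return self.mode


def estimated_states(mode: Mode) -> tuple:
    """States estimated in each mode; the 3-DoF subset is an assumption."""
    if mode is Mode.SONAR_INERTIAL_3DOF:
        return ("x", "y", "yaw")
    return ("x", "y", "z", "roll", "pitch", "yaw")
```

With these assumed thresholds, a drop to 10 tracked features switches the sketch into the 3-DoF mode, and it stays there at 40 features until the count reaches 50 again. The gap between the two thresholds is a design choice of this sketch that avoids rapid toggling when the feature count hovers near a single cutoff.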