🤖 AI Summary
To address voxel map inconsistency in underwater visual SLAM caused by sensor noise and changing scenes, this paper proposes an uncertainty-aware voxel mapping framework. The method modifies Voxblox's weight calculation and update scheme so that per-pixel depth confidence scores from RAFT-Stereo enter the voxel fusion process directly, which allows depth estimation uncertainty to be modeled and visualized spatially in the map. A qualitative evaluation in a confined pool and at a pier in the Trondheim Fjord, using an underwater robot, shows how the visualized uncertainty changes across these settings. The core contribution is a confidence-driven voxel update scheme intended to give underwater visual navigation a more interpretable map representation.
📝 Abstract
Vision-based underwater robots can be useful for inspecting and exploring confined spaces where traditional sensors cannot be used and preplanned paths cannot be followed. Sensor noise and situational change can cause significant uncertainty in the environmental representation. This paper therefore explores how to represent mapping inconsistency in vision-based sensing and how to incorporate depth estimation confidence into the mapping framework. The scene depth and its confidence are estimated using the RAFT-Stereo model and integrated into Voxblox, a voxel-based mapping framework. Improvements to the existing Voxblox weight calculation and update mechanism are also proposed. Finally, a qualitative analysis of the proposed method is performed in a confined pool and at a pier in the Trondheim Fjord. Experiments using an underwater robot demonstrate how the uncertainty changes in the visualization.
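The abstract does not state the modified weight formula, so the following is only a minimal sketch of how a per-pixel depth confidence could enter a weighted running-average TSDF update of the kind used in voxel mapping. The names `Voxel`, `measurement_weight`, and `fuse`, the inverse-square depth term, and the `max_weight` cap are all assumptions made for illustration. They are not the paper's method and not the Voxblox API.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    distance: float = 0.0  # running weighted-average truncated signed distance (m)
    weight: float = 0.0    # accumulated fusion weight


def measurement_weight(depth_m: float, confidence: float) -> float:
    """Hypothetical per-measurement weight.

    Assumption: an inverse-square depth term scaled by a stereo
    confidence score clamped to [0, 1].
    """
    if depth_m <= 0.0:
        return 0.0
    c = min(max(confidence, 0.0), 1.0)
    return c / (depth_m * depth_m)


def fuse(voxel: Voxel, sdf_m: float, w: float, max_weight: float = 1e4) -> Voxel:
    """Weighted running-average update of one voxel with one measurement."""
    if w <= 0.0:
        # A zero-confidence measurement leaves the voxel untouched.
        return voxel
    total = voxel.weight + w
    distance = (voxel.weight * voxel.distance + w * sdf_m) / total
    # Cap the accumulated weight (assumed) so the map can still adapt later.
    return Voxel(distance=distance, weight=min(total, max_weight))


if __name__ == "__main__":
    v = Voxel()
    v = fuse(v, 0.10, measurement_weight(2.0, 1.0))  # confident measurement
    v = fuse(v, 0.30, measurement_weight(2.0, 0.5))  # half-confident measurement
    print(v)
```

In this sketch, a low-confidence pixel pulls the stored distance less than a high-confidence one at the same depth, and the accumulated weight per voxel is a quantity that could be visualized as a rough proxy for map certainty.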