🤖 AI Summary
This work addresses the poor robustness of existing monocular underwater SLAM methods in complex environments and their inability to generate high-fidelity, photorealistic dense maps. To overcome these limitations, we propose WaterSplat-SLAM, which introduces, for the first time, a semantic medium-aware mechanism into monocular underwater SLAM. Our approach leverages semantic medium-aware filtering within a two-view 3D reconstruction framework to achieve robust camera tracking and depth estimation. Furthermore, by integrating semantic-guided rendering with an online medium-aware Gaussian map representation, the system produces compact yet visually realistic dense reconstructions. Extensive evaluation on multiple underwater datasets demonstrates that WaterSplat-SLAM significantly improves both tracking robustness and the photometric realism of the generated maps.
📝 Abstract
Underwater monocular SLAM is a challenging problem with applications ranging from autonomous underwater vehicles to marine archaeology. However, existing underwater SLAM methods struggle to produce maps with high-fidelity rendering. In this paper, we propose WaterSplat-SLAM, a novel monocular underwater SLAM system that achieves robust pose estimation and photorealistic dense mapping. Specifically, we couple semantic medium filtering with a two-view 3D reconstruction prior to enable underwater-adapted camera tracking and depth estimation. Furthermore, we present a semantic-guided rendering and adaptive map management strategy built on an online medium-aware Gaussian map, modeling the underwater environment in a photorealistic and compact manner. Experiments on multiple underwater datasets demonstrate that WaterSplat-SLAM achieves robust camera tracking and high-fidelity rendering in underwater environments.
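The abstract does not specify how the "medium-aware" rendering is formulated, but a common way to model water in image formation is to attenuate the scene radiance with depth and add a depth-dependent backscatter (veiling-light) term. The sketch below is a minimal NumPy illustration of that standard model, not the paper's actual method; the function name, parameters, and per-channel coefficients are all hypothetical.

```python
import numpy as np

def apply_water_medium(radiance, depth, beta_d, beta_b, veiling_light):
    """Illustrative underwater image-formation model (hypothetical, not the
    paper's method): I = J * exp(-beta_d * z) + B_inf * (1 - exp(-beta_b * z)).

    radiance:      (H, W, 3) clean scene colors J
    depth:         (H, W) per-pixel distance z along the viewing ray
    beta_d:        (3,) per-channel direct attenuation coefficients
    beta_b:        (3,) per-channel backscatter coefficients
    veiling_light: (3,) water color at infinity B_inf
    """
    z = depth[..., None]                        # (H, W, 1) for broadcasting
    direct = radiance * np.exp(-beta_d * z)     # signal attenuated by water
    scatter = veiling_light * (1.0 - np.exp(-beta_b * z))
    return direct + scatter
```

At zero depth the model returns the clean radiance, and at large depth the pixel converges to the veiling-light color, which is the qualitative behavior a medium-aware map representation would need to reproduce.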