🤖 AI Summary
Feature extraction and matching are unreliable in turbid underwater environments because of light attenuation, scattering, and marine snow noise. To address this, the paper combines adaptive GAN-based synthesis with cross-modal knowledge distillation. The synthesis step jointly models water optical properties, forward scattering, and the spatial and intensity distribution of marine snow to generate high-fidelity, environment-specific underwater images. A general-purpose knowledge distillation framework then transfers features learned by models pre-trained on in-air images into the underwater domain. Evaluated on real underwater sequences, the method achieves a 32.7% improvement in feature matching success rate and reduces VSLAM trajectory error by 28.5%, improving visual localization and mapping under challenging underwater conditions.
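To make the degradation factors named above concrete, here is a minimal sketch of the widely used simplified underwater image formation model, in which a clean intensity is attenuated with range and mixed with veiling light, and bright specks stand in for marine snow. It is not the paper's GAN-based synthesis: the function name `synthesize_underwater`, all parameters, and the uniform random speck model are assumptions made for illustration, and forward scattering (which would add a range-dependent blur) is left out.

```python
import math
import random


def synthesize_underwater(clean, depth, beta, veiling_light,
                          snow_prob=0.0, snow_intensity=1.0, seed=0):
    """Degrade a grayscale image with a simplified underwater model.

    clean: 2D list of intensities in [0, 1] (hypothetical in-air image).
    depth: 2D list of camera-to-scene ranges in metres, same shape.
    beta: attenuation coefficient (assumed constant, single channel).
    veiling_light: background light intensity seen at infinite range.
    snow_prob / snow_intensity: crude stand-in for marine snow specks.
    """
    rng = random.Random(seed)
    out = []
    for clean_row, depth_row in zip(clean, depth):
        out_row = []
        for j, d in zip(clean_row, depth_row):
            t = math.exp(-beta * d)  # transmission along the line of sight
            # Direct (attenuated) signal plus backscattered veiling light.
            pixel = j * t + veiling_light * (1.0 - t)
            # Assumed noise model: independent bright specks per pixel.
            if rng.random() < snow_prob:
                pixel = max(pixel, snow_intensity)
            out_row.append(min(1.0, max(0.0, pixel)))
        out.append(out_row)
    return out
```

At zero range the pixel is unchanged, and at long range it converges to the veiling light, which is the loss of contrast that makes feature matching difficult in turbid water.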
📝 Abstract
Autonomous Underwater Vehicles (AUVs) play a crucial role in underwater exploration. Vision-based methods offer cost-effective solutions for localization and mapping in the absence of conventional sensors like GPS and LIDAR. However, underwater environments present significant challenges for feature extraction and matching due to image blurring and noise caused by attenuation, scattering, and the interference of marine snow. In this paper, we aim to improve the robustness of feature extraction and matching in turbid underwater environments using a cross-modal knowledge distillation method that transfers in-air feature extraction models to underwater settings, with synthetic underwater images as the medium. We first propose a novel adaptive GAN-synthesis method that estimates water parameters and the underwater noise distribution in order to generate environment-specific synthetic underwater images. We then introduce a general knowledge distillation framework compatible with different teacher models. The evaluation of the GAN-based synthesis highlights the significance of the new components in the proposed model, i.e., GAN-synthesized noise and forward scattering. Additionally, the downstream application of feature extraction and matching (VSLAM) on real underwater sequences validates the effectiveness of the transferred model.
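As an illustration of the distillation idea, the sketch below computes one plausible training signal: the mean cosine distance between descriptors produced by an in-air teacher and descriptors produced by an underwater student. The abstract does not specify the loss or the pairing, so everything here is an assumption: the function names are hypothetical, the teacher is assumed to see the clean image while the student sees its synthetic underwater counterpart, and descriptors are assumed to be matched one-to-one by keypoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def distillation_loss(teacher_descriptors, student_descriptors):
    """Mean (1 - cosine similarity) over paired descriptors.

    teacher_descriptors: descriptors from the in-air model on clean images.
    student_descriptors: descriptors from the student on the synthetic
        underwater versions of the same images (assumed pairing).
    """
    if len(teacher_descriptors) != len(student_descriptors):
        raise ValueError("teacher and student must have the same number of descriptors")
    if not teacher_descriptors:
        return 0.0
    total = sum(
        1.0 - cosine_similarity(t, s)
        for t, s in zip(teacher_descriptors, student_descriptors)
    )
    return total / len(teacher_descriptors)
```

Because the loss depends only on descriptor outputs, a sketch like this is agnostic to the teacher architecture, which is one way a framework could stay compatible with different teacher models.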