🤖 AI Summary
Problem: Spectrally derived bathymetry (SDB) from optical remote sensing in shallow water relies heavily on in-situ depth measurements, while refraction-corrected structure-from-motion multi-view stereo (SfM-MVS) still suffers from depth voids and noise, especially in texture-homogeneous regions. Method: We propose Swin-BathyUNet, an end-to-end deep architecture that integrates the Swin Transformer's long-range contextual modeling with U-Net's multi-scale feature extraction and adds a cross-attention mechanism tailored for SDB. It jointly leverages optical imagery and refraction-corrected digital surface models (DSMs) that contain voids, using those incomplete DSMs as training data rather than requiring separately surveyed reference depths. Contribution/Results: The framework supports both DSM void filling and standalone optical SDB. Evaluated at two heterogeneous sites in the Mediterranean and Baltic Seas, it achieves a 23.6% improvement in depth prediction accuracy, 98.4% spatial coverage, and 41.2% noise reduction, with better fine-detail preservation and void reconstruction.
📝 Abstract
Accurate, detailed, and frequently updated bathymetry is crucial for shallow seabed areas facing intense climatological and anthropogenic pressures. Current methods that derive bathymetry from airborne or satellite optical imagery rely primarily on either SfM-MVS with refraction correction or Spectrally Derived Bathymetry (SDB). However, SDB methods often require extensive manual fieldwork or costly reference data, while SfM-MVS approaches face challenges even after refraction correction: depth data gaps and noise in environments with homogeneous visual textures, which hinder the creation of accurate and complete Digital Surface Models (DSMs) of the seabed. To address these challenges, this work introduces a methodology that combines the high-fidelity 3D reconstruction capabilities of SfM-MVS methods and state-of-the-art refraction correction techniques with the spectral analysis capabilities of a new deep learning-based method for bathymetry prediction. In this synergistic approach, SfM-MVS-derived DSMs with data gaps are used as training data to generate complete bathymetric maps. In this context, we propose Swin-BathyUNet, which combines U-Net with Swin Transformer self-attention layers and a cross-attention mechanism specifically tailored for SDB. Swin-BathyUNet is designed to improve bathymetric accuracy by capturing long-range spatial relationships, and it can also function as a standalone solution for standard SDB with various training depth data, independent of the SfM-MVS output. Extensive experiments at two completely different test sites in the Mediterranean and Baltic Seas demonstrate improvements in bathymetric accuracy, detail, coverage, and noise reduction in the predicted DSM. The code is available at https://github.com/pagraf/Swin-BathyUNet.
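Training on DSMs that contain data gaps implies that the error signal can only come from pixels where the reference DSM holds a valid depth. The abstract does not state which loss the authors use, so the following is only a minimal sketch of one common way to do this, a masked RMSE, under stated assumptions: the function name `masked_rmse`, the use of NaN as the void marker, and the `nodata` parameter are all hypothetical and are not taken from the Swin-BathyUNet repository.

```python
import numpy as np


def masked_rmse(pred, target, nodata=np.nan):
    """RMSE computed only where the reference DSM has a valid depth.

    Assumption: voids in the reference DSM are marked with `nodata`
    (NaN by default). This is an illustrative convention, not the
    authors' documented data format.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)

    # Build a boolean mask of pixels that carry a usable reference depth.
    if np.isnan(nodata):
        valid = ~np.isnan(target)
    else:
        valid = target != nodata

    # With no valid reference pixels there is nothing to penalise.
    if not valid.any():
        return 0.0

    diff = pred[valid] - target[valid]
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy 2x2 tile: one void pixel (NaN) in the reference DSM.
reference = np.array([[-1.0, np.nan], [-2.0, -3.0]])
prediction = np.array([[-1.0, -5.0], [-2.0, -5.0]])
error = masked_rmse(prediction, reference)
```

In this toy tile the prediction at the void pixel does not affect the error, so the model stays free to fill the gap from image content while being penalised only where the SfM-MVS depth exists.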