A Multimodal Data Fusion Generative Adversarial Network for Real Time Underwater Sound Speed Field Construction

📅 2025-07-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the challenge of high-precision sound speed profile (SSP) reconstruction in the absence of in-situ underwater SSP measurements. We propose MDF-RAGAN, a deep learning model that fuses heterogeneous remote-sensing data—including sea surface temperature—to reconstruct SSPs end-to-end without reliance on shipborne sonar observations. The model incorporates a residual attention mechanism to capture subtle sound speed perturbations and employs cross-modal attention to enable adaptive integration of multi-source information. Evaluated on a public benchmark dataset, MDF-RAGAN achieves a root-mean-square error (RMSE) below 0.3 m/s—nearly doubling the accuracy of conventional CNN-based and spatial interpolation methods, and reducing error by 65.8% relative to the mean SSP baseline. The approach significantly enhances global spatial modeling capability and generalization across diverse oceanic regions.

📝 Abstract
Sound speed profiles (SSPs) are essential underwater parameters that affect the propagation mode of underwater signals and have a critical impact on the energy efficiency of underwater acoustic communication and the accuracy of underwater acoustic positioning. Traditionally, SSPs can be obtained by matched field processing (MFP), compressive sensing (CS), and deep learning (DL) methods. However, existing methods mainly rely on on-site underwater sonar observation data, which imposes strict requirements on the deployment of sonar observation systems. To achieve high-precision estimation of the sound velocity distribution in a given sea area without on-site underwater measurements, we propose a multi-modal data-fusion generative adversarial network with residual attention blocks (MDF-RAGAN) for SSP construction. To improve the model's ability to capture global spatial feature correlations, we embed attention mechanisms and use residual modules to deeply capture small disturbances in the deep-ocean sound velocity distribution caused by changes in sea surface temperature (SST). Experimental results on a real open dataset show that the proposed model outperforms other state-of-the-art methods, achieving an error of less than 0.3 m/s. Specifically, MDF-RAGAN not only outperforms convolutional neural network (CNN) and spatial interpolation (SITP) methods by nearly a factor of two, but also achieves about a 65.8% root-mean-square error (RMSE) reduction compared to the mean profile, which fully reflects the enhancement of overall profile matching by multi-source fusion and cross-modal attention.
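As a back-of-envelope consistency check, the two headline numbers in the abstract (an error below 0.3 m/s and a ~65.8% RMSE reduction versus the mean profile) together imply a mean-profile baseline RMSE; the baseline value below is derived, not reported in the paper:

```python
model_rmse = 0.3    # m/s, reported upper bound for MDF-RAGAN
reduction = 0.658   # reported RMSE reduction vs. the mean-profile baseline

# If model_rmse = baseline * (1 - reduction), the implied baseline is:
implied_baseline = model_rmse / (1.0 - reduction)
print(round(implied_baseline, 2))  # → 0.88 (m/s)
```

This puts the implied mean-profile RMSE at roughly 0.88 m/s, consistent with the claim that MDF-RAGAN roughly halves the error of CNN and spatial-interpolation baselines.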
Problem

Research questions and friction points this paper is trying to address.

Estimating underwater sound speed without on-site data measurement
Improving accuracy of sound velocity distribution using multi-modal fusion
Capturing global spatial features for SSP construction via attention mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal data fusion GAN for SSP construction
Residual attention blocks enhance feature capture
Cross-modal attention improves spatial correlation accuracy
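The cross-modal attention idea above can be sketched as scaled dot-product attention in which tokens from one modality (e.g. SSP depth bins) attend over tokens from another (e.g. an SST grid). This is a minimal numpy illustration, not the paper's actual architecture: the projection matrices are random stand-ins for learned weights, and all dimensions and token counts are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, context_feats, d_k=16, seed=0):
    """One modality (queries) adaptively weights tokens of another (context).

    query_feats:   (n_q, d_q) features of the target modality
    context_feats: (n_c, d_c) features of the source modality
    Returns the fused (n_q, d_k) features and the (n_q, n_c) attention map.
    """
    rng = np.random.default_rng(seed)
    d_q, d_c = query_feats.shape[-1], context_feats.shape[-1]
    # Random projections stand in for learned weight matrices.
    Wq = rng.standard_normal((d_q, d_k)) / np.sqrt(d_q)
    Wk = rng.standard_normal((d_c, d_k)) / np.sqrt(d_c)
    Wv = rng.standard_normal((d_c, d_k)) / np.sqrt(d_c)
    Q, K, V = query_feats @ Wq, context_feats @ Wk, context_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)  # rows sum to 1
    return attn @ V, attn

# e.g. 8 SSP depth tokens attending over 32 SST grid tokens (shapes illustrative)
fused, attn = cross_modal_attention(np.ones((8, 4)), np.ones((32, 6)))
```

Each row of the attention map is a normalized weighting over the other modality's tokens, which is what lets the fusion adapt per location rather than using fixed interpolation weights.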
Wei Huang
Yuqiang Huang
Yanan Wu
Tianhe Xu
Junting Wang (University of Illinois, Urbana-Champaign)
Hao Zhang