The Computation of Generalized Embeddings for Underwater Acoustic Target Recognition using Contrastive Learning

📅 2025-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the scarcity of high-quality labeled data and poor cross-domain generalization in underwater acoustic target recognition, this paper proposes an unsupervised contrastive representation learning method. A Conformer-based encoder is pretrained in a self-supervised manner with the Variance-Invariance-Covariance Regularization (VICReg) loss on large-scale, publicly available, lower-quality unlabeled underwater acoustic data, with the aim of producing robust, discriminative, general-purpose acoustic embeddings. Lightweight supervised fine-tuning then adapts the pretrained model to downstream tasks. Evaluated on two classification tasks, ship type recognition and marine mammal vocalization recognition, the method is reported to produce robust embeddings that generalize across tasks, which supports the effectiveness and transferability of unsupervised pretraining for underwater acoustic analysis.
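
As a rough illustration of the loss named above, the following is a minimal NumPy sketch of a VICReg-style objective computed on two batches of embeddings, one per augmented view of the same recordings. It follows the general published form of VICReg (an invariance term, a variance hinge, and an off-diagonal covariance penalty). The function name `vicreg_loss`, the weights 25/25/1, the target standard deviation `gamma`, and the random stand-in embeddings are assumptions for illustration; the paper's actual hyperparameters, augmentations, and Conformer encoder are not given on this page.

```python
import numpy as np


def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0, gamma=1.0, eps=1e-4):
    """VICReg-style loss for two (n, d) batches of embeddings (illustrative sketch)."""
    n, d = z_a.shape

    # Invariance: mean squared Euclidean distance between paired embeddings.
    invariance = np.mean(np.sum((z_a - z_b) ** 2, axis=1))

    def variance_term(z):
        # Hinge that pushes the per-dimension standard deviation above gamma.
        std = np.sqrt(z.var(axis=0, ddof=1) + eps)
        return np.mean(np.maximum(0.0, gamma - std))

    def covariance_term(z):
        # Penalize squared off-diagonal entries of the covariance matrix.
        zc = z - z.mean(axis=0)
        cov = zc.T @ zc / (n - 1)
        off_diag = cov - np.diag(np.diag(cov))
        return np.sum(off_diag ** 2) / d

    variance = variance_term(z_a) + variance_term(z_b)
    covariance = covariance_term(z_a) + covariance_term(z_b)
    total = sim_w * invariance + var_w * variance + cov_w * covariance
    return total, {"invariance": invariance, "variance": variance, "covariance": covariance}


# Stand-in data: random vectors in place of encoder outputs (assumption).
rng = np.random.default_rng(0)
spread = 2.0 * rng.standard_normal((256, 16))
collapsed = np.zeros((256, 16))

loss_spread, parts_spread = vicreg_loss(spread, spread)
loss_collapsed, parts_collapsed = vicreg_loss(collapsed, collapsed)
```

In this toy comparison both inputs have zero invariance cost because the two views are identical, yet the collapsed batch is penalized by the variance term. That penalty is what lets this kind of objective avoid trivial constant embeddings without needing negative pairs.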

📝 Abstract

The rising level of sound pollution in marine environments poses a growing threat to ocean health, making it crucial to monitor underwater noise. Monitoring this noise allows the sources responsible for the pollution to be mapped. Monitoring is performed by passively listening to these sounds, which generates a large volume of recordings capturing a mix of sound sources such as ship activity and marine mammal vocalizations. Although machine learning offers a promising solution for automatic sound classification, current state-of-the-art methods rely on supervised learning, which requires a large amount of high-quality labeled data that is not publicly available. In contrast, a massive amount of lower-quality unlabeled data is publicly available, offering the opportunity to explore unsupervised learning techniques. This research explores that possibility by implementing an unsupervised Contrastive Learning approach. A Conformer-based encoder is optimized with the Variance-Invariance-Covariance Regularization loss function on the lower-quality unlabeled data, and the learned representation is then transferred to the labeled data. On classification tasks involving the recognition of ship types and marine mammal vocalizations, our method is shown to produce robust and generalized embeddings. This shows the potential of unsupervised methods for various automatic underwater acoustic analysis tasks.
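
One common way to check how well pretrained embeddings transfer to labeled data is a linear probe: the encoder is frozen and only a small classifier is trained on its outputs. The sketch below shows that idea with a softmax classifier in NumPy. It is an assumption that a linear probe resembles the evaluation used here, since the abstract does not specify the downstream classifier. The helper names `train_linear_probe` and `predict`, the learning rate, and the synthetic three-class "embeddings" are all hypothetical stand-ins.

```python
import numpy as np


def train_linear_probe(emb, labels, n_classes, lr=0.1, epochs=200):
    """Fit softmax regression on frozen embeddings with plain gradient descent."""
    n, d = emb.shape
    w = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = emb @ w + b
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        w -= lr * (emb.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b


def predict(emb, w, b):
    """Return the most likely class index for each embedding."""
    return np.argmax(emb @ w + b, axis=1)


# Synthetic stand-in for frozen encoder outputs: three well-separated clusters.
rng = np.random.default_rng(0)
centers = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 3.0]])
labels = rng.integers(0, 3, size=300)
emb = centers[labels] + 0.5 * rng.standard_normal((300, 2))

# Train on the first 200 samples, evaluate on the held-out 100.
w, b = train_linear_probe(emb[:200], labels[:200], n_classes=3)
accuracy = np.mean(predict(emb[200:], w, b) == labels[200:])
```

Because the classifier is linear, held-out accuracy mostly reflects how separable the classes already are in the embedding space, which makes it a reasonable proxy for embedding quality.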

Problem

Research questions and friction points this paper is trying to address.

Classify underwater sounds when high-quality labeled data is scarce

Map pollution sources using unlabeled acoustic data

Improve recognition of ships and marine mammals

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised Contrastive Learning for embeddings

Conformer-based encoder optimization

Variance-Invariance-Covariance Regularization loss
