Biodenoising: animal vocalization denoising without access to clean data

📅 2024-10-04
🏛️ arXiv.org
🤖 AI Summary
Denoising vocalizations in wildlife audio recordings is difficult, in part because clean vocalization samples are scarce. This paper introduces a bioacoustic denoising approach that requires no ground-truth clean vocalizations. Pseudo-clean targets are obtained by pre-denoising vocalizations with speech enhancement models, and these targets are combined with noise-only segments (background noise without a vocalization) to construct training data for denoising models such as Demucs and CleanUNet. The training set is drawn from sources covering diverse species, acoustic environments, and geographic regions. Key contributions include: (1) a non-overlapping benchmark set of clean vocalizations from different taxa, together with noise samples; (2) publicly released data, code, libraries, and demos; and (3) competitive results on the benchmark for models trained only on pseudo-clean targets. By removing the need for clean recordings, the approach lowers the data requirements for wildlife acoustic analysis.

📝 Abstract
Animal vocalization denoising is a task similar to human speech enhancement, which is relatively well-studied. In contrast to the latter, it comprises a higher diversity of sound production mechanisms and recording environments, and this higher diversity is a challenge for existing models. Adding to the challenge, and in contrast to speech, we lack large and diverse datasets comprising clean vocalizations. As a solution, we use pseudo-clean targets, i.e., pre-denoised vocalizations, and segments of background noise without a vocalization as training data. We propose a training set derived from bioacoustics datasets and repositories representing diverse species, acoustic environments, and geographic regions. Additionally, we introduce a non-overlapping benchmark set comprising clean vocalizations from different taxa and noise samples. We show that denoising models (Demucs, CleanUNet) trained on pseudo-clean targets obtained with speech enhancement models achieve competitive results on the benchmarking set. We publish data, code, libraries, and demos at https://mariusmiron.com/research/biodenoising.
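
The training setup described in the abstract pairs a pseudo-clean target with a noise-only segment. A minimal sketch of how one such training pair could be assembled is shown below. It is an illustration only: mixing at a chosen signal-to-noise ratio is an assumption, and the function names `mix_at_snr` and `make_training_pair`, the synthetic chirp standing in for a pre-denoised vocalization, and the 5 dB SNR are hypothetical and not taken from the paper or its released code.

```python
import numpy as np


def mix_at_snr(pseudo_clean, noise, snr_db):
    """Add `noise` to `pseudo_clean`, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: the paper's actual mixing procedure is not described here.
    """
    pseudo_clean = np.asarray(pseudo_clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if pseudo_clean.shape != noise.shape:
        raise ValueError("target and noise segments must have the same shape")

    target_power = np.mean(pseudo_clean ** 2)
    noise_power = np.mean(noise ** 2)
    if target_power == 0.0 or noise_power == 0.0:
        raise ValueError("target and noise must both be non-silent")

    # Choose the gain so that target_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = np.sqrt(target_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return pseudo_clean + gain * noise


def make_training_pair(pseudo_clean, noise, snr_db):
    """Return (noisy_input, target): the model input and the pseudo-clean target."""
    noisy = mix_at_snr(pseudo_clean, noise, snr_db)
    return noisy, np.asarray(pseudo_clean, dtype=np.float64)


# Synthetic stand-ins (assumptions): a windowed chirp plays the role of a
# pre-denoised vocalization, and Gaussian noise plays the role of a noise-only segment.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
vocalization = np.sin(2 * np.pi * (2000 * t + 1500 * t ** 2)) * np.hanning(sample_rate)
noise = rng.normal(0.0, 0.1, sample_rate)

noisy, target = make_training_pair(vocalization, noise, snr_db=5.0)
```

In this sketch the denoising model would receive `noisy` as input and be trained to reproduce `target`. Because the target is itself the output of a speech enhancement model rather than a true clean recording, any residual noise in it becomes part of what the model learns to keep.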
Problem

Research questions and friction points this paper is trying to address.

Bioacoustics
Background Noise Reduction
Signal Quality Enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Biological Denoising
Animal Vocalization Cleanup
Machine Learning Models