Perch 2.0 transfers 'whale' to underwater tasks

📅 2025-12-02
📈 Citations: 0 (influential: 0)
🤖 AI Summary
This study investigates the few-shot transferability of the large-scale bioacoustic pre-trained model Perch 2.0 to marine mammal and underwater audio classification, even though its training corpus contains almost no marine mammal audio or classes. The evaluation uses linear probing: the model's embeddings are kept frozen and only a linear classifier is trained on top of them, using ≤16 labeled underwater audio samples per class. The experiments assess how well Perch 2.0 generalizes across domain and medium (terrestrial → aquatic). Perch 2.0 embeddings perform consistently well, generally outperforming other pretrained bioacoustic models (a previous multispecies whale model, Perch 1.0, SurfPerch, AVES-bio, BirdAVES, and BirdNET V2.3) on the majority of tasks. The findings support Perch 2.0 as an effective foundation model for few-shot underwater acoustic classification and for passive acoustic monitoring of marine ecosystems when labels are scarce.

📝 Abstract
Perch 2.0 is a supervised bioacoustics foundation model pretrained on 14,597 species, including birds, mammals, amphibians, and insects, and has state-of-the-art performance on multiple benchmarks. Given that Perch 2.0 includes almost no marine mammal audio or classes in the training data, we evaluate Perch 2.0 performance on marine mammal and underwater audio tasks through few-shot transfer learning. We perform linear probing with the embeddings generated from this foundation model and compare performance to other pretrained bioacoustics models. In particular, we compare Perch 2.0 with previous multispecies whale, Perch 1.0, SurfPerch, AVES-bio, BirdAVES, and BirdNET V2.3 models, which have open-source tools for transfer learning and agile modeling. We show that the embeddings from the Perch 2.0 model have consistently high performance for few-shot transfer learning, generally outperforming alternative embedding models on the majority of tasks. Perch 2.0 is therefore recommended when developing new linear classifiers for marine mammal classification with few labeled examples.
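To make the linear probing setup concrete, here is a minimal sketch that fits a softmax classifier on fixed embeddings. It is an illustration only, not the authors' pipeline: the embeddings are synthetic stand-ins generated with NumPy (no Perch 2.0 API is called), and the names `train_linear_probe`, `fake_embeddings`, and all hyperparameters are assumptions chosen for the example.

```python
import numpy as np


def train_linear_probe(embeddings, labels, num_classes, lr=0.1, steps=300, l2=1e-3):
    """Fit softmax regression on fixed embeddings (the embedding model is untouched)."""
    n, dim = embeddings.shape
    weights = np.zeros((dim, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = embeddings @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad_logits = (probs - one_hot) / n  # gradient of mean cross-entropy
        weights -= lr * (embeddings.T @ grad_logits + l2 * weights)
        bias -= lr * grad_logits.sum(axis=0)
    return weights, bias


def predict(embeddings, weights, bias):
    """Return the highest-scoring class index for each embedding."""
    return np.argmax(embeddings @ weights + bias, axis=1)


# Synthetic stand-in for embeddings from a frozen pretrained model (assumption).
rng = np.random.default_rng(0)
num_classes, dim, shots = 3, 32, 8
class_centers = rng.normal(size=(num_classes, dim)) * 3.0


def fake_embeddings(per_class):
    """Draw `per_class` noisy points around each hypothetical class center."""
    labels = np.repeat(np.arange(num_classes), per_class)
    features = class_centers[labels] + rng.normal(size=(labels.size, dim))
    return features, labels


train_x, train_y = fake_embeddings(shots)  # few labeled examples per class
test_x, test_y = fake_embeddings(50)       # held-out evaluation set
weights, bias = train_linear_probe(train_x, train_y, num_classes)
accuracy = float((predict(test_x, weights, bias) == test_y).mean())
print(f"probe accuracy on synthetic data: {accuracy:.2f}")
```

In a real comparison, `train_x` and `test_x` would be replaced by embeddings computed once per pretrained model, so that only the small linear layer differs between runs.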
Problem

Research questions and friction points this paper is trying to address.

Evaluates Perch 2.0's performance on marine mammal audio tasks
Compares Perch 2.0 with other bioacoustics models via transfer learning
Demonstrates Perch 2.0's effectiveness for few-shot marine mammal classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pretrained bioacoustics foundation model with 14,597 species
Few-shot transfer learning for underwater audio tasks
Linear probing on Perch 2.0 embeddings generally outperforms alternative embedding models on the majority of tasks
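The few-shot setting in the points above amounts to training on only a handful of labeled clips per class and evaluating on the rest. A minimal sketch of that split, using only the Python standard library, might look like this. The function name `sample_few_shot`, the class labels, and the value of `shots_per_class` are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict


def sample_few_shot(labels, shots_per_class, seed=0):
    """Return sorted indices holding at most `shots_per_class` examples per label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        k = min(shots_per_class, len(indices))  # rare classes keep what they have
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Hypothetical labels for 20 audio clips (illustrative only).
labels = ["humpback"] * 10 + ["orca"] * 3 + ["noise"] * 7
train_idx = sample_few_shot(labels, shots_per_class=4)
train_set = set(train_idx)
held_out_idx = [i for i in range(len(labels)) if i not in train_set]
print(len(train_idx), "training clips,", len(held_out_idx), "held-out clips")
```

Fixing the seed keeps the split reproducible, which matters when the same few-shot subsets are reused to compare several embedding models.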
Andrea Burns
Google DeepMind
Lauren Harrell
Google Research
B. V. Merrienboer
Google DeepMind
Vincent Dumoulin
Research Scientist
Artificial Intelligence, Machine Learning
Jenny Hamer
Google DeepMind
Tom Denton
Google DeepMind