🤖 AI Summary
To address the lack of pre-trained models for sound classification and transfer learning beyond birds in bioacoustics, this paper introduces Perch 2.0, a large-scale pre-training framework spanning multiple taxonomic groups. Methodologically, it incorporates fine-grained species classification into a self-distillation pipeline, jointly optimizing a prototype-learning classifier and a new source-prediction objective to strengthen the supervised signal and learn transferable representations. The model achieves state-of-the-art performance on the BirdSet and BEANS benchmarks and outperforms specialized marine models on marine transfer-learning tasks despite having almost no marine training data, demonstrating strong generalization and cross-domain adaptability. Key contributions include: (i) extending supervised bioacoustic pre-training from birds to a large multi-taxa dataset; (ii) combining a prototype-learning classifier with a source-prediction training criterion under self-distillation; and (iii) demonstrating efficient bioacoustic representation transfer with almost no target-domain training data.
📝 Abstract
Perch is a performant pre-trained model for bioacoustics. It was trained in a supervised fashion, providing both off-the-shelf classification scores for thousands of vocalizing species and strong embeddings for transfer learning. In this new release, Perch 2.0, we expand from training exclusively on avian species to a large multi-taxa dataset. The model is trained with self-distillation using a prototype-learning classifier as well as a new source-prediction training criterion. Perch 2.0 obtains state-of-the-art performance on the BirdSet and BEANS benchmarks. It also outperforms specialized marine models on marine transfer learning tasks, despite having almost no marine training data. We present hypotheses as to why fine-grained species classification is a particularly robust pre-training task for bioacoustics.
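The abstract describes jointly training a prototype-learning species classifier alongside a source-prediction criterion. The following is a minimal sketch of what such a joint objective could look like: class prototypes scored by cosine similarity against L2-normalized embeddings, plus a cross-entropy head predicting the recording source. All function names, the temperature, and the loss weighting are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def prototype_logits(emb, prototypes, temperature=0.1):
    # cosine similarity between L2-normalized embeddings and class prototypes,
    # scaled by an assumed temperature
    e = emb / np.linalg.norm(emb, axis=-1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=-1, keepdims=True)
    return (e @ p.T) / temperature

def joint_loss(emb, species_protos, source_weights,
               species_labels, source_labels, alpha=1.0):
    """Hypothetical combined objective: prototype-based species
    cross-entropy plus a linear source-prediction cross-entropy."""
    n = len(species_labels)
    # species term: cross-entropy over prototype similarities
    p_species = softmax(prototype_logits(emb, species_protos))
    l_species = -np.log(p_species[np.arange(n), species_labels]).mean()
    # source term: cross-entropy over a linear source-prediction head
    p_source = softmax(emb @ source_weights)
    l_source = -np.log(p_source[np.arange(n), source_labels]).mean()
    # alpha is an assumed weighting between the two objectives
    return l_species + alpha * l_source
```

Intuitively, the species term pulls embeddings of the same species toward a shared prototype, while the source term forces the representation to retain recording-level context; the paper's self-distillation would sit on top of such an objective.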