BioDCASE 2026 Challenge Baseline for Cross-Domain Mosquito Species Classification

Date: 2026-03-20
AI Summary
This study addresses audio-based mosquito species classification in real-world conditions, where low signal-to-noise ratios, background interference, class imbalance, and domain shift across recording devices and settings hinder model generalization. To advance research in this area, the authors establish the first official benchmark for cross-domain mosquito species classification, which reports performance on seen and unseen domains separately. They propose a reproducible baseline that uses log-mel spectrogram features and a multitemporal resolution convolutional neural network (MTRCNN), trained with domain-adversarial auxiliary learning to jointly predict species and domain labels. The baseline performs strongly on seen domains but degrades markedly on unseen domains, which identifies cross-domain generalization as a critical barrier to real-world deployment.
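Jointly predicting species and domain labels can be illustrated with a two-term objective. The sketch below is only one plausible reading, not the baseline's actual loss: it assumes a plain weighted sum of two cross-entropy terms, and the function names and the `domain_weight` value are hypothetical. It does not implement an adversarial (gradient-reversal) setup.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy for one example, given raw logits and an integer class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[target]

def joint_loss(species_logits, species_label,
               domain_logits, domain_label, domain_weight=0.5):
    """Species loss plus a weighted auxiliary domain loss.

    domain_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    species_term = softmax_cross_entropy(species_logits, species_label)
    domain_term = softmax_cross_entropy(domain_logits, domain_label)
    return species_term + domain_weight * domain_term

# Uninformative (all-zero) logits give log(K) per head for K classes.
loss = joint_loss([0.0, 0.0, 0.0], 1, [0.0, 0.0], 0)
```

With three species and two domains, the example evaluates to log 3 + 0.5 · log 2, the loss of a model that has learned nothing yet.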

Abstract
Mosquito-borne diseases affect more than one billion people each year and cause close to one million deaths. Traditional surveillance methods rely on traps and manual identification that are slow, labor-intensive, and difficult to scale. Audio-based mosquito monitoring offers a non-destructive, lower-cost, and more scalable complement to trap-based surveillance, but reliable species classification remains difficult under real-world recording conditions. Mosquito flight tones are narrow-band, often low in signal-to-noise ratio, and easily masked by background noise, and recordings for several epidemiologically relevant species remain limited, creating pronounced class imbalance. Variation across devices, environments, and collection protocols further increases the difficulty of robust classification. Such variation can cause models to rely on domain-specific recording artefacts rather than species-relevant acoustic cues, which makes transfer to new acquisition settings difficult. The BioDCASE 2026 Cross-Domain Mosquito Species Classification (CD-MSC) challenge is designed around this deployment problem by evaluating performance on both seen and unseen domains. This paper presents the official baseline system and evaluation pipeline as a simple, fully reproducible reference for the CD-MSC challenge task. The baseline uses log-mel features and a multitemporal resolution convolutional neural network (MTRCNN) with species and auxiliary domain outputs, together with complete training and test scripts. The baseline system performs strongly on seen domains but degrades markedly on unseen domains, showing that cross-domain generalisation, rather than within-domain recognition, is the central challenge for practical mosquito species classification from multi-source bioacoustic recordings.
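Reporting seen and unseen domains separately can be sketched in a few lines. This is a minimal illustration, not the official evaluation pipeline: the abstract does not name the challenge metric, so balanced accuracy (mean per-species recall) is assumed here because of the class imbalance, and the domain and species names are placeholders.

```python
from collections import defaultdict

def balanced_accuracy(pairs):
    """Mean per-species recall over (true, predicted) label pairs."""
    hits, counts = defaultdict(int), defaultdict(int)
    for y_true, y_pred in pairs:
        counts[y_true] += 1
        hits[y_true] += int(y_true == y_pred)
    return sum(hits[c] / counts[c] for c in counts) / len(counts)

def score_seen_vs_unseen(records, seen_domains):
    """Split (domain, true, predicted) records by domain group and score each."""
    groups = {"seen": [], "unseen": []}
    for domain, y_true, y_pred in records:
        key = "seen" if domain in seen_domains else "unseen"
        groups[key].append((y_true, y_pred))
    return {k: balanced_accuracy(v) for k, v in groups.items() if v}

# Placeholder records: (recording domain, true species, predicted species).
records = [
    ("dev_a", "sp1", "sp1"), ("dev_a", "sp1", "sp1"), ("dev_a", "sp2", "sp2"),
    ("dev_b", "sp2", "sp1"),
    ("dev_c", "sp1", "sp1"), ("dev_c", "sp2", "sp1"),
]
scores = score_seen_vs_unseen(records, seen_domains={"dev_a", "dev_b"})
print(scores)  # {'seen': 0.75, 'unseen': 0.5}
```

Keeping the two scores apart makes the gap between them visible, which is the quantity the abstract identifies as the central difficulty.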
Problem

Research questions and friction points this paper is trying to address.

cross-domain generalization
mosquito species classification
bioacoustic monitoring
class imbalance
domain shift
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-domain generalization
multitemporal resolution CNN
log-mel features
domain adaptation
mosquito species classification
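The log-mel features listed above can be sketched with NumPy alone. This is an illustrative front end, not the baseline's code: the sample rate, FFT size, hop length, and number of mel bands are assumed values, and the 600 Hz test tone is a synthetic stand-in for a narrow-band flight tone.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel(signal, sr=8000, n_fft=256, hop=128, n_mels=32, eps=1e-10):
    """Return a (n_frames, n_mels) log-mel spectrogram (all defaults are assumptions)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop : t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # power spectrum per frame
    return np.log(power @ mel_filterbank(sr, n_fft, n_mels).T + eps)

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 600.0 * t)   # one second of a synthetic 600 Hz tone
feats = log_mel(tone, sr=sr)
print(feats.shape)  # (61, 32)
```

A narrow-band tone concentrates its energy in one or two adjacent mel bands, which is why such signals are easily masked once broadband noise raises the floor across all bands.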
Authors

Yuanbo Hou
Machine Learning Research Group, University of Oxford

Vanja Zdravkovic
Machine Learning Research Group, University of Oxford

Marianne Sinka
Oxford Long-Term Ecology Laboratory, University of Oxford

Yunpeng Li
Reader in AI & Digital Oral Health, King's College London
Interests: Machine Learning, Signal Processing, Healthcare

Wenwu Wang
Professor, University of Surrey, UK
Interests: signal processing, machine learning, machine listening, audio/speech/audio-visual, multimodal fusion

Mark D. Plumbley
King's College London

Kathy Willis
Oxford Long-Term Ecology Laboratory, University of Oxford

Stephen Roberts
Professor of Engineering Science (Machine Learning, Information Engineering), University of Oxford
Interests: Machine Learning, Bayesian Inference, Complex Systems, Finance, Astrostatistics