🤖 AI Summary
This study addresses the scarcity of temporally localized bird vocalization data in tropical soundscapes, which hinders the application of supervised learning to biodiversity monitoring. To bridge this gap, the authors introduce and release PteroSet, a curated, strongly annotated acoustic dataset of Neotropical bird vocalizations. It comprises 563 audio recordings (73.62 hours) collected at two sites in Colombia, with 15,372 time-frequency annotations, of which 6,702 are identified to the species level across 168 species. Annotations and audio metadata are unified in a COCO-inspired JSON schema. The work highlights acoustic co-occurrence and cross-site domain shift in tropical soundscapes, provides a deep learning baseline for binary bird detection, and positions the dataset as a realistic evaluation benchmark. The baseline demonstrates the dataset's usability for bird detection while exposing the challenges posed by complex acoustic environments.
📝 Abstract
Passive acoustic monitoring enables continuous, non-invasive biodiversity assessment across diverse ecosystems. The scale of these datasets has driven the adoption of machine learning, with supervised approaches showing strong performance. However, supervised methods require time-resolved annotated datasets, which remain scarce, especially in complex tropical soundscapes. We present PteroSet, a curated dataset of strongly annotated Neotropical bird vocalizations recorded in Puerto Asis (Putumayo) and Pivijay (Magdalena), Colombia, between 2023 and 2025. The dataset comprises 563 recordings (73.62 h) and 15,372 time-frequency annotations, including 6,702 events identified to the species level across 168 species. We release the annotations in a COCO-inspired JSON schema that unifies audio files, taxonomic categories, and labels for machine learning workflows. Beyond providing annotated data, PteroSet serves as a realistic benchmark that highlights key characteristics of tropical soundscapes, including acoustic co-occurrence and domain shift across recording sites. We provide a deep learning baseline for binary bird detection, demonstrating PteroSet's usability and the challenges it presents.
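To make the COCO-inspired layout more concrete, the sketch below shows one way such a file could link audio files, taxonomic categories, and time-frequency labels, and how overlapping events (acoustic co-occurrence) might be found within a recording. The abstract does not specify the schema, so every key name (`audios`, `categories`, `annotations`, `audio_id`, `category_id`, `t_min`, `t_max`, `f_min`, `f_max`), the file name, and the placeholder species names are assumptions for illustration only, not the published PteroSet format.

```python
import json
from collections import defaultdict

# Hypothetical COCO-inspired document. All key names and values are
# illustrative assumptions, not the actual PteroSet schema or data.
EXAMPLE = json.loads("""
{
  "audios": [
    {"id": 1, "file_name": "site_a_0001.wav", "duration": 60.0}
  ],
  "categories": [
    {"id": 10, "name": "Species A"},
    {"id": 11, "name": "Species B"}
  ],
  "annotations": [
    {"id": 100, "audio_id": 1, "category_id": 10,
     "t_min": 1.0, "t_max": 2.5, "f_min": 2000.0, "f_max": 6000.0},
    {"id": 101, "audio_id": 1, "category_id": 11,
     "t_min": 2.0, "t_max": 3.0, "f_min": 1500.0, "f_max": 4000.0},
    {"id": 102, "audio_id": 1, "category_id": null,
     "t_min": 10.0, "t_max": 11.0, "f_min": 3000.0, "f_max": 8000.0}
  ]
}
""")


def events_by_audio(doc):
    """Group annotations by the recording they belong to."""
    grouped = defaultdict(list)
    for ann in doc["annotations"]:
        grouped[ann["audio_id"]].append(ann)
    return dict(grouped)


def species_level_count(doc):
    """Count events that carry a species-level category (assumed non-null)."""
    return sum(1 for ann in doc["annotations"] if ann["category_id"] is not None)


def overlapping_pairs(events):
    """Return id pairs of events whose time intervals overlap."""
    pairs = []
    ordered = sorted(events, key=lambda e: e["t_min"])
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b["t_min"] >= a["t_max"]:
                break  # sorted by start time, so no later event overlaps a
            pairs.append((a["id"], b["id"]))
    return pairs
```

Under these assumptions, `overlapping_pairs(events_by_audio(EXAMPLE)[1])` lists the event pairs that co-occur in time within recording 1. The real release may use different keys or represent unidentified events differently, so the dataset's own documentation should take precedence.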