🤖 AI Summary
This work addresses the challenge of limited labeled data hindering supervised learning in Indian art music analysis by proposing a graph-based semi-supervised approach. Specifically, it constructs a similarity graph from audio embeddings and employs transductive label propagation to diffuse information from a small set of annotated samples to a large pool of unlabeled data, enabling effective raga identification and instrument classification. To the best of our knowledge, this is the first study to introduce graph-based label propagation into Indian music analysis. By integrating a multi-source data fusion strategy, the proposed method significantly outperforms conventional baselines, generates high-quality pseudo-labels, and substantially reduces reliance on costly expert annotations while maintaining strong performance.
📝 Abstract
Supervised machine learning frameworks rely on extensive labeled datasets for robust performance on real-world tasks. However, large annotated datasets are scarce in the audio and music domains, because annotating such recordings is resource-intensive, laborious, and often requires expert domain knowledge. In this work, we explore the use of label propagation (LP), a graph-based semi-supervised learning technique, for automatically labeling the unlabeled set. By constructing a similarity graph over audio embeddings, we propagate limited label information from a small annotated subset to a larger unlabeled corpus in a transductive, semi-supervised setting. We apply this method to two tasks in Indian Art Music (IAM): raga identification and instrument classification. For both tasks, we integrate multiple public datasets along with additional recordings we acquire from Prasar Bharati Archives to perform LP. Our experiments demonstrate that LP significantly reduces labeling overhead and produces higher-quality annotations compared to conventional baseline methods, including those based on pretrained inductive models. These results highlight the potential of graph-based semi-supervised learning to democratize data annotation and accelerate progress in music information retrieval.
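To make the propagation step concrete, the following is a minimal sketch of clamped label propagation on a k-nearest-neighbour graph. It is an illustration under assumptions, not the pipeline used in this work: the abstract does not state the graph construction, kernel, or LP variant, so the Gaussian-weighted k-NN graph, the clamping rule, the function names `knn_affinity` and `propagate_labels`, the parameters `k`, `sigma`, and `n_iter`, and the 2-D toy points standing in for audio embeddings are all hypothetical choices.

```python
import math


def knn_affinity(embeddings, k=2, sigma=1.0):
    """Build a symmetric k-NN graph with Gaussian edge weights (assumed kernel)."""
    n = len(embeddings)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Distances from item i to every other item, nearest first.
        dists = sorted(
            (math.dist(embeddings[i], embeddings[j]), j)
            for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            w = math.exp(-(d * d) / (2 * sigma * sigma))
            # Keep the edge if either endpoint selects the other as a neighbour.
            W[i][j] = max(W[i][j], w)
            W[j][i] = max(W[j][i], w)
    return W


def propagate_labels(W, labels, n_classes, n_iter=200):
    """labels[i] is a class index, or None for an unlabeled item."""
    n = len(W)
    # One-hot rows for labeled items, all-zero rows for unlabeled items.
    F = [[0.0] * n_classes for _ in range(n)]
    for i, y in enumerate(labels):
        if y is not None:
            F[i][y] = 1.0
    for _ in range(n_iter):
        new_F = []
        for i in range(n):
            if labels[i] is not None:
                new_F.append(F[i])  # clamp: annotated items keep their label
                continue
            degree = sum(W[i]) or 1.0  # guard against isolated nodes
            # Weighted average of the neighbours' current label scores.
            new_F.append([
                sum(W[i][j] * F[j][c] for j in range(n)) / degree
                for c in range(n_classes)
            ])
        F = new_F
    # Pseudo-label = class with the highest propagated score.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Toy stand-in for audio embeddings: two well-separated clusters,
# with a single annotated item in each (hypothetical data).
embeddings = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.05),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
]
labels = [0, None, None, None, 1, None, None, None]

predicted = propagate_labels(knn_affinity(embeddings), labels, n_classes=2)
print(predicted)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

In this toy case the two annotated items are enough to label all eight, because each cluster forms its own connected component of the graph. The setting is transductive: pseudo-labels are assigned only to items already in the graph, so a new recording would need to be added to the graph and propagation rerun.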