AI Summary
This work addresses the scarcity of high-quality, fine-grained, and permissively licensed benchmarks in music information retrieval by introducing a new evaluation resource comprising 1,574 high-fidelity music excerpts, 500 diverse natural-language queries, and over 125,000 relevance annotations. The benchmark achieves high inter-annotator consistency through a multi-stage annotation pipeline that combines human expertise with automated validation. It is the first large-scale dataset of its kind released under the permissive Creative Commons Attribution 4.0 (CC BY 4.0) license, and it supports both strict and lenient evaluation protocols. The authors publicly release two dataset variants alongside standardized query prompts, substantially improving the reliability, reproducibility, and fairness of music retrieval model evaluation.
Abstract
Multimodal Information Retrieval has made significant progress in recent years, leveraging the increasingly strong multimodal abilities of deep pre-trained models to represent information across modalities. Music Information Retrieval (MIR), in particular, has considerably increased in quality, with neural representations of music even making their way into everyday consumer products. However, there is a lack of high-quality benchmarks for evaluating music retrieval performance. To address this issue, we introduce **IncompeBench**, a carefully annotated benchmark comprising 1,574 permissively licensed, high-quality music snippets, 500 diverse queries, and over 125,000 individual relevance judgements. The annotations were created with a multi-stage pipeline, resulting in high agreement between human annotators and the generated data. The resulting datasets are publicly available at https://huggingface.co/datasets/mixedbread-ai/incompebench-strict and https://huggingface.co/datasets/mixedbread-ai/incompebench-lenient, with the prompts available at https://github.com/mixedbread-ai/incompebench-programs.
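The strict and lenient variants can be thought of as two thresholds applied to the same graded relevance judgements. A minimal sketch of that idea follows; the field names, the 0–3 grade scale, and the `binarize` helper are illustrative assumptions, not the benchmark's actual schema:

```python
# Hypothetical sketch: collapsing graded relevance judgements into binary
# labels under a "strict" or "lenient" protocol. The 0-3 grade scale and
# snippet identifiers below are assumed for illustration only.

def binarize(judgements: dict[str, int], protocol: str = "strict") -> dict[str, int]:
    """Map graded relevance scores to binary relevance labels.

    strict : only the highest grade counts as relevant.
    lenient: any non-zero grade counts as relevant.
    """
    threshold = 3 if protocol == "strict" else 1
    return {doc_id: int(grade >= threshold) for doc_id, grade in judgements.items()}

qrels = {"snippet_01": 3, "snippet_02": 1, "snippet_03": 0}
print(binarize(qrels, "strict"))   # only the grade-3 snippet is relevant
print(binarize(qrels, "lenient"))  # every snippet with grade >= 1 is relevant
```

A retrieval model would then be scored against either label set, so the two released dataset variants let practitioners choose how forgiving the evaluation should be.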