Knowing when to stop: insights from ecology for building catalogues, collections, and corpora

📅 2025-07-19
🤖 AI Summary
Problem: Assessing the completeness of Gregorian chant catalogues is a longstanding challenge in musicology. Method: This study applies Chao1, a nonparametric species richness estimator from ecology, to musical repertoire data for the first time, and complements it with an accumulation-curve analysis of sources indexed between 1993 and 2020. Contribution/Results: Upper bounds on repertoire coverage for the major chant genres range between 50% and 80%, with Mass Propers covered better than the Divine Office, though not overwhelmingly so. These bounds are probably not tight: a stable ~5% of chants in sources indexed between 1993 and 2020 was new, so diminishing returns in repertoire diversity are not yet to be expected. The approach shows that catalogue completeness can be assessed empirically rather than only qualitatively, and that the results can inform musicological data-gathering.

📝 Abstract
A major locus of musicological activity, increasingly in the digital domain, is the cataloguing of sources, which requires large-scale and long-lasting research collaborations. Yet the databases that aim to cover and represent musical repertoires are never quite complete, and scholars must contend with the question: how much are we still missing? This question structurally resembles the 'unseen species' problem in ecology, where the true number of species must be estimated from limited observations. In this case study, we apply the widely used Chao1 estimator to music for the first time, specifically to Gregorian chant. We find that, overall, upper bounds for repertoire coverage of the major chant genres range between 50% and 80%. As expected, Mass Propers are covered better than the Divine Office, though not overwhelmingly so. However, the accumulation curve suggests that those bounds are not tight: a stable ~5% of chants in sources indexed between 1993 and 2020 was new, so diminishing returns in terms of repertoire diversity are not yet to be expected. Our study demonstrates that these questions can be addressed empirically to inform musicological data-gathering, showing the potential of unseen species models in musicology.
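To make the estimator concrete, here is a minimal sketch of Chao1 and the coverage bound it implies. It assumes that each chant's "abundance" is the number of times it has been observed (for instance, the number of indexed sources containing it), and it uses the bias-corrected form of Chao1; the abstract does not state which variant or which abundance definition the authors use, and the toy data are invented. Because Chao1 is a lower bound on the true number of distinct items, observed richness divided by Chao1 is an upper bound on coverage, which matches how the abstract phrases its result.

```python
from collections import Counter


def chao1(abundances):
    """Bias-corrected Chao1 lower bound on the total number of distinct items.

    abundances: iterable of observation counts, one per observed item.
    """
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)                       # distinct items observed
    f1 = sum(1 for n in counts if n == 1)     # items seen exactly once
    f2 = sum(1 for n in counts if n == 2)     # items seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def coverage_upper_bound(abundances):
    """Observed richness over the Chao1 estimate: an upper bound on coverage."""
    counts = [n for n in abundances if n > 0]
    if not counts:
        raise ValueError("no observed items")
    return len(counts) / chao1(counts)


# Invented toy data: one chant identifier per observed occurrence.
occurrences = ["a", "a", "a", "b", "b", "c", "d", "e", "f", "f"]
abundances = list(Counter(occurrences).values())

print(chao1(abundances))                 # estimated lower bound on repertoire size
print(coverage_upper_bound(abundances))  # upper bound on coverage
```

In the toy data, three chants are seen once and two are seen twice, so the estimate adds one unseen chant to the six observed ones. The many singletons relative to doubletons are what push the estimate above the observed count.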
Problem

Research questions and friction points this paper is trying to address.

Estimating completeness of musical repertoire databases
Applying ecological unseen species models to musicology
Assessing coverage bounds for Gregorian chant genres
Innovation

Methods, ideas, or system contributions that make the work stand out.

Applying Chao1 estimator to musicology
Estimating Gregorian chant repertoire coverage
Using ecology models for music data-gathering
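
The "when to stop" question behind these points can be illustrated with a simple accumulation check: for each newly indexed source, what fraction of its chants was not already in the catalogue? The sketch below is one possible reading of the abstract's "~5% new" figure, not the authors' procedure; the function name, the per-source granularity, and the toy data are all assumptions made for illustration.

```python
def new_item_fractions(sources):
    """For each source, in indexing order, return the share of its distinct
    items that had not appeared in any earlier source."""
    seen = set()
    fractions = []
    for items in sources:
        unique = set(items)
        if not unique:
            fractions.append(0.0)  # an empty source contributes nothing new
            continue
        new = unique - seen
        fractions.append(len(new) / len(unique))
        seen |= unique
    return fractions


# Invented toy data: each inner list is the chant inventory of one source.
sources = [
    ["a", "b", "c", "d"],
    ["a", "b", "e", "f"],
    ["a", "c", "e", "g"],
]
print(new_item_fractions(sources))
```

A fraction that keeps falling toward zero would signal diminishing returns. A fraction that stays flat, as the abstract reports for sources indexed between 1993 and 2020, suggests that further indexing still uncovers new repertoire.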