Score-Agnostic Structure Analysis in Large-Scale Performance Datasets

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing datasets of automatically transcribed piano performances often lack ground-truth scores. Structural inconsistencies between renditions of the same piece, arising from repeats or edition-specific variants, make it hard to group and compare them. This work proposes a score-free structural analysis method: it computes sequence-to-sequence alignments between transcriptions and extracts features indicative of structural divergence, namely the alignment cost and the (dis)similarity of performed sequence lengths. These features then drive hierarchical clustering, which groups performances that share a structural interpretation. The evaluation criterion shifts from ground-truth accuracy to musical coherence and plausibility, and the approach is demonstrated on a large-scale dataset of approximately 1,500 transcribed performances across 88 distinct works.

📝 Abstract

In recent years, thanks to advances in automatic music transcription (AMT), several large-scale datasets of automatically transcribed piano solo music have been released. While these datasets undoubtedly offer extensive material for performance studies, they vary substantially in quality. In the case of classical music, performances often differ not only in expressive aspects such as tempo, but also in their structural interpretation of the score (including repeat patterns and edition-specific variants). To meaningfully use large-scale transcribed datasets for performance research, transcriptions of the same piece must be grouped according to their underlying structural realisation to support valid comparison. We address this by applying sequence-to-sequence alignment followed by hierarchical clustering: we align every pair of transcriptions of a given piece, and use the alignment cost and the (dis)similarity of performed sequence lengths as features for grouping, so that structural mismatches are resolved into separate groups. We propose this approach as a first step towards automatically evaluating large-scale transcribed datasets that lack ground-truth score and/or audio, shifting the evaluation criterion from truth-based accuracy to musical coherence and plausibility. We demonstrate our score-agnostic approach on around 1,500 transcriptions of 88 compositions from a recently published large-scale transcribed piano performance dataset.
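
The pipeline described in the abstract (pairwise alignment, then clustering on alignment cost and length dissimilarity) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The abstract names neither the alignment algorithm nor the linkage criterion, so dynamic time warping over pitch sequences, an equal feature weighting (`w=0.5`), average linkage, and the cut-off `threshold=0.2` are all assumptions made for illustration, as are the function names and the toy data.

```python
from itertools import combinations

def alignment_cost(a, b):
    """Length-normalised DTW cost between two non-empty pitch sequences.
    Assumption: 0/1 local cost (0 if pitches match, 1 otherwise)."""
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [0.0] + [inf] * m  # row 0 of the accumulated-cost matrix
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            local = 0.0 if a[i - 1] == b[j - 1] else 1.0
            curr[j] = local + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m] / (n + m)

def length_dissimilarity(a, b):
    """0 for equal lengths, approaching 1 as the lengths diverge."""
    return 1.0 - min(len(a), len(b)) / max(len(a), len(b))

def pairwise_distances(seqs, w=0.5):
    """Symmetric distance matrix mixing both features (w is a hypothetical weight)."""
    k = len(seqs)
    dist = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        d = (w * alignment_cost(seqs[i], seqs[j])
             + (1 - w) * length_dissimilarity(seqs[i], seqs[j]))
        dist[i][j] = dist[j][i] = d
    return dist

def average_linkage(dist, threshold):
    """Agglomerative clustering: merge the closest pair of clusters until
    the smallest average inter-cluster distance exceeds the threshold."""
    clusters = [[i] for i in range(len(dist))]

    def link(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        if link(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return [sorted(c) for c in clusters]

# Toy data: one phrase played without and with its repeat, each also
# appearing once with a single wrong note (a stand-in for a transcription error).
phrase = [60, 62, 64, 65]
performances = [
    phrase,                    # 0: no repeat
    [60, 62, 64, 66],          # 1: no repeat, one wrong note
    phrase * 2,                # 2: with repeat
    phrase + [60, 62, 64, 66], # 3: with repeat, one wrong note
]
groups = average_linkage(pairwise_distances(performances), threshold=0.2)
print(groups)
```

On this toy input the two renditions without the repeat end up in one group and the two with the repeat in another, since a single wrong note costs far less than a missing repeat. On real transcriptions the weighting and the threshold would need tuning per piece, and a production pipeline would more likely rely on an established alignment and clustering library than on this quadratic pure-Python version.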

Problem

Research questions and friction points this paper is trying to address.

score-agnostic

structure analysis

performance datasets

structural interpretation

music transcription

Innovation

Methods, ideas, or system contributions that make the work stand out.

score-agnostic

sequence-to-sequence alignment

hierarchical clustering