AI Summary
Optical music recognition (OMR) of medieval music manuscripts faces severe challenges due to their historical complexity and extreme scarcity of high-quality annotated data. Method: This paper proposes an iterative object detection framework integrating active learning (AL) and sequential learning (SL), built upon YOLOv8. It introduces an uncertainty-driven sample selection strategy augmented with layout-structural priors to improve sampling quality. Evaluation is conducted on a newly constructed manuscript dataset from the Anonymous project. Contribution/Results: With only ~10% of annotated samples, the method achieves 98.2% of the full-supervision baseline's mAP, substantially alleviating the annotation bottleneck. Furthermore, the study identifies the failure mechanisms of conventional uncertainty metrics on historical manuscripts and proposes domain-adapted improvements. This work establishes a reusable methodological paradigm for low-resource historical document analysis.
Abstract
Optical Music Recognition (OMR) is a cornerstone of music digitization initiatives in cultural heritage, yet it remains limited by the scarcity of annotated data and the complexity of historical manuscripts. In this paper, we present a preliminary study of Active Learning (AL) and Sequential Learning (SL) tailored for object detection and layout recognition in a medieval music manuscript. Leveraging YOLOv8, our system selects the samples with the highest uncertainty (lowest prediction confidence) for iterative labeling and retraining. Our approach starts with a single annotated image and boosts performance while minimizing manual labeling. Experimental results indicate that accuracy comparable to fully supervised training can be achieved with significantly fewer labeled examples. We test the methodology as a preliminary investigation on a novel dataset offered to the community by the Anonymous project, which studies laude, a poetical-musical genre that spread across Italy between the 12th and 16th centuries. We show that, for the manuscript at hand, uncertainty-based AL is not effective, and we advocate for more usable methods in data-scarcity scenarios.
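The selection step described in the abstract (pick the samples with the lowest prediction confidence for the next labeling round) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-image aggregation (one minus the mean box confidence), the rule that an image with no detections counts as maximally uncertain, and the names `image_uncertainty`, `select_most_uncertain`, and the `folio_*.png` files are all assumptions made for the example. In a real loop the confidence lists would come from the detector's predictions on the unlabeled pool.

```python
from statistics import mean


def image_uncertainty(confidences):
    """Score one image as 1 - mean detection confidence.

    Assumption for this sketch: an image with no detections at all
    is treated as maximally uncertain (score 1.0).
    """
    if not confidences:
        return 1.0
    return 1.0 - mean(confidences)


def select_most_uncertain(predictions, k):
    """Return the k image names with the highest uncertainty score.

    `predictions` maps an image name to the list of confidence
    scores of the boxes predicted on that image.
    """
    ranked = sorted(
        predictions,
        key=lambda name: image_uncertainty(predictions[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical unlabeled pool with made-up confidence scores.
pool = {
    "folio_01.png": [0.95, 0.91, 0.88],
    "folio_02.png": [0.42, 0.55],
    "folio_03.png": [],
    "folio_04.png": [0.70, 0.65, 0.30],
}

# These images would be sent to the annotator, added to the
# training set, and the detector retrained before the next round.
selected = select_most_uncertain(pool, k=2)
print(selected)
```

Averaging confidences is only one possible aggregation; taking the minimum confidence per image, or weighting by class, would rank the pool differently, which is one reason a confidence-based criterion can behave poorly on a given manuscript.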