GAN-based Content-Conditioned Generation of Handwritten Musical Symbols

📅 2025-10-16
🤖 AI Summary
Problem: Severe scarcity of annotated handwritten historical music scores critically limits optical music recognition (OMR) performance. Method: We propose a content-conditioned generative adversarial network (cGAN) that enables, for the first time, semantically controllable synthesis of handwritten musical symbols. The model takes structured score information (note class, duration, and spatial position) as conditional inputs to generate high-fidelity symbol images; these are then placed consistently with the staff layout and rendered via the Smashcima toolkit, forming an end-to-end synthetic pipeline. Contribution/Results: Generated symbols exhibit significantly improved visual fidelity and contextual consistency over prior methods. When used to augment training data, they substantially boost OMR accuracy on real handwritten scores. This work establishes a scalable, interpretable, and high-quality synthetic data paradigm for low-resource OMR tasks.
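To make the idea of conditional inputs concrete, the following is a minimal sketch of how a condition (symbol class, duration, position) might be encoded and joined with a noise vector before being fed to a generator. It is not the authors' implementation: the vocabularies `SYMBOL_CLASSES` and `DURATIONS`, the helper `build_generator_input`, the normalized `staff_position`, and the noise dimension are all assumptions chosen for illustration, and the generator network itself is omitted.

```python
import random

# Hypothetical label vocabularies; the paper's actual label sets are not given here.
SYMBOL_CLASSES = ["notehead", "rest", "clef", "accidental"]
DURATIONS = ["whole", "half", "quarter", "eighth"]


def one_hot(value, vocabulary):
    """Return a one-hot list marking the position of value in vocabulary."""
    if value not in vocabulary:
        raise ValueError(f"unknown label: {value!r}")
    return [1.0 if item == value else 0.0 for item in vocabulary]


def build_generator_input(symbol_class, duration, staff_position, noise_dim=8, rng=None):
    """Concatenate a Gaussian noise vector with the encoded condition.

    staff_position is assumed (for this sketch) to be normalized to [0, 1].
    """
    if not 0.0 <= staff_position <= 1.0:
        raise ValueError("staff_position must lie in [0, 1]")
    rng = rng or random.Random()
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    condition = (
        one_hot(symbol_class, SYMBOL_CLASSES)
        + one_hot(duration, DURATIONS)
        + [staff_position]
    )
    return noise + condition


vector = build_generator_input("notehead", "quarter", 0.5, rng=random.Random(0))
print(len(vector))  # 8 noise values + 4 + 4 + 1 condition values = 17
```

In a conditional GAN the same condition vector is typically also shown to the discriminator, so that it can penalize images that look realistic but do not match the requested symbol.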

📝 Abstract
The field of Optical Music Recognition (OMR) is currently hindered by the scarcity of real annotated data, particularly when dealing with handwritten historical musical scores. In similar fields, such as Handwritten Text Recognition, it was proven that synthetic examples produced with image generation techniques could help to train better-performing recognition architectures. This study explores the generation of realistic, handwritten-looking scores by implementing a music symbol-level Generative Adversarial Network (GAN) and assembling its output into a full score using the Smashcima engraving software. We have systematically evaluated the visual fidelity of these generated samples, concluding that the generated symbols exhibit a high degree of realism, marking significant progress in synthetic score generation.
Problem

Research questions and friction points this paper is trying to address.

Addresses handwritten musical symbol scarcity in OMR
Generates synthetic handwritten scores using GANs
Enhances training data for optical music recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

GAN generates handwritten musical symbols
Assembles symbols into scores using Smashcima
Systematically evaluates visual fidelity of output
Authors

Gerard Asbert
Computer Vision Center, Barcelona, Spain
Pau Torras
Computer Vision Center, Barcelona, Spain
Lei Kang
Computer Vision Center, Barcelona, Spain
Alicia Fornés
Computer Vision Center, Barcelona, Spain
Josep Lladós
Computer Vision Center, Universitat Autònoma de Barcelona
Computer Vision, Pattern Recognition, Document Analysis