Generating Piano Music with Transformers: A Comparative Study of Scale, Data, and Metrics

📅 2025-11-10
🤖 AI Summary
Problem: The relationship between Transformer design choices and generation quality in symbolic music generation remains poorly understood. Method: We introduce the first systematic comparative framework, quantitatively evaluating the impact of datasets, model scale (up to 950M parameters), architectural variants, and multi-scale training strategies on piano music generation. Our evaluation integrates MIDI sequence modeling, human listening studies, and multidimensional automated metrics. Contribution/Results: We uncover for the first time the nonlinear scaling laws governing how model size and data diversity jointly affect generation quality. We also find only limited correlation between standard automatic metrics and human judgments. The best-performing model achieves a 68% misclassification rate in Turing tests, significantly surpassing prior baselines, and demonstrates substantial improvements in stylistic diversity and musicality.

📝 Abstract
Although a variety of transformers have been proposed for symbolic music generation in recent years, there has been little comprehensive study of how specific design choices affect the quality of the generated music. In this work, we systematically compare different datasets, model architectures, model sizes, and training strategies for the task of symbolic piano music generation. To support model development and evaluation, we examine a range of quantitative metrics and analyze how well they correlate with human judgments collected through listening studies. Our best-performing model, a 950M-parameter transformer trained on 80K MIDI files from diverse genres, produces outputs that are often rated as human-composed in a Turing-style listening survey.
Problem

Research questions and friction points this paper is trying to address.

Comparing transformer design choices for piano music generation
Evaluating quantitative metrics against human musical judgment
Developing large-scale transformer models for realistic piano composition
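
As an illustration of the second point, one common way to check whether an automatic metric tracks human judgment is a rank correlation between the two. The sketch below computes Spearman's rank correlation in plain Python. It is not taken from the paper: the function names, the choice of Spearman's coefficient, and the toy numbers are all assumptions made for illustration, and the source does not state which correlation measure the authors used.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data (NOT from the paper): one automatic metric value and
# one mean listener rating per generated piece.
metric_scores = [0.12, 0.35, 0.28, 0.50, 0.41]
human_ratings = [2.1, 3.0, 3.4, 4.2, 3.9]

rho = spearman(metric_scores, human_ratings)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near 1 or -1 would suggest the metric orders pieces much as listeners do, while a value near 0 would match the kind of limited correlation the summary reports. A rank-based measure is used here only because listener ratings are ordinal, which is a design assumption of this sketch.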
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer models for symbolic music generation
Systematic comparison of datasets and architectures
Large-scale model trained on diverse MIDI datasets
Jonathan Lehmkuhl
RWTH Aachen University
Ábel Ilyés-Kun
RWTH Aachen University
Nico Bremes
RWTH Aachen University
Cemhan Kaan Özaltan
RWTH Aachen University
Frederik Muthers
RWTH Aachen University
Jiayi Yuan
Rice University
Machine Learning, Large Language Models