Time-Shifted Token Scheduling for Symbolic Music Generation

2025-09-28
Citations: 0
Influential: 0
AI Summary
In symbolic music generation, fine-grained tokenization ensures high-fidelity modeling at the cost of computational efficiency, whereas compact (compound) tokenization improves decoding speed but hinders intra-token dependency modeling. To address this trade-off, we propose a dynamic positional (DP) token scheduling mechanism that autoregressively unfolds compound tokens during decoding, enabling explicit modeling of internal token structure without additional parameters. DP integrates seamlessly into existing representation frameworks via a delayed scheduling strategy. Experiments on a symphonic MIDI dataset demonstrate that our method significantly enhances musical structural coherence and generation quality over standard compound tokenization, while substantially narrowing the performance gap with fine-grained tokenization. Crucially, DP achieves this improvement without compromising decoding efficiency, thus reconciling high-fidelity modeling with scalable inference.

๐Ÿ“ Abstract
Symbolic music generation faces a fundamental trade-off between efficiency and quality. Fine-grained tokenizations achieve strong coherence but incur long sequences and high complexity, while compact tokenizations improve efficiency at the expense of intra-token dependencies. To address this, we adapt a delay-based scheduling mechanism (DP) that expands compound-like tokens across decoding steps, enabling autoregressive modeling of intra-token dependencies while preserving efficiency. Notably, DP is a lightweight strategy that introduces no additional parameters and can be seamlessly integrated into existing representations. Experiments on symbolic orchestral MIDI datasets show that our method improves all metrics over standard compound tokenizations and narrows the gap to fine-grained tokenizations.
Problem

Research questions and friction points this paper is trying to address.

Addressing efficiency-quality trade-off in symbolic music generation
Resolving intra-token dependencies in compact musical tokenizations
Improving autoregressive modeling of compound musical tokens
Innovation

Methods, ideas, or system contributions that make the work stand out.

Delay-based scheduling expands tokens across steps
Lightweight strategy with no extra parameters added
Seamlessly integrates into existing music representations
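The delay-based scheduling idea can be illustrated with a minimal sketch. It assumes each compound token is a fixed-length list of fields (the pitch, duration, velocity ordering used here is hypothetical) and that field k is shifted by exactly k decoding steps; the paper's actual field layout and shift schedule may differ. The helper names `apply_delay` and `undo_delay` are illustrative only and do not come from the paper.

```python
from typing import List, Optional

# Placeholder for grid positions where no field is scheduled (an assumption
# of this sketch; a real model would use a dedicated padding token id).
PAD: Optional[int] = None


def apply_delay(compound: List[List[int]]) -> List[List[Optional[int]]]:
    """Shift field k of every compound token k steps later.

    compound[t][k] is field k of note t. The result has T + K - 1 rows,
    and row s holds field k of note s - k (or PAD if that note is absent).
    """
    if not compound:
        return []
    num_fields = len(compound[0])
    num_steps = len(compound) + num_fields - 1
    delayed: List[List[Optional[int]]] = [
        [PAD] * num_fields for _ in range(num_steps)
    ]
    for t, token in enumerate(compound):
        for k, value in enumerate(token):
            delayed[t + k][k] = value
    return delayed


def undo_delay(delayed: List[List[Optional[int]]]) -> List[List[Optional[int]]]:
    """Invert apply_delay, recovering the original compound tokens."""
    if not delayed:
        return []
    num_fields = len(delayed[0])
    num_tokens = len(delayed) - num_fields + 1
    return [
        [delayed[t + k][k] for k in range(num_fields)]
        for t in range(num_tokens)
    ]


# Two hypothetical notes, each a [pitch, duration, velocity] compound token.
notes = [[60, 4, 80], [64, 2, 72]]
grid = apply_delay(notes)
for step, row in enumerate(grid):
    print(step, row)
```

Under these assumptions, the duration of a note is emitted one step after its pitch and the velocity one step after that, so an autoregressive decoder can condition each field on the earlier fields of the same note. A sequence of T compound tokens with K fields takes T + K - 1 decoding steps, compared with T for plain compound tokens and T × K for a fully flattened fine-grained sequence.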
Ting-Kang Wang
Graduate Institute of Communication Engineering, National Taiwan University, Taiwan
Chih-Pin Tan
Graduate Institute of Communication Engineering, National Taiwan University, Taiwan
Yi-Hsuan Yang
National Taiwan University
Music information retrieval, Music Generation, Music Processing, Music AI, Affective computing