🤖 AI Summary
This study addresses the challenge of jointly ensuring structural controllability and musical coherence in automated symbolic full-song generation. To this end, we propose the Segmented Full-Song Model (SFS), which factorizes a song into user-defined structural segments (e.g., verse, chorus) and generates each segment while selectively attending to related segments. This segment-level factorization with selective cross-segment attention is intended to preserve long-term structural consistency while keeping each segment locally fluent. Built upon the Transformer architecture, the model takes a user-provided song structure and an optional short seed segment, and is wrapped in a web-based piano roll application that supports seed-guided generation, customizable structures, flexible generation order, and iterative editing. SFS is reported to achieve higher quality and efficiency than prior work on full-song generation, and the web application demonstrates its suitability for human-AI collaborative composition.
📝 Abstract
We propose the Segmented Full-Song Model (SFS) for symbolic full-song generation. The model accepts a user-provided song structure and an optional short seed segment that anchors the main idea around which the song is developed. By factorizing a song into segments and generating each one through selective attention to related segments, the model achieves higher quality and efficiency than prior work. To demonstrate its suitability for human-AI interaction, we further wrap SFS into a web application that enables users to iteratively co-create music on a piano roll with customizable structures and flexible ordering.
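To make the idea of "selective attention to related segments" more concrete, the following is a minimal sketch of a segment-level attention mask. The abstract does not specify which segments count as related, so the rule used here (the immediately preceding segment plus the most recent earlier segment with the same label) is an assumption for illustration only, and the names `related_segments` and `segment_attention_mask` are hypothetical rather than part of SFS.

```python
from typing import List, Set


def related_segments(labels: List[str], i: int) -> Set[int]:
    """Return indices of earlier segments that segment i may attend to.

    Assumed rule (not taken from the paper): the immediately preceding
    segment, plus the most recent earlier segment sharing the same label.
    """
    related: Set[int] = set()
    if i > 0:
        related.add(i - 1)  # previous segment, for local continuity
    for j in range(i - 1, -1, -1):
        if labels[j] == labels[i]:
            related.add(j)  # latest earlier occurrence of the same section
            break
    return related


def segment_attention_mask(labels: List[str]) -> List[List[bool]]:
    """Build a segment-by-segment mask; mask[i][j] is True if i may attend to j."""
    n = len(labels)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        mask[i][i] = True  # a segment always sees itself
        for j in related_segments(labels, i):
            mask[i][j] = True
    return mask


# Hypothetical user-provided song structure.
structure = ["A", "B", "A", "B", "C"]
mask = segment_attention_mask(structure)
```

Under this assumed rule, each segment attends to at most two earlier segments regardless of song length, which is one plausible way a segment-factorized model could reduce attention cost compared with attending to the entire preceding song.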