SongSong: A Time Phonograph for Chinese SongCi Music from Thousand of Years Away

📅 2025-04-11

🏛️ AAAI Conference on Artificial Intelligence

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

This work addresses the automatic generation of music for ancient Chinese SongCi (Song dynasty lyric poetry), which existing models, focused mainly on modern pop songs, struggle to reproduce because of its distinct rhythms and styles. The authors propose SongSong, a staged generation model that first predicts a melody from the input SongCi text, then separately generates the singing voice and the accompaniment from that melody, and finally combines them into a complete piece. They describe SongSong as, to their knowledge, the first music generation model capable of restoring SongCi music, and they introduce OpenSongSong, a dataset of ancient Chinese SongCi music containing 29.9 hours of compositions. In an evaluation on 85 randomly selected SongCi sentences that were not part of the training set, SongSong outperforms music generation platforms such as Suno and SkyMusic on both subjective and objective measures.

📝 Abstract

Recently, there have been significant advancements in music generation. However, existing models primarily focus on creating modern pop songs, which makes it challenging to produce ancient music with distinct rhythms and styles, such as ancient Chinese SongCi. In this paper, we introduce SongSong, which is, to our knowledge, the first music generation model capable of restoring Chinese SongCi. Our model first predicts the melody from the input SongCi, then separately generates the singing voice and the accompaniment based on that melody, and finally combines all elements to create the final piece of music. Additionally, to address the lack of ancient music datasets, we create OpenSongSong, a comprehensive dataset of ancient Chinese SongCi music featuring 29.9 hours of compositions by various renowned SongCi music masters. To assess SongSong's proficiency in performing SongCi, we randomly select 85 SongCi sentences that were not part of the training set and use them to compare SongSong against music generation platforms such as Suno and SkyMusic. The subjective and objective results indicate that our proposed model achieves leading performance in generating high-quality SongCi music.
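
The staged flow described in the abstract (melody first, then voice and accompaniment generated separately from that melody, then a final mix) can be illustrated with a toy sketch. This is not the SongSong model: every function name, the sample rate, the five-note scale, and the sine-wave "synthesis" are hypothetical placeholders chosen only to show how the stages connect, with each stage standing in for what would be a learned model.

```python
import math
from dataclasses import dataclass
from typing import List

SAMPLE_RATE = 16_000  # assumed sample rate for this toy example


@dataclass
class Note:
    pitch_hz: float
    duration_s: float


def predict_melody(lyrics: str) -> List[Note]:
    """Placeholder for stage 1: assign one note per character.

    A real system would use a learned melody predictor; here the pitch is
    picked deterministically from an arbitrary five-note scale.
    """
    scale = [261.63, 293.66, 329.63, 392.00, 440.00]  # C4 D4 E4 G4 A4
    return [
        Note(scale[ord(ch) % len(scale)], 0.25)
        for ch in lyrics
        if not ch.isspace()
    ]


def render(notes: List[Note], gain: float, octave_shift: int = 0) -> List[float]:
    """Render notes as plain sine waves (a stand-in for a neural synthesizer)."""
    samples: List[float] = []
    for note in notes:
        freq = note.pitch_hz * (2 ** octave_shift)
        n = int(note.duration_s * SAMPLE_RATE)
        samples.extend(
            gain * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)
        )
    return samples


def synthesize_voice(melody: List[Note]) -> List[float]:
    """Placeholder for stage 2a: the singing voice follows the melody."""
    return render(melody, gain=0.6)


def synthesize_accompaniment(melody: List[Note]) -> List[float]:
    """Placeholder for stage 2b: accompaniment doubles the melody an octave lower."""
    return render(melody, gain=0.3, octave_shift=-1)


def mix(voice: List[float], accompaniment: List[float]) -> List[float]:
    """Stage 3: sum the two tracks sample by sample."""
    return [v + a for v, a in zip(voice, accompaniment)]


def generate_song(lyrics: str) -> List[float]:
    melody = predict_melody(lyrics)                   # stage 1
    voice = synthesize_voice(melody)                  # stage 2a
    accompaniment = synthesize_accompaniment(melody)  # stage 2b
    return mix(voice, accompaniment)                  # stage 3


song = generate_song("明月几时有")
```

The point of the structure is that both tracks are conditioned on the same predicted melody, so they stay aligned when mixed; in this sketch that alignment is trivial because both tracks are rendered from the same note list.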

Problem

Research questions and friction points this paper is trying to address.

ancient Chinese music

SongCi

music generation

traditional rhythm

cultural heritage

Innovation

Methods, ideas, or system contributions that make the work stand out.

SongCi music generation

melody prediction

singing voice synthesis