🤖 AI Summary
Problem: Generative music modeling lacks high-quality, open-source datasets of authentic popular music; mainstream resources rely on synthetic audio, re-recordings, or unscreened large-scale audio corpora, resulting in semantically impoverished content, stylistic distortion, and low community adoption. Method: We introduce the first large-scale, open-source dataset specifically designed for generative music modeling, systematically curating over 9 million authentic, commercially released popular songs (including works by globally renowned artists) and supporting diverse tasks including text-to-music generation, singing voice synthesis, melody reconstruction, and cross-modal retrieval. Our approach features copyright-compliant sampling, multi-source metadata alignment, and dual-stage quality filtering based on both audio fidelity and musical structure. Contribution/Results: The dataset enables robust cross-modal (text/audio/score) joint modeling and empirically yields substantial improvements in generation naturalness and stylistic consistency, advancing state-of-the-art performance across multiple benchmarks.
📝 Abstract
We present Sleeping-DISCO 9M, a large-scale pre-training dataset for music and song. To the best of our knowledge, there is no open-source, high-quality dataset representing popular and well-known songs for generative music modeling tasks such as text-to-music generation, music captioning, singing-voice synthesis, melody reconstruction, and cross-modal retrieval. Past contributions have focused either on synthetic or re-recorded music corpora built around isolated, constrained factors (e.g. GTSinger, M4Singer) or on arbitrarily large-scale audio datasets (e.g. DISCO-10M and LAIONDISCO-12M). Unfortunately, adoption of these datasets in the generative music community has been limited, as they fail to reflect real-world music and its flavour. Our dataset changes this narrative: it is constructed from actual popular music by world-renowned artists.