Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches

📅 2026-01-20
🤖 AI Summary
This work presents a comprehensive survey of deep-learning-based singing voice synthesis (SVS), a field that has lacked a systematic review. It first categorizes existing systems by task type, then organizes current architectures into two major paradigms, cascaded and end-to-end, and analyzes the core technologies for singing modeling and control. It also consolidates the datasets, annotation tools, and evaluation benchmarks used for training and assessment, giving researchers and engineers a structured reference. The authors make related materials available at https://github.com/David-Pigeon/SyntheticSingers.

📝 Abstract
Recent advances in singing voice synthesis (SVS) have attracted substantial attention from both academia and industry. With the advent of large language models and novel generative paradigms, producing controllable, high-fidelity singing voices has become an attainable goal. Yet the field still lacks a comprehensive survey that systematically analyzes deep-learning-based SVS systems and their enabling technologies. To address this gap, this survey first categorizes existing systems by task type and then organizes current architectures into two major paradigms: cascaded and end-to-end approaches. We then provide an in-depth analysis of core technologies, covering singing modeling and control techniques. Finally, we review the datasets, annotation tools, and evaluation benchmarks that support training and assessment. In the appendix, we introduce training strategies and provide further discussion of SVS. This survey offers an up-to-date review of the literature on SVS models, intended as a reference for both researchers and engineers. Related materials are available at https://github.com/David-Pigeon/SyntheticSingers.
Problem

Research questions and friction points this paper is trying to address.

singing voice synthesis
deep learning
systematic survey
generative models
voice controllability
Innovation

Methods, ideas, or system contributions that make the work stand out.

singing voice synthesis
deep learning
cascaded architecture
end-to-end model
voice control