ClonEval: An Open Voice Cloning Benchmark

📅 2025-04-29
🤖 AI Summary
This paper addresses the lack of unified, fair, and reproducible evaluation standards for voice cloning TTS models by introducing ClonEval, the first open-source voice cloning benchmark. Methodologically, it establishes an end-to-end automated evaluation framework that quantifies three core dimensions: speaker similarity, naturalness, and robustness. The framework incorporates ASR-/SSL-based speaker verification, MOS prediction models, adversarial sample generation, and cross-lingual generalization assessment. Key contributions include: (1) a standardized evaluation protocol; (2) a lightweight, open-source Python evaluation library; and (3) a dynamic, transparent, and continuously updated community leaderboard. Experiments across 12 state-of-the-art models show a strong correlation between automatic scores and human MOS ratings (Spearman ρ = 0.92), significantly improving evaluation efficiency and reproducibility.
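To make the reported agreement statistic concrete, the following is a minimal sketch of how a Spearman rank correlation between automatic scores and human MOS ratings could be computed. It is not taken from the benchmark's library: the function names `rank` and `spearman` and the sample scores are hypothetical and serve only to illustrate the statistic.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-model scores (illustrative values, not results from the paper).
automatic_scores = [0.61, 0.74, 0.58, 0.80, 0.69]
human_mos = [3.1, 3.9, 3.4, 4.2, 3.6]
rho = spearman(automatic_scores, human_mos)
```

A value of ρ close to 1 means the automatic metric orders the models almost exactly as human listeners do, which is the property a leaderboard ranking relies on.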

📝 Abstract
We present a novel benchmark for voice cloning text-to-speech models. The benchmark consists of an evaluation protocol, an open-source library for assessing the performance of voice cloning models, and an accompanying leaderboard. The paper discusses design considerations and presents a detailed description of the evaluation procedure. The usage of the software library is explained, along with the organization of results on the leaderboard.
Problem

Research questions and friction points this paper is trying to address.

Evaluating the performance of voice cloning TTS models
Providing an open-source library for voice cloning assessment
Establishing a leaderboard for voice cloning model comparison
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open benchmark for voice cloning models
Evaluation protocol and open-source library
Leaderboard for performance comparison
Iwona Christop
Adam Mickiewicz University, Poznań
Tomasz Kuczyński
Adam Mickiewicz University, Poznań
Marek Kubis
Adam Mickiewicz University in Poznań
discourse analysis, dialogue modeling, natural language processing, computational lexical semantics