🤖 AI Summary
This work addresses the lack of reproducible and robust evaluation benchmarks for Russian speech anti-spoofing. To this end, we introduce RuASD, a novel dataset comprising spoofed utterances generated by 37 Russian text-to-speech and voice cloning systems alongside multi-source genuine speech. The dataset incorporates controlled channel distortions—including reverberation, additive noise/music, and codec transcoding—to simulate realistic distribution shifts encountered in practical deployments. Using this benchmark, we systematically evaluate the performance of lightweight supervised models, graph attention networks, self-supervised learning (SSL) detectors, and large-scale pre-trained systems under both clean and perturbed conditions. RuASD constitutes the first large-scale, multi-source, perturbation-controlled, and reproducible benchmark for Russian anti-spoofing, offering comprehensive insights into the generalization and robustness of current approaches. The dataset is publicly released to foster further research.
📝 Abstract
RuASD (Russian AntiSpoofing Dataset) is a dedicated, reproducible benchmark for Russian-language speech anti-spoofing designed to evaluate both in-domain discrimination and robustness to deployment-style distribution shifts. It combines a large spoof subset synthesized using 37 modern Russian-capable TTS and voice-cloning systems with a bona fide subset curated from multiple heterogeneous open Russian speech corpora, enabling systematic evaluation across diverse data sources. To emulate typical dissemination and channel effects in a controlled and reproducible manner, RuASD includes configurable simulations of platform and transmission distortions, including room reverberation, additive noise/music, and a range of speech-codec transcodings implemented via a unified processing chain. We benchmark a diverse set of publicly available anti-spoofing countermeasures spanning lightweight supervised architectures, graph-attention models, SSL-based detectors, and large-scale pretrained systems, and report reference results under both clean and simulated conditions to characterize robustness under realistic perturbation pipelines. The dataset is publicly available at \href{https://huggingface.co/datasets/MTUCI/RuASD}{\underline{Hugging Face}} and \href{https://modelscope.cn/datasets/lab260/RuASD}{\underline{ModelScope}}.
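As an illustration of one of the perturbation types named above, additive noise is commonly applied by scaling a noise recording so that the mixture reaches a chosen signal-to-noise ratio. The following is a minimal sketch of that general technique, not the RuASD processing chain itself: the function name `mix_at_snr`, the synthetic test signals, and the 10 dB target are all assumptions made for the example.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with noise scaled to the requested SNR (dB).

    Hypothetical helper for illustration; not taken from the RuASD code.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")

    # Choose gain so that speech_power / (gain**2 * noise_power) = 10**(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise


# Synthetic stand-ins for a 1 s utterance and a shorter noise clip at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = 0.5 * np.sin(2 * np.pi * 220.0 * t)
noise = rng.normal(size=4000)

noisy = mix_at_snr(speech, noise, snr_db=10.0)

# Measure the SNR actually achieved by the mixture.
residual = noisy - speech
achieved_snr_db = 10 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2))
```

Fixing the SNR per utterance, rather than adding noise at a fixed amplitude, keeps the perturbation strength comparable across recordings of different loudness, which is one reason this formulation is often used for controlled robustness tests.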