Evaluating Generalization and Robustness in Russian Anti-Spoofing: The RuASD Initiative

📅 2026-03-31
🤖 AI Summary
This work addresses the lack of reproducible and robust evaluation benchmarks for Russian speech anti-spoofing. To this end, we introduce RuASD, a novel dataset comprising spoofed utterances generated by 37 Russian text-to-speech and voice cloning systems alongside multi-source genuine speech. The dataset incorporates controlled channel distortions—including reverberation, additive noise/music, and codec transcoding—to simulate realistic distribution shifts encountered in practical deployments. Using this benchmark, we systematically evaluate the performance of lightweight supervised models, graph attention networks, self-supervised learning (SSL) detectors, and large-scale pre-trained systems under both clean and perturbed conditions. RuASD constitutes the first large-scale, multi-source, perturbation-controlled, and reproducible benchmark for Russian anti-spoofing, offering comprehensive insights into the generalization and robustness of current approaches. The dataset is publicly released to foster further research.
📝 Abstract
RuASD (Russian AntiSpoofing Dataset) is a dedicated, reproducible benchmark for Russian-language speech anti-spoofing designed to evaluate both in-domain discrimination and robustness to deployment-style distribution shifts. It combines a large spoof subset synthesized using 37 modern Russian-capable TTS and voice-cloning systems with a bona fide subset curated from multiple heterogeneous open Russian speech corpora, enabling systematic evaluation across diverse data sources. To emulate typical dissemination and channel effects in a controlled and reproducible manner, RuASD includes configurable simulations of platform and transmission distortions, including room reverberation, additive noise/music, and a range of speech-codec transcodings implemented via a unified processing chain. We benchmark a diverse set of publicly available anti-spoofing countermeasures spanning lightweight supervised architectures, graph-attention models, SSL-based detectors, and large-scale pretrained systems, and report reference results on both clean and simulated conditions to characterize robustness under realistic perturbation pipelines. The dataset is publicly available on Hugging Face (https://huggingface.co/datasets/MTUCI/RuASD) and ModelScope (https://modelscope.cn/datasets/lab260/RuASD).
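The channel distortions named in the abstract can be illustrated with a minimal sketch. It is not the RuASD processing chain: the function names, the reverb-then-noise ordering, the 10 dB SNR, and the synthetic signals are all assumptions made for illustration, and codec transcoding is left out because it depends on an external encoder.

```python
import numpy as np


def add_noise_at_snr(speech, noise, snr_db):
    """Mix `noise` into `speech` so the mixture has the requested SNR (dB)."""
    # Loop or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that scales the noise to the target signal-to-noise ratio.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise


def apply_reverb(speech, rir):
    """Convolve with a room impulse response, keeping the input length."""
    rir = rir / np.max(np.abs(rir))  # normalize so the largest tap is 1
    return np.convolve(speech, rir)[: len(speech)]


def perturb(speech, rir, noise, snr_db):
    """Reverberate, then add noise (this ordering is an assumption)."""
    return add_noise_at_snr(apply_reverb(speech, rir), noise, snr_db)


# Synthetic stand-ins for an utterance, a noise clip, and an impulse response.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
speech = 0.5 * np.sin(2 * np.pi * 220 * t)
noise = rng.normal(size=sr // 2)
rir = np.exp(-np.arange(800) / 100.0) * rng.normal(size=800)
rir[0] = 1.0  # direct-path tap
perturbed = perturb(speech, rir, noise, snr_db=10.0)
```

Fixing the random seed and the perturbation parameters is what makes a chain like this repeatable, so clean and perturbed scores for a given detector can be compared on exactly the same audio.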
Problem

Research questions and friction points this paper is trying to address.

anti-spoofing
generalization
robustness
distribution shift
Russian speech
Innovation

Methods, ideas, or system contributions that make the work stand out.

anti-spoofing
distribution shift robustness
speech synthesis detection
reproducible benchmark
channel distortion simulation
Ksenia Lysikova
Moscow Technical University of Communications and Informatics, Moscow, Russia, 111038
Kirill Borodin
MTUCI
deep learning for audio, gen AI, safe AI
Grach Mkrtchian
MTUCI
Artificial Intelligence, Algorithms, Data Structures