🤖 AI Summary
The WildSpoof Challenge addresses speech spoofing and anti-spoofing in realistic, uncontrolled environments. It targets two intertwined tasks on in-the-wild data: text-to-speech (TTS) synthesis for generating spoofed speech, and spoofing-robust automatic speaker verification (SASV) for detecting it.
Method: The challenge runs two parallel tracks, one for TTS-based spoof generation and one for SASV, both built on in-the-wild data rather than conventional clean and controlled datasets. The organizers coordinate both tracks and define the data protocols, while participants treat the tracks as separate and independent tasks.
Contribution/Results: The challenge promotes the use of in-the-wild data for both TTS and SASV, and encourages collaboration between the spoofing generation and spoofing detection communities. Its stated aim is to foster more integrated, robust, and realistic systems for real-world conditions.
📝 Abstract
The WildSpoof Challenge aims to advance the use of in-the-wild data in two intertwined speech processing tasks. It consists of two parallel tracks: (1) Text-to-Speech (TTS) synthesis for generating spoofed speech, and (2) Spoofing-robust Automatic Speaker Verification (SASV) for detecting spoofed speech. While the organizers coordinate both tracks and define the data protocols, participants treat them as separate and independent tasks. The primary objectives of the challenge are: (i) to promote the use of in-the-wild data for both TTS and SASV, moving beyond conventional clean and controlled datasets and considering real-world scenarios; and (ii) to encourage interdisciplinary collaboration between the spoofing generation (TTS) and spoofing detection (SASV) communities, thereby fostering the development of more integrated, robust, and realistic systems.