DFKI-Speech System for WildSpoof Challenge: A robust framework for SASV In-the-Wild

📅 2026-02-02
🤖 AI Summary
This work addresses the vulnerability of automatic speaker verification (ASV) systems to spoofing attacks in real-world conditions by proposing a spoofing-aware speaker verification (SASV) framework in which a spoofing detector and a speaker verification network operate in tandem. The spoofing detector pairs a self-supervised speech embedding frontend with a graph neural network backend, and a top-3-layer mixture-of-experts (MoE) mechanism fuses high- and low-level features to improve detection of spoofed utterances. Speaker verification uses a low-complexity convolutional network that fuses 2D and 1D features at multiple scales, trained with the SphereFace loss, while a contrastive circle loss adaptively weights positive and negative pairs so that hard pairs contribute more than easy ones. AS-Norm score normalization with a fixed impostor cohort and model ensembling are applied on top. The system was developed for the SASV track of the WildSpoof Challenge.

📝 Abstract
This paper presents the DFKI-Speech system developed for the WildSpoof Challenge under the Spoofing-aware Automatic Speaker Verification (SASV) track. We propose a robust SASV framework in which a spoofing detector and a speaker verification (SV) network operate in tandem. The spoofing detector employs a self-supervised speech embedding extractor as the frontend, combined with a state-of-the-art graph neural network backend. In addition, a mixture-of-experts (MoE) over the top three layers is used to fuse high-level and low-level features for effective detection of spoofed utterances. For speaker verification, we adapt a low-complexity convolutional neural network that fuses 2D and 1D features at multiple scales, trained with the SphereFace loss. A contrastive circle loss is also applied to adaptively weight positive and negative pairs within each training batch, enabling the network to better distinguish between hard and easy sample pairs. Finally, AS-Norm score normalization with a fixed impostor cohort and model ensembling are used to further enhance the discriminative capability of the speaker verification system.
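The score normalization step can be illustrated with a small sketch of adaptive symmetric normalization (AS-Norm) over a fixed impostor cohort. This follows the commonly used AS-Norm definition rather than the authors' code; the helper names, the toy 3-dimensional embeddings, and `top_k=2` are assumptions made for illustration only.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_stats(embedding, cohort, top_k):
    # Mean and standard deviation of the top_k highest cohort scores,
    # i.e. the impostors that sound closest to this embedding.
    scores = sorted((cosine(embedding, c) for c in cohort), reverse=True)[:top_k]
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean, math.sqrt(variance) + 1e-8  # small epsilon avoids division by zero

def as_norm(enroll, test, cohort, top_k=2):
    # Normalize the raw trial score against both sides' cohort statistics
    # and average the two z-scores (the "symmetric" part).
    raw = cosine(enroll, test)
    mean_e, std_e = top_k_stats(enroll, cohort, top_k)
    mean_t, std_t = top_k_stats(test, cohort, top_k)
    return 0.5 * ((raw - mean_e) / std_e + (raw - mean_t) / std_t)

# Toy cohort of impostor embeddings, fixed across all trials.
cohort = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
enroll = [1.0, 0.2, 0.1]
target_score = as_norm(enroll, [0.9, 0.25, 0.1], cohort)
nontarget_score = as_norm(enroll, [0.1, 0.2, 1.0], cohort)
```

Because the cohort is fixed, its embeddings can be computed once and reused for every trial; only the top-scoring impostors selected for each enrollment or test utterance change from trial to trial.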
Problem

Research questions and friction points this paper is trying to address.

SASV
spoofing detection
speaker verification
in-the-wild
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spoofing-aware Speaker Verification
Graph Neural Network
Mixture-of-Experts
Contrastive Circle Loss
Self-supervised Embedding
Arnab Das
German Research Center for Artificial Intelligence (DFKI), Berlin, Germany; GretchenAI, Berlin, Germany
Yassine El Kheir
PhD Researcher, German Research Center for Artificial Intelligence (DFKI) & TU Berlin
Speech Deepfake Detection, Self-Supervised Learning, Pronunciation Assessment
Enes Erdem Erdogan
German Research Center for Artificial Intelligence (DFKI), Berlin, Germany; Technical University of Berlin, Berlin, Germany
Feidi Kallel
German Research Center for Artificial Intelligence (DFKI), Berlin, Germany; Technical University of Berlin, Berlin, Germany
Tim Polzehl
German Research Center for Artificial Intelligence
Speech and Language Technology
Sebastian Moeller
German Research Center for Artificial Intelligence (DFKI), Berlin, Germany; Technical University of Berlin, Berlin, Germany