🤖 AI Summary
Automatic speaker verification (ASV) systems remain insufficiently robust to spoofed audio from unseen, out-of-distribution (OOD) attacks. To address this, we propose an uncertainty-aware detection framework grounded in evidential deep learning (EDL). Specifically, we introduce Dirichlet evidence learning, presented as the first application of this paradigm to spoofed audio detection, to model predictive uncertainty explicitly and mitigate the overconfidence inherent in standard softmax outputs, enabling more reliable decisions under OOD conditions. Evaluated on the ASVspoof2019 LA and ASVspoof2021 LA benchmarks, the method significantly improves the performance of baseline models. Average predictive uncertainty correlates strongly with equal error rate (EER) across spoofing algorithms, which supports the validity and practical utility of the uncertainty estimates.
📝 Abstract
Recently, fake audio detection has gained significant attention, as advances in speech synthesis and voice conversion have made automatic speaker verification (ASV) systems more vulnerable to spoofing attacks. A key challenge in this task is generalizing models to detect unseen, out-of-distribution (OOD) attacks. Although existing approaches have shown promising results, they inherently suffer from overconfidence because they use softmax for classification, which can produce unreliable predictions when the model encounters unforeseen spoofing attempts. To address this limitation, we propose a novel framework called fake audio detection with evidential learning (FADEL). By modeling class probabilities with a Dirichlet distribution, FADEL incorporates model uncertainty into its predictions, leading to more robust performance in OOD scenarios. Experimental results on the ASVspoof2019 Logical Access (LA) and ASVspoof2021 LA datasets indicate that the proposed method significantly improves the performance of baseline models. Furthermore, we demonstrate the validity of the uncertainty estimates by showing a strong correlation between average uncertainty and equal error rate (EER) across different spoofing algorithms.
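To make the Dirichlet idea concrete, the following is a minimal sketch of how an evidential output layer is commonly formulated in the EDL literature. It is not taken from the FADEL paper: the abstract does not specify the evidence activation or the uncertainty formula, so the softplus activation, the relation alpha = evidence + 1, the uncertainty u = K / sum(alpha), and the function names `softplus` and `dirichlet_prediction` are all assumptions made for illustration.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); maps any real value to a positive one.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dirichlet_prediction(logits):
    """Turn raw network outputs into expected class probabilities and an
    uncertainty score, assuming the common EDL parameterization
    (an assumption; the abstract does not give these details)."""
    evidence = [softplus(z) for z in logits]   # non-negative evidence per class
    alpha = [e + 1.0 for e in evidence]        # Dirichlet concentration parameters
    strength = sum(alpha)                      # total Dirichlet strength S
    probs = [a / strength for a in alpha]      # expected class probabilities
    uncertainty = len(alpha) / strength        # u = K / S, lies in (0, 1]
    return probs, uncertainty


# Hypothetical two-class outputs (bona fide vs. spoof), chosen for illustration.
confident = dirichlet_prediction([8.0, -8.0])   # strong evidence for one class
unsure = dirichlet_prediction([-8.0, -8.0])     # almost no evidence for either
```

Under these assumptions, the first input yields low uncertainty because one class collects substantial evidence, while the second yields uncertainty close to 1 because neither class does. A softmax layer cannot express the second case as "no evidence": it is invariant to adding a constant to all logits, so `[-8.0, -8.0]` and `[8.0, 8.0]` map to the same probabilities.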