FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning

📅 2025-04-06
🏛️ IEEE International Conference on Acoustics, Speech, and Signal Processing
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current automatic speaker verification (ASV) systems lack robustness against unknown spoofed audio, i.e., out-of-distribution (OOD) attacks. To address this, the paper proposes FADEL, an uncertainty-aware detection framework grounded in evidential deep learning (EDL). By modeling class probabilities with a Dirichlet distribution, reportedly the first application of this paradigm to spoofed audio detection, FADEL explicitly quantifies predictive uncertainty and mitigates the overconfidence inherent in standard softmax outputs, enabling more reliable decisions under OOD conditions. Evaluated on the ASVspoof2019 LA and ASVspoof2021 LA benchmarks, the method significantly outperforms existing baselines. Empirical analysis further reveals a strong correlation between average predictive uncertainty and equal error rate (EER) across spoofing algorithms, supporting both the validity and the practical utility of the uncertainty estimates.

📝 Abstract
Recently, fake audio detection has gained significant attention, as advancements in speech synthesis and voice conversion have increased the vulnerability of automatic speaker verification (ASV) systems to spoofing attacks. A key challenge in this task is generalizing models to detect unseen, out-of-distribution (OOD) attacks. Although existing approaches have shown promising results, they inherently suffer from overconfidence issues due to the usage of softmax for classification, which can produce unreliable predictions when encountering unpredictable spoofing attempts. To deal with this limitation, we propose a novel framework called fake audio detection with evidential learning (FADEL). By modeling class probabilities with a Dirichlet distribution, FADEL incorporates model uncertainty into its predictions, thereby leading to more robust performance in OOD scenarios. Experimental results on the ASVspoof2019 Logical Access (LA) and ASVspoof2021 LA datasets indicate that the proposed method significantly improves the performance of baseline models. Furthermore, we demonstrate the validity of uncertainty estimation by analyzing a strong correlation between average uncertainty and equal error rate (EER) across different spoofing algorithms.
Problem

Research questions and friction points this paper is trying to address.

Detecting unseen out-of-distribution fake audio attacks
Addressing overconfidence in softmax-based classification models
Improving uncertainty-aware robustness in spoofing detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses evidential deep learning for uncertainty
Models class probabilities with Dirichlet distribution
Improves robustness in out-of-distribution scenarios
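The Dirichlet-based uncertainty idea above can be sketched in a few lines. The following is a minimal NumPy illustration of the standard evidential deep learning recipe (non-negative evidence via softplus, Dirichlet parameters alpha = evidence + 1, uncertainty u = K / S), not the authors' released implementation; the function name and interface are hypothetical.

```python
import numpy as np

def evidential_prediction(logits):
    """Map raw classifier outputs to Dirichlet parameters and derive
    expected class probabilities plus a scalar uncertainty.

    Standard EDL recipe (cf. Sensoy et al., 2018), assumed here:
      evidence e_k = softplus(logit_k) >= 0
      alpha_k    = e_k + 1
      S          = sum_k alpha_k  (Dirichlet strength)
      p_k        = alpha_k / S    (expected probability)
      u          = K / S          (uncertainty mass, in (0, 1])
    """
    logits = np.asarray(logits, dtype=float)
    evidence = np.log1p(np.exp(logits))      # softplus, keeps evidence >= 0
    alpha = evidence + 1.0                   # Dirichlet concentration parameters
    strength = alpha.sum()                   # total evidence S
    probs = alpha / strength                 # expected class probabilities
    uncertainty = len(alpha) / strength      # K / S: high when evidence is scarce
    return probs, uncertainty

# With almost no evidence for either class (bona fide vs. spoof),
# the prediction is near-uniform and uncertainty approaches 1.
probs, u = evidential_prediction([-20.0, -20.0])
```

A softmax would still emit a confident-looking distribution for an OOD input; here the scalar `u` flags the lack of evidence explicitly, which is the property the paper correlates with per-attack EER.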
👥 Authors
Ju Yeon Kang, Seoul National University (deep learning, speech signal processing)
Ji Won Yoon, Korea University (Bayesian inference, information security, hardware/physical security, statistical signal processing)
Semin Kim, Department of Electrical and Computer Engineering and INMC, Seoul National University, Seoul, South Korea
Mingrui Han, Department of Electrical and Computer Engineering and INMC, Seoul National University, Seoul, South Korea
Nam Soo Kim, Seoul National University, Department of Electrical and Computer Engineering