Towards Trustworthy Audio Deepfake Detection: A Systematic Framework for Diagnosing and Mitigating Gender Bias

📅 2026-05-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses significant fairness disparities across gender groups in audio deepfake detection systems and the lack of any systematic diagnosis of where those disparities originate. The work proposes the first “diagnose-then-mitigate” framework and finds that bias stems from acoustic representation differences, gender information leakage in learned features, and asymmetric evaluation protocols, rather than from imbalanced training data. Based on these findings, the authors introduce gender-specific decision threshold tuning, a novel epoch-level fairness regularization method, and an adversarial debiasing strategy. Experiments on the ASVSpoof5 dataset with the AASIST and Wav2Vec2+ResNet18 models show that gender-aware threshold adjustment reduces unfairness by 54%–75% without compromising detection accuracy, and that the proposed regularization outperforms existing batch-level approaches. The study underscores that precise bias diagnosis is needed to select an effective debiasing intervention.
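One common way to check for gender information leakage in features is a probe: fit a simple classifier that predicts gender from the embeddings and compare its held-out accuracy to chance. The sketch below is only an illustration under stated assumptions. It uses a nearest-centroid probe on synthetic embeddings; the function names, the data generator, and the choice of probe are hypothetical and are not taken from the paper.

```python
import random

def centroid(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    dim = len(rows[0])
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def probe_accuracy(train, test):
    """Nearest-centroid gender probe.

    train/test are lists of (embedding, gender) pairs, gender in {0, 1}.
    Accuracy far above 0.5 on held-out data suggests the embeddings leak gender.
    """
    c0 = centroid([e for e, g in train if g == 0])
    c1 = centroid([e for e, g in train if g == 1])
    correct = sum((sq_dist(e, c1) < sq_dist(e, c0)) == (g == 1) for e, g in test)
    return correct / len(test)

def synthetic_embeddings(rng, n, dim, leaky_dims, shift):
    """Hypothetical data: gender shifts the first `leaky_dims` coordinates by `shift`."""
    rows = []
    for i in range(n):
        g = i % 2  # alternate genders so both classes are balanced
        e = [rng.gauss(shift * g if d < leaky_dims else 0.0, 1.0) for d in range(dim)]
        rows.append((e, g))
    return rows

rng = random.Random(0)
# Embeddings where two coordinates carry gender information.
leaky_acc = probe_accuracy(synthetic_embeddings(rng, 400, 16, 2, 3.0),
                           synthetic_embeddings(rng, 400, 16, 2, 3.0))
# Embeddings with no gender signal at all.
clean_acc = probe_accuracy(synthetic_embeddings(rng, 400, 16, 0, 0.0),
                           synthetic_embeddings(rng, 400, 16, 0, 0.0))
print(f"probe accuracy, leaky features: {leaky_acc:.2f}")
print(f"probe accuracy, clean features: {clean_acc:.2f}")
```

In this toy setup the leakage is concentrated in two coordinates. Running the same probe on subsets of coordinates or on individual layers is one plausible way to tell concentrated leakage from leakage spread across the whole representation.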

📝 Abstract

Audio deepfake detection systems are increasingly deployed in high-stakes security applications, yet their fairness across demographic groups remains critically underexamined. Prior work measures gender disparity but does not investigate where it comes from or how to fix it systematically. We present the first diagnosis-first framework that identifies the source of bias before applying targeted mitigation, evaluated with two models, AASIST and Wav2Vec2+ResNet18, on ASVSpoof5. Our diagnosis shows that bias does not stem from imbalanced training data but from acoustic representation differences, gender leakage in learned features, and structural evaluation asymmetry. We test mitigation strategies across in-processing, post-processing, and combined families, including novel methods introduced in this work. Adjusting the decision threshold separately per gender reduces unfairness by 54% to 75% at no cost to detection accuracy, and our new epoch-level fairness regularisation method outperforms existing per-batch approaches. Adversarial debiasing succeeds only when gender leakage is localised and fails when it is diffuse, an outcome correctly predicted by our diagnosis before training. No single method fully closes the fairness gap, confirming that bias sources must be identified before fixes are applied and that fairer benchmark design is equally important.
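The per-gender threshold adjustment described in the abstract can be illustrated with a small post-processing sketch. Everything below is an assumption made for illustration: the data is synthetic, higher scores mean "more likely spoof", unfairness is measured as the gap in false positive rate (FPR) between groups, and each group's threshold is chosen to meet a common target FPR. The paper's actual fairness metric and tuning procedure may differ.

```python
import random

def fpr(scores, labels, threshold):
    """Fraction of bona fide clips (label 0) scored at or above the threshold."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        return 0.0
    return sum(s >= threshold for s in negatives) / len(negatives)

def threshold_for_target_fpr(scores, labels, target_fpr):
    """Smallest observed score whose FPR does not exceed target_fpr."""
    candidates = sorted(set(scores))
    for t in candidates:  # FPR is non-increasing as the threshold rises
        if fpr(scores, labels, t) <= target_fpr:
            return t
    return candidates[-1] + 1e-9  # no candidate works: flag nothing

def make_group(rng, n, bona_fide_mean, spoof_mean):
    """Hypothetical detector scores for n bona fide and n spoofed clips."""
    scores, labels = [], []
    for _ in range(n):
        scores.append(rng.gauss(bona_fide_mean, 1.0)); labels.append(0)
        scores.append(rng.gauss(spoof_mean, 1.0)); labels.append(1)
    return scores, labels

rng = random.Random(0)
# Assumed disparity: bona fide scores for one group sit closer to the spoof range.
data = {"female": make_group(rng, 500, 0.5, 3.0),
        "male": make_group(rng, 500, 0.0, 3.0)}

# One shared threshold for everyone.
shared_t = 1.5
shared_fpr = {g: fpr(s, y, shared_t) for g, (s, y) in data.items()}
gap_shared = abs(shared_fpr["female"] - shared_fpr["male"])

# One threshold per group, each tuned to the same target FPR.
target = 0.05
group_t = {g: threshold_for_target_fpr(s, y, target) for g, (s, y) in data.items()}
group_fpr = {g: fpr(s, y, group_t[g]) for g, (s, y) in data.items()}
gap_group = abs(group_fpr["female"] - group_fpr["male"])

print(f"FPR gap with shared threshold:     {gap_shared:.3f}")
print(f"FPR gap with per-group thresholds: {gap_group:.3f}")
```

Because this only changes the decision rule applied to existing scores, it requires no retraining. It does require group labels for the tuning data and at decision time, and in practice the thresholds should be tuned on a held-out set rather than on the evaluation data.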

Problem

Research questions and friction points this paper is trying to address.

audio deepfake detection

gender bias

fairness

bias diagnosis

trustworthy AI

Innovation

Methods, ideas, or system contributions that make the work stand out.

gender bias diagnosis

audio deepfake detection

fairness-aware machine learning