AFSS: Artifact-Focused Self-Synthesis for Mitigating Bias in Audio Deepfake Detection

πŸ“… 2026-03-27
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Existing audio deepfake detectors generalize poorly because they rely on specific spoofing datasets. To address this, this work proposes a self-synthesis strategy that requires no pre-collected fake samples. By leveraging self-conversion and self-reconstruction mechanisms, the approach generates pseudo-spoofed audio while preserving speaker identity and semantic content, thereby guiding the model to focus on generation artifacts rather than irrelevant distractors. A same-speaker constraint combined with a learnable reweighted loss dynamically sharpens the model's sensitivity to synthetic traces. Evaluated across seven benchmark datasets, the method achieves an average equal error rate (EER) of 5.45%, with notably low EERs of 1.23% on WaveFake and 2.70% on In-the-Wild, significantly outperforming current state-of-the-art techniques.
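The page does not spell out the form of the learnable reweighted loss. The sketch below is one plausible reading, under stated assumptions: a trainable scalar `alpha` whose sigmoid scales the loss contribution of pseudo-fake samples, so training can dynamically emphasize synthetic traces. The function name, the sigmoid parameterization, and the per-class weighting are illustrative assumptions, not the paper's implementation.

```python
import math

def reweighted_bce(scores, labels, alpha):
    """Hypothetical sketch of a learnable reweighted loss (NOT the
    paper's code): binary cross-entropy where pseudo-fake terms
    (label 0) are scaled by w = sigmoid(alpha). In training, alpha
    would be a parameter optimized alongside the detector.

    scores: raw logits, higher = more likely bona fide.
    labels: 1 = bona fide (real), 0 = pseudo-fake.
    """
    w = 1.0 / (1.0 + math.exp(-alpha))  # weight on pseudo-fake terms
    total = 0.0
    for s, y in zip(scores, labels):
        p = 1.0 / (1.0 + math.exp(-s))  # predicted P(bona fide)
        nll = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += nll if y == 1 else w * nll
    return total / len(scores)
```

Raising `alpha` pushes `w` toward 1, so misclassified pseudo-fakes contribute more to the gradient; lowering it discounts them.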
πŸ“ Abstract
The rapid advancement of generative models has enabled highly realistic audio deepfakes, yet current detectors suffer from a critical bias problem, leading to poor generalization across unseen datasets. This paper proposes Artifact-Focused Self-Synthesis (AFSS), a method designed to mitigate this bias by generating pseudo-fake samples from real audio via two mechanisms: self-conversion and self-reconstruction. The core insight of AFSS lies in enforcing same-speaker constraints, ensuring that real and pseudo-fake samples share identical speaker identity and semantic content. This forces the detector to focus exclusively on generation artifacts rather than irrelevant confounding factors. Furthermore, we introduce a learnable reweighting loss to dynamically emphasize synthetic samples during training. Extensive experiments across 7 datasets demonstrate that AFSS achieves state-of-the-art performance with an average EER of 5.45%, including a significant reduction to 1.23% on WaveFake and 2.70% on In-the-Wild, all while eliminating the dependency on pre-collected fake datasets. Our code is publicly available at https://github.com/NguyenLeHaiSonGit/AFSS.
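All reported numbers use the equal error rate (EER): the operating point at which the false-acceptance rate (spoofs accepted) equals the false-rejection rate (bona fide audio rejected). The function below is a minimal pure-Python sketch of this standard metric, not code from the AFSS repository.

```python
def compute_eer(scores, labels):
    """Equal error rate via a threshold sweep over the scores.

    scores: higher = more likely bona fide.
    labels: 1 = bona fide, 0 = spoof.
    Returns the rate at which FAR == FRR (lower is better).
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(label for _, label in pairs)
    n_neg = len(pairs) - n_pos
    rejected_pos = 0          # bona fide samples below the threshold
    accepted_neg = n_neg      # spoof samples above the threshold
    eer, gap = 1.0, float("inf")
    for _, label in pairs:    # raise the threshold past each score
        if label == 1:
            rejected_pos += 1
        else:
            accepted_neg -= 1
        frr = rejected_pos / n_pos
        far = accepted_neg / n_neg
        if abs(far - frr) < gap:
            gap = abs(far - frr)
            eer = (far + frr) / 2
    return eer
```

A perfectly separating detector yields an EER of 0.0; the paper's 1.23% on WaveFake means FAR and FRR meet at roughly 0.0123.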
Problem

Research questions and friction points this paper is trying to address.

Audio Deepfake Detection
Bias Mitigation
Generalization
Artifact Focus
Dataset Bias
Innovation

Methods, ideas, or system contributions that make the work stand out.

Audio Deepfake Detection
Bias Mitigation
Self-Synthesis
Generation Artifacts
Same-Speaker Constraint
Hai-Son Nguyen-Le
University of Science, Ho Chi Minh City, Vietnam
Hung-Cuong Nguyen-Thanh
University of Science, Ho Chi Minh City, Vietnam
Nhien-An Le-Khac
Associate Professor of Digital Forensics and Cyber Security, University College Dublin
Digital Forensics, Cybersecurity, AI Security, AI Forensics, Knowledge Engineering
Dinh-Thuc Nguyen
University of Science, Ho Chi Minh City, Vietnam
Hong-Hanh Nguyen-Le
University College Dublin, School of Computer Science, Dublin, Ireland