AFSS: Artifact-Focused Self-Synthesis for Mitigating Bias in Audio Deepfake Detection

πŸ“… 2026-03-27
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Existing audio deepfake detectors generalize poorly because they rely on specific spoofing datasets. To address this, this work proposes a self-synthesis strategy that requires no pre-collected fake samples. By leveraging self-conversion and self-reconstruction mechanisms, the approach generates pseudo-spoofed audio while preserving speaker identity and semantic content, thereby guiding the model to focus on generation artifacts rather than irrelevant distractors. A same-speaker constraint combined with a learnable reweighted loss dynamically sharpens the model's sensitivity to synthetic traces. Evaluated across seven benchmark datasets, the method achieves an average equal error rate (EER) of 5.45%, with notably low EERs of 1.23% on WaveFake and 2.70% on In-the-Wild, significantly outperforming current state-of-the-art techniques.
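The page does not spell out the form of the learnable reweighted loss. The sketch below is one plausible reading, under stated assumptions: a trainable scalar `alpha` whose sigmoid scales the loss contribution of pseudo-fake samples, so training can dynamically emphasize synthetic traces. The function name, the sigmoid parameterization, and the per-class weighting are illustrative assumptions, not the paper's implementation.

```python
import math

def reweighted_bce(scores, labels, alpha):
    """Hypothetical sketch of a learnable reweighted loss (NOT the
    paper's code): binary cross-entropy where pseudo-fake terms
    (label 0) are scaled by w = sigmoid(alpha). In training, alpha
    would be a parameter optimized alongside the detector.

    scores: raw logits, higher = more likely bona fide.
    labels: 1 = bona fide (real), 0 = pseudo-fake.
    """
    w = 1.0 / (1.0 + math.exp(-alpha))  # weight on pseudo-fake terms
    total = 0.0
    for s, y in zip(scores, labels):
        p = 1.0 / (1.0 + math.exp(-s))  # predicted P(bona fide)
        nll = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += nll if y == 1 else w * nll
    return total / len(scores)
```

Raising `alpha` pushes `w` toward 1, so misclassified pseudo-fakes contribute more to the gradient; lowering it discounts them.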
πŸ“ Abstract
The rapid advancement of generative models has enabled highly realistic audio deepfakes, yet current detectors suffer from a critical bias problem, leading to poor generalization across unseen datasets. This paper proposes Artifact-Focused Self-Synthesis (AFSS), a method designed to mitigate this bias by generating pseudo-fake samples from real audio via two mechanisms: self-conversion and self-reconstruction. The core insight of AFSS lies in enforcing same-speaker constraints, ensuring that real and pseudo-fake samples share identical speaker identity and semantic content. This forces the detector to focus exclusively on generation artifacts rather than irrelevant confounding factors. Furthermore, we introduce a learnable reweighting loss to dynamically emphasize synthetic samples during training. Extensive experiments across 7 datasets demonstrate that AFSS achieves state-of-the-art performance with an average EER of 5.45%, including a significant reduction to 1.23% on WaveFake and 2.70% on In-the-Wild, all while eliminating the dependency on pre-collected fake datasets. Our code is publicly available at https://github.com/NguyenLeHaiSonGit/AFSS.
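All reported numbers use the equal error rate (EER): the operating point at which the false-acceptance rate (spoofs accepted) equals the false-rejection rate (bona fide audio rejected). The function below is a minimal pure-Python sketch of this standard metric, not code from the AFSS repository.

```python
def compute_eer(scores, labels):
    """Equal error rate via a threshold sweep over the scores.

    scores: higher = more likely bona fide.
    labels: 1 = bona fide, 0 = spoof.
    Returns the rate at which FAR == FRR (lower is better).
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(label for _, label in pairs)
    n_neg = len(pairs) - n_pos
    rejected_pos = 0          # bona fide samples below the threshold
    accepted_neg = n_neg      # spoof samples above the threshold
    eer, gap = 1.0, float("inf")
    for _, label in pairs:    # raise the threshold past each score
        if label == 1:
            rejected_pos += 1
        else:
            accepted_neg -= 1
        frr = rejected_pos / n_pos
        far = accepted_neg / n_neg
        if abs(far - frr) < gap:
            gap = abs(far - frr)
            eer = (far + frr) / 2
    return eer
```

A perfectly separating detector yields an EER of 0.0; the paper's 1.23% on WaveFake means FAR and FRR meet at roughly 0.0123.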
Problem

Research questions and friction points this paper is trying to address.

Audio Deepfake Detection
Bias Mitigation
Generalization
Artifact Focus
Dataset Bias
Innovation

Methods, ideas, or system contributions that make the work stand out.

Audio Deepfake Detection
Bias Mitigation
Self-Synthesis
Generation Artifacts
Same-Speaker Constraint
Hai-Son Nguyen-Le
University of Science, Ho Chi Minh City, Vietnam
Hung-Cuong Nguyen-Thanh
University of Science, Ho Chi Minh City, Vietnam
Nhien-An Le-Khac
Associate Professor of Digital Forensics and Cyber Security, University College Dublin
Digital Forensics, Cybersecurity, AI Security, AI Forensics, Knowledge Engineering
Dinh-Thuc Nguyen
University of Science, Ho Chi Minh City, Vietnam
Hong-Hanh Nguyen-Le
University College Dublin, School of Computer Science, Dublin, Ireland