Diffusion Reconstruction towards Generalizable Audio Deepfake Detection

📅 2026-04-29
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the limited generalization of audio deepfake detection models to unseen attacks by generating hard examples through diffusion-based reconstruction. The synthesized hard samples augment the training data, and the detector combines multi-layer feature aggregation with a Regularization-Assisted Contrastive Learning (RACL) objective to better discriminate previously unseen forgery types. Experiments show that the framework significantly reduces the average Equal Error Rate (EER) across multiple cross-domain and unseen-attack scenarios, outperforming state-of-the-art baselines.
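As a rough illustration of multi-layer feature aggregation, the sketch below combines per-layer encoder features with a softmax-weighted sum. The summary does not say how the paper aggregates layers, so the weighting scheme, the function name `aggregate_layers`, and the array shapes are all assumptions made for illustration only.

```python
import numpy as np

def aggregate_layers(layer_feats, layer_logits):
    """Softmax-weighted sum over encoder layers (assumed scheme, not the paper's).

    layer_feats:  array of shape (L, T, D), one (T, D) feature map per layer.
    layer_logits: array of shape (L,), one score per layer (learnable in practice).
    Returns an array of shape (T, D).
    """
    feats = np.asarray(layer_feats, dtype=float)
    logits = np.asarray(layer_logits, dtype=float)
    # Numerically stable softmax over the layer axis.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Contract the layer axis: sum_l weights[l] * feats[l].
    return np.tensordot(weights, feats, axes=(0, 0))
```

With equal logits the result is the plain average of the layers; in a trained model the logits would be learned so that more informative layers receive larger weights.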
📝 Abstract
Robust generalization to unseen attacks remains a challenge in Audio Deepfake Detection (ADD) because generative models evolve rapidly. To address this, we propose a framework centered on hard sample classification. The core idea is that a model able to distinguish hard samples is inherently equipped to handle simpler cases. We investigate multiple reconstruction paradigms and identify the diffusion-based method as optimal for generating hard samples. Furthermore, we leverage multi-layer feature aggregation and introduce a Regularization-Assisted Contrastive Learning (RACL) objective to enhance generalizability. Experiments demonstrate the superior generalization of our approach, with our best model achieving a significant reduction in the average Equal Error Rate (EER) compared to the baseline.
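The results are reported as Equal Error Rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how EER can be estimated from detector scores follows. It assumes that higher scores mean "more likely bona fide" and picks the threshold where the two rates are closest; the function name `compute_eer` is hypothetical, and this is not the evaluation script used in the paper.

```python
import numpy as np

def compute_eer(bonafide_scores, spoof_scores):
    """Estimate EER and its threshold; higher score = more likely bona fide."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every distinct observed score, sorted ascending.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: bona fide samples scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed samples scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # EER is taken where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return float(eer), float(thresholds[idx])
```

An average EER, as quoted in the abstract, would then be the mean of this value over the individual evaluation sets.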
Problem

Research questions and friction points this paper is trying to address.

Audio Deepfake Detection
Generalization
Unseen Attacks
Generative Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion reconstruction
hard sample generation
audio deepfake detection
contrastive learning
generalizable detection