🤖 AI Summary
Audio deepfake detection models degrade when continuously exposed to novel attacks, and existing replay-based continual learning approaches still forget prior knowledge because their memory buffers lack sufficient sample diversity.
Method: We propose an auxiliary label-guided diversified replay mechanism. A label generation network produces auxiliary labels that enrich the semantic and acoustic diversity of the audio samples selected for the memory buffer, thereby mitigating feature bias and alleviating knowledge forgetting.
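The buffer-selection idea can be sketched as follows. This is a minimal, hypothetical greedy scheme (round-robin over auxiliary labels); the function name and selection details are illustrative assumptions, not the paper's actual RAIS implementation:

```python
import random
from collections import defaultdict

def diversified_buffer_update(samples, aux_labels, buffer_size):
    """Fill a rehearsal buffer by spreading the budget evenly across
    auxiliary labels, so no single acoustic cluster dominates.
    (Illustrative sketch; RAIS's real sampling strategy may differ.)"""
    # Group candidate samples by their network-generated auxiliary label.
    by_label = defaultdict(list)
    for sample, label in zip(samples, aux_labels):
        by_label[label].append(sample)
    # Shuffle each group so ties are broken randomly.
    pools = {label: random.sample(group, len(group))
             for label, group in by_label.items()}
    buffer = []
    # Round-robin over auxiliary labels until the budget is exhausted.
    while len(buffer) < buffer_size and any(pools.values()):
        for label in list(pools):
            if pools[label] and len(buffer) < buffer_size:
                buffer.append(pools[label].pop())
    return buffer
```

With two equally sized auxiliary-label groups and a budget of four, the buffer ends up holding two samples from each group, regardless of shuffle order.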
Contribution/Results: Evaluated on a five-stage incremental attack benchmark, our approach achieves a mean equal error rate (EER) of 1.953%, substantially outperforming state-of-the-art methods. It introduces the first auxiliary information–driven replay sampling paradigm for audio deepfake detection and releases open-source code.
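For reference, the EER metric reported above is the operating point where the false-acceptance and false-rejection rates are equal. A standard threshold-sweep computation (generic metric code, not the paper's evaluation script) looks like:

```python
import numpy as np

def equal_error_rate(scores, labels):
    """EER: the rate at the threshold where false-accept rate (spoof
    accepted as bona fide) equals false-reject rate (bona fide rejected).
    scores: higher = more bona fide-like; labels: 1 bona fide, 0 spoof."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    best_gap, eer = float("inf"), None
    for t in np.sort(np.unique(scores)):
        far = np.mean(scores[labels == 0] >= t)  # spoofs accepted
        frr = np.mean(scores[labels == 1] < t)   # bona fide rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

On perfectly separated scores the EER is 0; the paper's reported 1.953% means the two error rates cross at roughly 0.02.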
📝 Abstract
The performance of existing audio deepfake detection frameworks degrades when confronted with new deepfake attacks. Rehearsal-based continual learning (CL), which updates models using a limited set of old data samples, helps preserve prior knowledge while incorporating new information. However, existing rehearsal techniques do not effectively capture the diversity of audio characteristics, introducing bias and increasing the risk of forgetting. To address this challenge, we propose Rehearsal with Auxiliary-Informed Sampling (RAIS), a rehearsal-based CL approach for audio deepfake detection. RAIS employs a label generation network to produce auxiliary labels, guiding diverse sample selection for the memory buffer. Extensive experiments show RAIS outperforms state-of-the-art methods, achieving an average Equal Error Rate (EER) of 1.953% across five experiences. The code is available at: https://github.com/falihgoz/RAIS.