DeePen: Penetration Testing for Audio Deepfake Detection

📅 2025-02-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the robustness gap in audio deepfake detection models against unknown attacks by proposing DeePen, a black-box penetration testing framework that requires no access to the target model. Methodologically, it applies lightweight time-domain signal transformations (including time-stretching, echo injection, and pitch shifting) to systematically probe vulnerabilities without any knowledge of model architecture or parameters. Key contributions include: (1) the first black-box penetration testing paradigm designed specifically for audio deepfake detectors; (2) empirical evidence that generic signal transformations transfer adversarially across models; and (3) identification of attacks that remain effective even after retraining detectors with knowledge of the specific attack. Experiments show DeePen achieves up to 98.3% evasion success across multiple industrial and academic detection systems, confirming that simple, largely imperceptible perturbations can bypass state-of-the-art detectors. The complete toolkit, including attack implementations and a standardized benchmark, is publicly released.
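The signal transformations named above are standard audio manipulations rather than model-specific perturbations. A minimal sketch of two of them, echo injection and time-stretching, is shown below; the function names, parameter defaults, and the naive resampling-based stretch are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def add_echo(signal, sr, delay_s=0.15, decay=0.4):
    """Mix a delayed, attenuated copy of the signal back into itself.
    (Illustrative parameters; not taken from the paper.)"""
    delay = int(delay_s * sr)
    out = np.copy(signal)
    out[delay:] += decay * signal[:-delay]
    # Renormalize so the mixed signal does not clip.
    return out / max(1.0, np.max(np.abs(out)))

def time_stretch(signal, rate=1.1):
    """Naive stretch via linear-interpolation resampling.
    Note: this also shifts pitch; phase-vocoder methods preserve it."""
    n_out = int(len(signal) / rate)
    old_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(old_idx, np.arange(len(signal)), signal)
```

Applied to a detector's input, such transformations leave the audio perceptually similar while shifting the low-level features many classifiers rely on.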

📝 Abstract
Deepfakes - manipulated or forged audio and video media - pose significant security risks to individuals, organizations, and society at large. To address these challenges, machine learning-based classifiers are commonly employed to detect deepfake content. In this paper, we assess the robustness of such classifiers through a systematic penetration testing methodology, which we introduce as DeePen. Our approach operates without prior knowledge of or access to the target deepfake detection models. Instead, it leverages a set of carefully selected signal processing modifications - referred to as attacks - to evaluate model vulnerabilities. Using DeePen, we analyze both real-world production systems and publicly available academic model checkpoints, demonstrating that all tested systems exhibit weaknesses and can be reliably deceived by simple manipulations such as time-stretching or echo addition. Furthermore, our findings reveal that while some attacks can be mitigated by retraining detection systems with knowledge of the specific attack, others remain persistently effective. We release all associated code.
Problem

Research questions and friction points this paper is trying to address.

Assessing robustness of deepfake detection classifiers
Evaluating vulnerabilities using signal processing attacks
Identifying persistent weaknesses in detection systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Penetration testing for deepfake detection robustness
Signal processing modifications to evaluate vulnerabilities
Analysis of real-world and academic model weaknesses
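The black-box evaluation loop implied by these contributions can be sketched as follows. The `detector` and `attacks` interfaces are hypothetical placeholders, since DeePen only assumes query access (audio in, score out); this is not the released toolkit's API:

```python
import numpy as np

def penetration_test(detector, fake_clips, attacks, threshold=0.5):
    """Measure how often each black-box attack flips a 'fake' verdict.

    detector: callable mapping a waveform to a fake-probability score
    attacks:  dict of name -> callable transforming a waveform
    (both interfaces are assumptions for illustration)
    """
    evasion_rate = {}
    for name, attack in attacks.items():
        evaded = 0
        for clip in fake_clips:
            # Count clips the detector catches untouched but misses after the attack.
            if detector(clip) >= threshold and detector(attack(clip)) < threshold:
                evaded += 1
        evasion_rate[name] = evaded / len(fake_clips)
    return evasion_rate
```

Running this loop per attack and per system yields the kind of per-attack evasion table the paper reports, without ever inspecting model weights.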