Generalizable speech deepfake detection via meta-learned LoRA

📅 2025-02-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the poor generalization of deepfake speech detection models to unseen attack types. We propose a meta-learning framework specifically designed for cross-attack generalization. To our knowledge, this is the first approach that integrates Model-Agnostic Meta-Learning (MAML) with Low-Rank Adaptation (LoRA) to model shared representational structures across diverse synthetic speech sources, enabling zero-shot transfer detection without access to target-attack samples. Our method jointly optimizes speech representation fine-tuning and multi-source forgery distribution modeling, substantially improving robustness against previously unseen generators—including those trained with different random seeds. On standard benchmarks, it achieves a 12.7% average accuracy gain over unknown attacks, outperforming conventional fine-tuning and ensemble baselines. The framework establishes a novel, transferable, and lightweight generalization paradigm for deepfake detection.

Technology Category

Application Category

📝 Abstract
Generalizable deepfake detection can be formulated as a detection problem where labels (bonafide and fake) are fixed but distributional drift affects the deepfake set. We can always train our detector with one-selected attacks and bonafide data, but an attacker can generate new attacks by just retraining his generator with a different seed. One reasonable approach is to simply pool all different attack types available in training time. Our proposed approach is to utilize meta-learning in combination with LoRA adapters to learn the structure in the training data that is common to all attack types.
Problem

Research questions and friction points this paper is trying to address.

Generalizable deepfake detection
Meta-learned LoRA adapters
Handling distributional drift in attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Meta-learning enhances detection
LoRA adapters improve generalization
Common structure across attack types
🔎 Similar Papers
J
Janne Laakkonen
School of Computing, University of Eastern Finland, Joensuu, Finland
Ivan Kukanov
Ivan Kukanov
KLASS Solutions and Engineering, Singapore
Deep LearningSpeech RecognitionAcoustic Event DetectionAntispoofingDeepfake Detection
V
Ville Hautamaki
School of Computing, University of Eastern Finland, Joensuu, Finland