🤖 AI Summary
Deepfake detection models achieve strong performance on benchmark datasets but generalize poorly across datasets, especially under distribution shifts involving unseen forgery types or quality degradations. To address this, we propose a prior-free asymmetric ensemble method that fuses prediction probabilities from multiple state-of-the-art heterogeneous detectors via multi-stage deep fusion and probability-weighted aggregation, enhancing robustness without requiring knowledge of target-domain characteristics. Evaluated systematically on two out-of-distribution datasets, our approach improves average AUC by 3.2–5.7 percentage points over individual models and increases prediction stability by 42%, while maintaining high sensitivity to low-quality and novel forgeries. The core contribution is a lightweight, plug-and-play generalization-enhancement framework that significantly improves adaptability and scalability in real-world deployment scenarios.
📝 Abstract
Machine-learning-based deepfake detection models have achieved impressive results on benchmark datasets, yet their performance often deteriorates significantly when evaluated on out-of-distribution data. In this work, we investigate an ensemble-based approach for improving the generalization of deepfake detection systems across diverse datasets. Building on a recent open-source benchmark, we combine prediction probabilities from several state-of-the-art asymmetric models proposed at top venues. Our experiments span two distinct out-of-domain datasets and demonstrate that no single model consistently outperforms the others across settings. In contrast, ensemble-based predictions provide more stable and reliable performance in all scenarios. Our results suggest that asymmetric ensembling offers a robust and scalable solution for real-world deepfake detection, where prior knowledge of forgery type or quality is often unavailable.
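To make the idea of combining prediction probabilities concrete, the following is a minimal sketch, not the authors' implementation. It assumes each detector outputs one "fake" probability per sample and that the ensemble is a plain unweighted mean; the abstract does not specify the fusion rule, so that choice is an assumption. The function names (`ensemble_mean`, `auc`) and the toy scores are hypothetical and are not results from the paper.

```python
from statistics import mean


def ensemble_mean(prob_lists):
    """Average per-sample 'fake' probabilities across detectors.

    prob_lists: one list of probabilities per detector, all scored on
    the same samples in the same order. Unweighted averaging is an
    assumption made for illustration only.
    """
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all detectors must score the same samples")
    return [mean(p[i] for p in prob_lists) for i in range(n)]


def auc(labels, scores):
    """AUC as the fraction of (fake, real) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic; ties count as 0.5.
    labels use 1 for fake and 0 for real.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy scores (made up): each detector misranks a different sample.
labels = [1, 1, 0, 0]
detector_a = [0.9, 0.4, 0.5, 0.1]
detector_b = [0.3, 0.8, 0.2, 0.6]

fused = ensemble_mean([detector_a, detector_b])

print("AUC detector A:", auc(labels, detector_a))
print("AUC detector B:", auc(labels, detector_b))
print("AUC ensemble:  ", auc(labels, fused))
```

In this toy case the two detectors make different mistakes, so averaging their probabilities ranks every fake above every real sample. That is the intuition behind the stability claim, but real detectors often share failure modes, so the gain on actual out-of-domain data may be much smaller.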