🤖 AI Summary
Deepfake detectors often suffer from poor generalization and unfair performance due to overfitting to generator-specific artifacts and demographic attributes (e.g., gender, race). To address this, we propose Astray-Learning—a novel framework that leverages frequency-domain analysis to disentangle and elevate high-frequency forgery semantics, then implicitly injects them into authentic images to synthesize controllable aberrations. This weakens the model’s reliance on spurious, generator- or demographic-correlated patterns. Our approach introduces the first forgery-semantic elevation and aberration modeling mechanism, coupled with an uncertainty-driven semantic bias suppression strategy, jointly enhancing both generalization and fairness. Evaluated on FaceForensics++ and Celeb-DF, Astray-Learning achieves state-of-the-art performance: it significantly improves cross-generator and cross-dataset robustness while maintaining fairness (ΔFPR < 1.2%) and boosting generalization accuracy.
📝 Abstract
Prior DeepFake detection methods have faced a core challenge in effectively preserving both generalizability and fairness. In this paper, we propose an approach akin to decoupling and sublimating forgery semantics, named astray-learning. The primary objective of the proposed method is to blend hybrid forgery semantics derived from high-frequency components into authentic imagery; we call the resulting samples aberrations. The ambiguity of aberrations helps reduce the model's bias towards specific semantics, thereby enhancing the model's generalization ability while maintaining detection fairness. All code for astray-learning is publicly available at https://anonymous.4open.science/r/astray-learning-C49B .
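The abstract does not spell out the pipeline, but the core idea of extracting high-frequency forgery semantics and blending them into authentic images can be sketched roughly as follows. This is a minimal illustration only: the FFT high-pass filter, the `radius` cutoff, and the `alpha` blending weight are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def high_frequency_component(img, radius=8):
    """Extract a high-frequency component of a grayscale image in [0, 1]
    via an FFT high-pass filter. `radius` is an illustrative cutoff,
    not a value taken from the paper."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    cy, cx = h // 2, w // 2
    yy, xx = np.ogrid[:h, :w]
    # Zero out low frequencies inside the cutoff circle, keep the rest.
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 > radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

def make_aberration(real_img, fake_img, alpha=0.1):
    """Blend the high-frequency residue of a forged image into an
    authentic image, yielding an 'aberration'-like training sample.
    `alpha` is an assumed blending weight."""
    hf = high_frequency_component(fake_img)
    return np.clip(real_img + alpha * hf, 0.0, 1.0)
```

Training on such ambiguous samples, rather than on generator-specific artifacts, is what the abstract suggests weakens the model's reliance on spurious patterns.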