🤖 AI Summary
Traditional deepfake detectors rely on binary classification and exhibit poor generalization to unseen generation techniques. This work proposes a differential anomaly detection framework designed for cross-technique generalization: it reframes deepfake detection as identifying anomalies in the natural variation patterns between two face images of the same identity, abandoning supervised binary classification. Key contributions include: (i) the first introduction of the differential anomaly detection paradigm for deepfake detection; (ii) a pseudo-deepfake-driven feature extractor that jointly models global and local forgery cues; and (iii) an integrated pipeline combining deep face embeddings, differential feature learning, self-supervised pseudo-sample generation, and deep anomaly detection (e.g., Deep SVDD). Evaluated on five mainstream benchmarks, the method matches or surpasses state-of-the-art performance, significantly improves detection accuracy on unknown generation techniques, and demonstrates markedly enhanced robustness.
📄 Abstract
Traditional deepfake detectors have treated the detection problem as a binary classification task. This approach can achieve satisfactory results when samples from a given deepfake generation technique have been seen during training, but it can easily fail on deepfakes generated by other techniques. In this paper, we propose DiffFake, a novel deepfake detector that approaches the detection problem as an anomaly detection task. Specifically, DiffFake learns the natural changes that occur between two facial images of the same person by leveraging a differential anomaly detection framework. This is done by combining pairs of deep face embeddings and using them to train an anomaly detection model. We further propose to train the feature extractor on pseudo-deepfakes with global and local artifacts, so that it extracts meaningful and generalizable features that can then be used to train the anomaly detection model. We perform extensive experiments on five different deepfake datasets and show that our method can match and sometimes even exceed the performance of state-of-the-art competitors.
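The core idea, combining a pair of same-identity face embeddings into a differential feature and scoring it with a one-class anomaly model, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the absolute-difference pairing, the `CenterDistanceDetector` class, and the 95th-percentile threshold are all simplifying assumptions standing in for the learned differential features and the Deep SVDD model.

```python
import numpy as np

def differential_features(emb_ref, emb_probe):
    # Combine two face embeddings of the same identity into one
    # differential feature vector. Elementwise absolute difference is
    # one simple choice; the paper's combination may differ.
    return np.abs(np.asarray(emb_ref) - np.asarray(emb_probe))

class CenterDistanceDetector:
    """Deep-SVDD-flavored anomaly scorer (hypothetical stand-in):
    fit a center on genuine differential features, then score test
    samples by squared distance to that center."""

    def fit(self, X):
        # Center of the "normal" (genuine pair) training distribution.
        self.center_ = X.mean(axis=0)
        # Threshold at the 95th percentile of training distances
        # (an assumed operating point, not from the paper).
        d = np.sum((X - self.center_) ** 2, axis=1)
        self.threshold_ = np.percentile(d, 95)
        return self

    def score(self, X):
        return np.sum((X - self.center_) ** 2, axis=1)

    def predict(self, X):
        # True -> anomalous pair, i.e. a suspected deepfake.
        return self.score(X) > self.threshold_
```

In use, genuine pairs of the same person produce small, structured differences, so their differential features cluster near the learned center; a deepfake paired with a reference image of the claimed identity deviates from that cluster and receives a high anomaly score.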