🤖 AI Summary
Problem: Deepfake detection must distinguish subtle, method-specific artifacts introduced by diverse generation techniques, yet existing approaches, which are often reduced to binary classification, lack sufficient discriminative power.
Method: This paper proposes a dual-supervised fine-grained tuning framework. It adopts DINOv2 as the backbone and introduces a lightweight multi-head LoRA adapter embedded in each Transformer block. A shared branch is further designed to propagate fine-grained manipulation cues, enabling joint optimization of authenticity assessment and forgery-type identification.
Contribution/Results: With only 3.5 million trainable parameters, the framework fine-tunes efficiently. It matches or surpasses state-of-the-art (SOTA) detection accuracy on multiple mainstream benchmarks, including FaceForensics++, Celeb-DF, and DFDC, while significantly improving parameter efficiency and cross-dataset generalization compared to existing complex models.
📝 Abstract
The proliferation of sophisticated deepfakes poses significant threats to information integrity. While DINOv2 shows promise for detection, existing fine-tuning approaches treat deepfake detection as a generic binary classification problem, overlooking the distinct artifacts inherent to different deepfake methods. To address this, we propose a DeepFake Fine-Grained Adapter (DFF-Adapter) for DINOv2. Our method incorporates lightweight multi-head LoRA modules into every transformer block, enabling efficient backbone adaptation. DFF-Adapter simultaneously addresses authenticity detection and fine-grained manipulation type classification, where classifying forgery methods enhances artifact sensitivity. We introduce a shared branch that propagates fine-grained manipulation cues to the authenticity head. This enables multi-task cooperative optimization, explicitly enhancing authenticity discrimination with manipulation-specific knowledge. Using only 3.5M trainable parameters, our parameter-efficient approach achieves detection accuracy comparable to or even surpassing that of current complex state-of-the-art methods.
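To give a feel for why LoRA modules in every transformer block can stay within a budget of a few million trainable parameters, the sketch below counts the parameters of low-rank adapters. It is an illustration under stated assumptions, not the paper's configuration: the abstract does not give the backbone size, the rank, the number of LoRA heads, or which layers are adapted. The function names `lora_params` and `adapter_budget`, the reading of "multi-head" as several low-rank pairs per adapted layer, and all numeric settings are hypothetical. The embedding width of 1024 and depth of 24 blocks correspond to a ViT-L-sized backbone.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def adapter_budget(embed_dim: int, num_blocks: int, rank: int,
                   num_heads: int = 1, layers_per_block: int = 1) -> int:
    """Total LoRA parameters when every block adapts `layers_per_block`
    square (embed_dim x embed_dim) projections, each with `num_heads`
    low-rank pairs. All of these settings are assumptions for illustration."""
    per_layer = num_heads * lora_params(embed_dim, embed_dim, rank)
    return num_blocks * layers_per_block * per_layer


# Hypothetical setting: ViT-L-sized backbone, rank 8, two LoRA heads,
# four adapted projections per block.
lora_total = adapter_budget(embed_dim=1024, num_blocks=24, rank=8,
                            num_heads=2, layers_per_block=4)

# Fully fine-tuning the same projections would update every weight in them.
dense_total = 24 * 4 * 1024 * 1024

print(lora_total)                          # 3145728
print(dense_total)                         # 100663296
print(round(lora_total / dense_total, 4))  # 0.0312
```

Under these assumed settings the adapters come to about 3.1M parameters, roughly 3% of the dense weights they modify. The count grows linearly in the rank and in the number of heads, so the same order of magnitude as the reported 3.5M is reachable with several different configurations. The task heads and the shared branch would add parameters that this sketch does not count.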