Deepfake Detection via Knowledge Injection

📅 2025-03-04
🤖 AI Summary
Existing deepfake detection methods often neglect real-data priors, which limits cross-domain generalization. This paper proposes Knowledge Injection based deepfake Detection (KID), which embeds a plug-and-play multi-task knowledge injection module into Vision Transformer (ViT) backbones to jointly model real and fake distributions. Specifically, KID introduces a coarse-grained forgery localization branch and designs layer-aware suppression and contrastive losses to explicitly balance real and fake knowledge. Notably, it is the first work to systematically inject real-data priors into ViTs via knowledge distillation, enabling flexible adaptation across model scales, from compact to base-sized ViTs. Evaluated on multiple benchmarks, KID achieves state-of-the-art generalization performance, accelerates training convergence by 23%, and improves zero-shot transfer accuracy by an average of 11.6%.

📝 Abstract
Deepfake detection has become vital because current generative AI models can produce realistic deepfakes that may be used for malicious purposes. Existing deepfake detection methods either develop classification methods that better fit the distribution of the training data, or exploit forgery synthesis mechanisms to learn a more comprehensive forgery distribution. Unfortunately, these methods tend to overlook the essential role of real-data knowledge, which limits their ability to generalize to unseen real and fake data. To tackle this challenge, we propose a simple and novel approach, Knowledge Injection based deepfake Detection (KID), which constructs a multi-task learning based knowledge injection framework that can be easily plugged into existing ViT-based backbone models, including foundation models. Specifically, a knowledge injection module learns and injects the necessary knowledge into the backbone model to model the distributions of real and fake data more accurately. A coarse-grained forgery localization branch learns forgery locations in a multi-task learning manner, enriching the forgery knowledge available to the knowledge injection module. Two layer-wise losses, a suppression loss and a contrast loss, emphasize real-data knowledge in the knowledge injection module and further balance the proportions of real and fake knowledge. Extensive experiments demonstrate that KID is compatible with ViT-based backbone models of different scales and achieves state-of-the-art generalization performance while speeding up training convergence.
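The abstract does not give formulas, so the following is only a minimal sketch of how a real/fake classification objective might be combined with a coarse-grained (patch-level) localization objective in a multi-task setup. Everything here is an assumption made for illustration: the function names, the "any forged pixel marks the patch" rule, the use of binary cross-entropy for both tasks, and the `loc_weight` value. The paper's suppression and contrast losses are not specified in this summary and are not modeled.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def coarse_patch_labels(mask, patch):
    """Reduce a pixel-level forgery mask (rows of 0/1) to a coarse patch grid.

    Assumption: a patch is labeled forged (1) if any pixel inside it is forged.
    """
    h, w = len(mask), len(mask[0])
    labels = []
    for r in range(0, h, patch):
        row = []
        for c in range(0, w, patch):
            forged = any(
                mask[i][j]
                for i in range(r, min(r + patch, h))
                for j in range(c, min(c + patch, w))
            )
            row.append(1 if forged else 0)
        labels.append(row)
    return labels


def multitask_loss(p_fake, y_fake, patch_probs, patch_labels, loc_weight=0.5):
    """Classification loss plus a weighted mean patch-level localization loss.

    loc_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    cls_loss = bce(p_fake, y_fake)
    flat_p = [p for row in patch_probs for p in row]
    flat_y = [y for row in patch_labels for y in row]
    loc_loss = sum(bce(p, y) for p, y in zip(flat_p, flat_y)) / len(flat_p)
    return cls_loss + loc_weight * loc_loss


# Toy example: a 4x4 mask with a forged region in the top-left corner.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
labels = coarse_patch_labels(mask, patch=2)
loss = multitask_loss(0.9, 1, [[0.8, 0.1], [0.2, 0.1]], labels)
```

In a real ViT pipeline the patch grid would follow the backbone's patch size, and the two terms would be computed on batched tensors; the plain-Python version above only shows the shape of the objective.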
Problem

Research questions and friction points this paper is trying to address.

Improves deepfake detection by injecting real-data knowledge into the backbone.
Enhances generalization to unseen real and fake data.
Uses coarse-grained forgery localization as an auxiliary task to enrich the learned forgery knowledge.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-task learning framework for deepfake detection
Knowledge injection module enhances real and fake data modeling
Layer-wise suppression and contrast losses balance knowledge
Tonghui Li
School of Computer Science and Engineering, Beihang University, China
Yuanfang Guo
Beihang University
Multimedia security, AI security, Graph Neural Networks, Multimedia processing
Zeming Liu
School of Computer Science and Engineering, Beihang University, China
Heqi Peng
School of Computer Science and Engineering, Beihang University, China
Yun-an Wang
School of Computer Science and Engineering, Beihang University, China