Generalizable Face Forgery Detection via Separable Prompt Learning

📅 2026-04-19

🤖 AI Summary
Existing CLIP-based face forgery detectors rely predominantly on CLIP's visual encoder and overlook the guiding role of the text modality, which limits their generalization. This work proposes Separable Prompt Learning (SePL), which shifts the focus from the visual modality to text prompts. SePL uses two decoupled types of prompts to model forgery-specific and forgery-irrelevant information separately, and combines cross-modal alignment with a tailored contrastive learning objective to sharpen the separation. Under cross-dataset and cross-method evaluation, the method performs competitively with, and in some cases better than, other approaches, which supports its generalizability.

📝 Abstract
Detecting face forgeries using CLIP has recently emerged as a promising and increasingly popular research direction. Because CLIP acquires rich visual knowledge through large-scale pretraining, most existing methods rely on its visual encoder while paying limited attention to the text modality. Given the instructive nature of the text modality, we posit that, with careful design, it can be leveraged to guide deepfake detection. Accordingly, we shift the focus from the visual modality to the text modality and propose a new Separable Prompt Learning strategy (SePL) that enables CLIP to serve as an effective face forgery detector. The core idea of SePL is to disentangle forgery-specific and forgery-irrelevant information in images via two types of prompt learning, with the former enhancing detection. To achieve this disentanglement, we describe a cross-modality alignment strategy and a set of dedicated objectives. Extensive experiments demonstrate that, with this simple adaptation, our method achieves competitive and even superior performance compared to other methods under both cross-dataset and cross-method evaluation, highlighting its strong generalizability. The code has been released at https://github.com/OUC-YER/SePL-DeepfakeDetection
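
The abstract does not spell out SePL's alignment strategy or objectives, so the following is only a toy sketch of the general idea, not the authors' method. It scores an image embedding against two prompt embeddings in the usual CLIP style (softmax over temperature-scaled cosine similarities) and adds a hypothetical separation term that pushes a forgery-specific prompt away from a forgery-irrelevant one. All function names, the hand-written 3-d vectors, the temperature value, and the squared-cosine penalty are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forgery_probability(image_emb, real_prompt_emb, fake_prompt_emb,
                        temperature=0.07):
    """CLIP-style scoring: softmax over scaled image-text similarities.

    Returns the probability assigned to the "fake" prompt.
    The temperature is an illustrative value, not taken from the paper.
    """
    logits = [
        cosine(image_emb, real_prompt_emb) / temperature,
        cosine(image_emb, fake_prompt_emb) / temperature,
    ]
    return softmax(logits)[1]


def separation_penalty(forgery_prompt_emb, irrelevant_prompt_emb):
    """Hypothetical disentanglement term (an assumption, not SePL's loss).

    Squared cosine similarity: zero when the two prompt embeddings are
    orthogonal, one when they point in the same or opposite direction.
    """
    return cosine(forgery_prompt_emb, irrelevant_prompt_emb) ** 2


# Toy embeddings standing in for encoder outputs (hand-written, not real data).
image_emb = [0.9, 0.1, 0.3]
real_prompt = [0.1, 0.9, 0.0]        # forgery-specific prompt, "real" class
fake_prompt = [1.0, 0.0, 0.0]        # forgery-specific prompt, "fake" class
irrelevant_prompt = [0.0, 0.0, 1.0]  # forgery-irrelevant prompt

p_fake = forgery_probability(image_emb, real_prompt, fake_prompt)
penalty = separation_penalty(fake_prompt, irrelevant_prompt)
```

In a real system the embeddings would come from CLIP's image and text encoders, with the prompt tokens learned by gradient descent; the released repository is the reference for how the actual objectives are defined.
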
Problem

Research questions and friction points this paper is trying to address.

Face Forgery Detection
Generalizability
Deepfake Detection
Cross-dataset Evaluation
Cross-method Evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Separable Prompt Learning
CLIP
Face Forgery Detection
Cross-modality Alignment
Generalizability