CorrDetail: Visual Detail Enhanced Self-Correction for Face Forgery Detection

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current deepfake detection methods suffer from either poor interpretability (in vision-only models) or hallucination-prone inference (in multimodal models). To address these limitations, this paper proposes a vision-detail-enhanced self-correcting framework. Its three core contributions are: (1) an error-guided questioning mechanism that explicitly directs the model to attend to fine-grained forgery cues; (2) a fine-grained visual enhancement module that strengthens representation learning of local texture and geometric inconsistencies; and (3) a bias-aware multimodal fusion strategy coupled with feedback-driven training, which together suppress hallucination and mitigate model bias. Evaluated on the FaceForensics++ and Celeb-DF benchmarks, the framework achieves state-of-the-art detection accuracy, significantly improves the localization precision of forgery artifacts, and generalizes well, remaining robust on low-quality and extreme deepfake samples.

📝 Abstract
With the swift progression of image generation technology, the widespread emergence of facial deepfakes poses significant challenges to the field of security, thus amplifying the urgent need for effective deepfake detection. Existing techniques for face forgery detection can broadly be categorized into two primary groups: visual-based methods and multimodal approaches. The former often lacks clear explanations for forgery details, while the latter, which merges visual and linguistic modalities, is more prone to the issue of hallucinations. To address these shortcomings, we introduce a visual detail enhanced self-correction framework, designated CorrDetail, for interpretable face forgery detection. CorrDetail is meticulously designed to rectify authentic forgery details when provided with error-guided questioning, with the aim of fostering the ability to uncover forgery details rather than yielding hallucinated responses. Additionally, to bolster the reliability of its findings, a visual fine-grained detail enhancement module is incorporated, supplying CorrDetail with more precise visual forgery details. Ultimately, a fusion decision strategy is devised to further augment the model's discriminative capacity in handling extreme samples, through the integration of visual information compensation and model bias reduction. Experimental results demonstrate that CorrDetail not only achieves state-of-the-art performance compared to the latest methodologies but also excels in accurately identifying forged details, all while exhibiting robust generalization capabilities.
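The abstract does not spell out how an error-guided question is built, so the following is only an illustrative sketch of the general idea: a question embeds a deliberately wrong claim about a facial region, and the training target corrects it with the real detail. The names `CorrectionSample` and `build_error_guided_sample`, the prompt wording, and the example details are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class CorrectionSample:
    """One hypothetical training pair: a misleading question and its correction."""
    question: str
    target: str


def build_error_guided_sample(region: str, true_detail: str, wrong_detail: str) -> CorrectionSample:
    """Build a question that asserts a wrong detail, paired with a correcting answer.

    Assumption: the model is trained to reject the wrong claim and state the
    real forgery detail instead of agreeing with the prompt.
    """
    question = (
        f"The {region} seems to show {wrong_detail}. "
        "Is that correct? If not, describe what is actually visible."
    )
    target = f"No. The {region} shows {true_detail}, not {wrong_detail}."
    return CorrectionSample(question=question, target=target)


sample = build_error_guided_sample(
    region="mouth",
    true_detail="blurred lip boundaries",
    wrong_detail="mismatched earrings",
)
print(sample.question)
print(sample.target)
```

Under this assumption, a model that simply agrees with the embedded claim is penalized, which is one plausible way to discourage hallucinated explanations.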
Problem

Research questions and friction points this paper is trying to address.

Enhancing face forgery detection with visual details
Reducing hallucinations in multimodal detection approaches
Improving interpretability and accuracy in deepfake identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Visual detail enhanced self-correction framework
Fine-grained detail enhancement module
Fusion decision strategy integration
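The page describes the fusion decision strategy only at a high level (visual information compensation plus model bias reduction). As a rough illustration, and not the authors' formulation, one could combine a language-branch score with a visual-branch score in logit space after subtracting an estimated bias. The function name `fuse_decision`, the parameters `language_bias` and `visual_weight`, and the weighted-logit rule itself are assumptions made for this sketch.

```python
import math


def logit(p: float, eps: float = 1e-6) -> float:
    """Convert a probability to log-odds, clamped to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_decision(
    p_language: float,
    p_visual: float,
    language_bias: float = 0.0,
    visual_weight: float = 0.5,
) -> float:
    """Return a fused probability that the face is forged.

    Assumptions (not from the paper):
    - p_language is the multimodal model's forgery probability,
    - p_visual is the score from a separate visual detail branch,
    - language_bias is a log-odds offset estimated on held-out data,
    - visual_weight in [0, 1] controls how much the visual branch compensates.
    """
    debiased = logit(p_language) - language_bias
    fused = (1.0 - visual_weight) * debiased + visual_weight * logit(p_visual)
    return sigmoid(fused)


# A language branch that leans towards "fake" is pulled back by the bias term
# and by a more cautious visual score.
print(fuse_decision(p_language=0.8, p_visual=0.4, language_bias=0.5))
```

Working in log-odds keeps the bias correction additive and lets either branch dominate when it is very confident, which is one simple way to handle extreme samples.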
👥 Authors
Binjia Zhou
School of Software Technology, Zhejiang University
Hengrui Lou
School of Software Technology, Zhejiang University
Lizhe Chen
Tsinghua University
Computer Graphics, Large Language Model, Human-AI Interaction
Haoyuan Li
School of Computer Science and Technology, Zhejiang University
Dawei Luo
Ant Group
Shuai Chen
Ant Group
Jie Lei
Universitat Politècnica de València
Computer Engineering, Electronic engineering
Zunlei Feng
School of Software Technology, Zhejiang University
Yijun Bei
School of Software Technology, Zhejiang University