Exploring Specular Reflection Inconsistency for Generalizable Face Forgery Detection

📅 2026-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of detecting high-fidelity AI-generated faces, particularly those synthesized by diffusion models, which often evade detectors built on conventional spatial or frequency-domain features. The authors propose the physical inconsistency of specular reflections as a generalizable forensic cue. Building on the Phong illumination model and Retinex theory, they quickly estimate the face texture (albedo), separate the specular component, and model its inconsistency with respect to both texture and direct illumination. To this end, they propose SRI-Net, an architecture whose two-stage cross-attention mechanism captures the correlations among specular reflection, texture, and lighting. Experiments show that the method outperforms existing detectors on both traditional deepfake datasets and generative ones, including those containing diffusion-generated faces.
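For reference, the Phong illumination model mentioned above is usually written in the textbook form below. This is the standard formulation, not necessarily the notation used in the paper, whose exact equations are not reproduced on this page. Here $k_a$, $k_d$, $k_s$ are the ambient, diffuse, and specular coefficients, $\alpha$ is the shininess exponent, $\hat{N}$ is the surface normal, $\hat{V}$ the view direction, $\hat{L}_m$ the direction to light $m$, and $\hat{R}_m$ its mirror reflection.

```latex
\[
I_p \;=\; k_a\, i_a \;+\; \sum_{m \in \mathrm{lights}}
\Big( k_d\, (\hat{L}_m \cdot \hat{N})\, i_{m,d}
\;+\; k_s\, (\hat{R}_m \cdot \hat{V})^{\alpha}\, i_{m,s} \Big),
\qquad
\hat{R}_m = 2\,(\hat{L}_m \cdot \hat{N})\,\hat{N} - \hat{L}_m
\]
```

The diffuse term is linear in $\hat{L}_m \cdot \hat{N}$, whereas the specular term depends jointly on the light direction, surface normal, view direction, and the exponent $\alpha$, which is what makes it nonlinear and parameter-heavy.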

📝 Abstract
Detecting deepfakes has become increasingly challenging as forged faces synthesized by AI generation methods, particularly diffusion models, achieve unprecedented quality and resolution. Existing forgery detection approaches that rely on spatial and frequency features show limited efficacy against high-quality, entirely synthesized forgeries. In this paper, we propose a novel detection method grounded in the observation that facial attributes governed by complex physical laws and multiple parameters are inherently difficult to replicate. Specifically, we focus on illumination, particularly the specular reflection component in the Phong illumination model, which poses the greatest replication challenge due to its parametric complexity and nonlinear formulation. We introduce a fast and accurate face texture estimation method based on Retinex theory to enable precise specular reflection separation. Furthermore, drawing on the mathematical formulation of specular reflection, we posit that forgery evidence manifests not only in the specular reflection itself but also in its relationship with the corresponding face texture and direct light. To exploit this, we design the Specular-Reflection-Inconsistency-Network (SRI-Net), which incorporates a two-stage cross-attention mechanism to capture these correlations and integrates specular-reflection-related features with image features for robust forgery detection. Experimental results demonstrate that our method achieves superior performance on both traditional deepfake datasets and generative deepfake datasets, particularly those containing diffusion-generated forged faces.
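To make the Retinex idea concrete, the sketch below implements classic single-scale Retinex, which treats an image as reflectance times illumination and estimates the illumination with a Gaussian blur. It is a generic illustration only: the abstract does not describe the authors' texture estimator, so this should not be read as their method, and the function names and the `sigma` and `eps` parameters are assumptions made for this example.

```python
import numpy as np


def gaussian_kernel1d(sigma: float) -> np.ndarray:
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(img: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur of a 2-D grayscale image (reflect padding)."""
    kernel = gaussian_kernel1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(img, radius, mode="reflect")
    # Filter along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows
    )


def single_scale_retinex(img: np.ndarray, sigma: float = 3.0, eps: float = 1e-6):
    """Split an image into log-reflectance and log-illumination.

    Assumes I = R * L, so log I = log R + log L, with L approximated
    by a smoothed copy of the image (hypothetical illustrative choice).
    """
    log_image = np.log(img + eps)
    log_illumination = np.log(gaussian_blur(img, sigma) + eps)
    log_reflectance = log_image - log_illumination
    return log_reflectance, log_illumination


if __name__ == "__main__":
    face = np.full((32, 32), 0.5)
    face[16, 16] = 1.0  # a small, sharp highlight on a flat surface
    log_r, log_l = single_scale_retinex(face)
    print(log_r.shape, float(log_r[16, 16]) > 0.0)
```

In this simple variant, sharp detail such as a highlight stays in the reflectance residual while slowly varying lighting goes to the illumination estimate. Isolating the specular component, as the paper describes, would need a further step beyond this decomposition.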
Problem

Research questions and friction points this paper is trying to address.

face forgery detection
specular reflection inconsistency
deepfake
diffusion models
generalizable detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

specular reflection
face forgery detection
Retinex theory
cross-attention mechanism
generalizable detection
Hongyan Fei
Peking University
computer vision, biometrics
Zexi Jia
WeChat AI, Tencent Inc
Chuanwei Huang
School of Intelligence Science and Technology, Peking University; State Key Laboratory of General Artificial Intelligence, Peking University
Jinchao Zhang
WeChat AI - Pattern Recognition Center
Deep Learning, Natural Language Processing, Machine Translation, Dialogue System
Jie Zhou
Tencent WeChat AI
NLP