Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning

📅 2025-08-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing deepfake detection methods achieve strong performance on academic benchmarks but generalize poorly, owing to homogeneous training sources and low-quality test images that limit their applicability in real-world industrial settings. To address this, the paper introduces HydraFake, a benchmark that simulates real-world challenges through hierarchical generalization testing, and proposes Veritas, an open-domain generalizable detector. First, a pattern-aware reasoning mechanism integrates human-inspired "planning" and "self-reflection" paradigms into the forensic process. Second, a two-stage training pipeline internalizes these reasoning capabilities into multimodal large language models, enabling robust detection across diverse generative models, synthesis techniques, and unseen domains. Evaluated on HydraFake, the method significantly outperforms state-of-the-art approaches, particularly on previously unseen forgery techniques and unknown data domains, while producing transparent and interpretable detection outputs.

📝 Abstract
Deepfake detection remains a formidable challenge due to the complex and evolving nature of fake content in real-world scenarios. However, existing academic benchmarks suffer from severe discrepancies from industrial practice, typically featuring homogeneous training sources and low-quality testing images, which hinder the practical deployment of current detectors. To mitigate this gap, we introduce HydraFake, a dataset that simulates real-world challenges with hierarchical generalization testing. Specifically, HydraFake involves diversified deepfake techniques and in-the-wild forgeries, along with a rigorous training and evaluation protocol, covering unseen model architectures, emerging forgery techniques and novel data domains. Building on this resource, we propose Veritas, a multi-modal large language model (MLLM) based deepfake detector. Different from vanilla chain-of-thought (CoT), we introduce pattern-aware reasoning that involves critical reasoning patterns such as "planning" and "self-reflection" to emulate the human forensic process. We further propose a two-stage training pipeline to seamlessly internalize such deepfake reasoning capabilities into current MLLMs. Experiments on the HydraFake dataset reveal that although previous detectors generalize well in cross-model scenarios, they fall short on unseen forgeries and data domains. Our Veritas achieves significant gains across different OOD scenarios, and is capable of delivering transparent and faithful detection outputs.
Problem

Research questions and friction points this paper is trying to address.

Addressing generalization challenges in real-world deepfake detection
Mitigating discrepancies between academic benchmarks and industrial practice
Detecting unseen forgery techniques and novel data domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pattern-aware reasoning with planning and self-reflection
Two-stage training pipeline for MLLM integration
Multi-modal large language model based detector
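The "planning" and "self-reflection" patterns listed above can be pictured as a structured reasoning loop around an MLLM. The sketch below is a hypothetical illustration only; the `mllm` callable, the prompts, and the `stub_mllm` stand-in are assumptions made for this example, not the paper's actual API or training recipe:

```python
# Hypothetical sketch of a pattern-aware reasoning loop.
# `mllm` stands in for a multimodal LLM call; a stub is used so the flow runs.

def pattern_aware_detect(image, mllm):
    # Planning: decide which forensic cues to inspect before answering.
    plan = mllm(f"List forensic cues to check for: {image}")
    # Examine each planned cue and collect evidence.
    findings = [mllm(f"Inspect cue '{cue}' in {image}") for cue in plan]
    # Draft a verdict from the collected evidence.
    draft = mllm(f"Given findings {findings}, is {image} real or fake?")
    # Self-reflection: critique the draft verdict against the evidence.
    review = mllm(f"Critique the verdict '{draft}' against {findings}")
    # Revise only if the reflection flags an inconsistency.
    return draft if "consistent" in review else mllm("Revise the verdict.")

def stub_mllm(prompt):
    # Canned responses standing in for a real multimodal model.
    if prompt.startswith("List"):
        return ["blending boundaries", "texture statistics"]
    if prompt.startswith("Inspect"):
        return "suspicious artifact found"
    if prompt.startswith("Given"):
        return "fake"
    return "consistent"

print(pattern_aware_detect("sample_face.png", stub_mllm))  # -> fake
```

The point of the loop is that the verdict is preceded by an explicit plan and followed by a critique step, which is what makes the decision process inspectable rather than a single opaque classification.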
Hao Tan
Adobe Research
Vision and Language, 3D, Multimodal
Jun Lan
Ant Group
Zichang Tan
Previously CASIA, Baidu Inc.
Computer Vision, Biometrics, Autonomous Driving, Robotics, MLLM
Ajian Liu
MAIS, Institute of Automation, Chinese Academy of Sciences
Chuanbiao Song
Ant Group
Senyuan Shi
MAIS, Institute of Automation, Chinese Academy of Sciences
Huijia Zhu
Ant Group
Weiqiang Wang
Ant Group
Jun Wan
MAIS, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
Zhen Lei