🤖 AI Summary
Generative AI has intensified DeepFake security and trust challenges, necessitating explainable detection methods. Method: We introduce TriDF, the first tri-modal (image/video/audio) benchmark for explainable DeepFake detection, encompassing 16 high-fidelity forgery categories. We propose a novel “Perception–Detection–Hallucination” three-dimensional evaluation framework to systematically quantify models’ sensitivity to manipulation artifacts, cross-generator generalization, and explanation reliability. TriDF integrates fine-grained human annotations, multi-modal data construction, and LLM-driven assessment of explanation quality. Results: Empirical evaluation of mainstream multimodal large language models reveals that strong perceptual capability significantly enhances detection robustness, whereas explanation hallucination severely undermines decision trustworthiness. TriDF establishes the first unified evaluation standard and empirical foundation for explainable DeepFake detection, advancing the paradigm from black-box classification toward evidence-driven decision-making.
📝 Abstract
Advances in generative modeling have made it increasingly easy to fabricate realistic portrayals of individuals, creating serious risks for security, communication, and public trust. Detecting such person-driven manipulations requires systems that not only distinguish altered content from authentic media but also provide clear and reliable reasoning. In this paper, we introduce TriDF, a comprehensive benchmark for interpretable DeepFake detection. TriDF contains high-quality forgeries produced by advanced synthesis models, covering 16 DeepFake types across the image, video, and audio modalities. The benchmark evaluates three key aspects: Perception, which measures a model's ability to identify fine-grained manipulation artifacts using human-annotated evidence; Detection, which assesses classification performance across diverse forgery families and generators; and Hallucination, which quantifies the reliability of model-generated explanations. Experiments on state-of-the-art multimodal large language models show that accurate perception is essential for reliable detection, but hallucination can severely disrupt decision-making, revealing the interdependence of these three aspects. TriDF provides a unified framework for understanding the interaction between detection accuracy, evidence identification, and explanation reliability, offering a foundation for building trustworthy systems that address real-world synthetic media threats.