Explainable Deepfake Detection with RL Enhanced Self-Blended Images

📅 2026-01-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited interpretability of existing deepfake detection methods and the scarcity of high-quality, fine-grained forgery annotations that hinder the application of multimodal large language models (MLLMs). To overcome these challenges, the authors propose an automated chain-of-thought (CoT) data generation framework that integrates self-blended images with reinforcement learning (RL). By leveraging a forgery-localization-oriented reward mechanism and a feedback-driven synthesis strategy, the framework efficiently constructs interpretable training data while significantly reducing annotation costs. Extensive experiments demonstrate that the resulting model achieves performance on par with state-of-the-art methods across multiple cross-domain benchmarks, validating the effectiveness and generalizability of the proposed data generation pipeline and RL-enhanced mechanism.
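To make the idea of a forgery-localization-oriented reward concrete, here is a minimal sketch. The summary does not state the reward formula, so everything below is an assumption made for illustration: the box format, the use of intersection-over-union (IoU) as the localization term, the `w_loc` weight, and the function names are hypothetical and are not taken from the paper or its repository.

```python
from typing import Optional, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localization_reward(
    pred_label: str,
    true_label: str,
    pred_box: Optional[Box],
    true_box: Optional[Box],
    w_loc: float = 0.5,  # hypothetical weight on the localization term
) -> float:
    """Combine a real/fake correctness term with an IoU localization term."""
    cls_term = 1.0 if pred_label == true_label else 0.0
    # Real images have no forged region, so only the label is scored.
    if true_box is None:
        return cls_term
    loc_term = box_iou(pred_box, true_box) if pred_box is not None else 0.0
    return (1.0 - w_loc) * cls_term + w_loc * loc_term
```

Under these assumptions, a response that labels a fake image correctly but points at the wrong region earns only part of the reward, which is one plausible way to push a model toward grounded explanations rather than label guessing.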

📝 Abstract

Most prior deepfake detection methods lack explainable outputs. With the growing interest in multimodal large language models (MLLMs), researchers have started exploring their use in interpretable deepfake detection. However, a major obstacle in applying MLLMs to this task is the scarcity of high-quality datasets with detailed forgery attribution annotations, because textual annotation is both costly and difficult, particularly for high-fidelity forged images or videos. Moreover, multiple studies have shown that reinforcement learning (RL) can substantially enhance performance in visual tasks, especially by improving cross-domain generalization. To facilitate the adoption of mainstream MLLM frameworks in deepfake detection with reduced annotation cost, and to investigate the potential of RL in this context, we propose an automated Chain-of-Thought (CoT) data generation framework based on Self-Blended Images, along with an RL-enhanced deepfake detection framework. Extensive experiments validate the effectiveness of our CoT data construction pipeline, tailored reward mechanism, and feedback-driven synthetic data generation approach. Our method achieves performance competitive with state-of-the-art (SOTA) approaches across multiple cross-dataset benchmarks. Implementation details are available at https://github.com/deon1219/rlsbi.
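The general idea behind self-blended images can be illustrated with a toy sketch: a pseudo-fake is made by blending a slightly altered copy of an image back into the original through a mask, so the forged region is known without manual labeling. This is a simplified assumption-laden illustration, not the authors' pipeline: it uses tiny grayscale grids as nested lists, a brightness shift as the only alteration, and hypothetical helper names (`self_blend`, `mask_bounding_box`). How the paper turns such masks into CoT text is not described in the abstract and is not shown here.

```python
from typing import List, Optional, Tuple

Grid = List[List[float]]  # grayscale image or mask, values in [0, 1]


def self_blend(image: Grid, mask: Grid, brightness_shift: float = 0.1) -> Grid:
    """Blend a brightness-shifted copy of `image` into itself through `mask`.

    mask == 1 keeps the altered (source) pixel, mask == 0 keeps the original
    (target) pixel, and values in between mix the two.
    """
    blended: Grid = []
    for img_row, mask_row in zip(image, mask):
        row = []
        for pixel, m in zip(img_row, mask_row):
            source = min(1.0, max(0.0, pixel + brightness_shift))
            row.append(m * source + (1.0 - m) * pixel)
        blended.append(row)
    return blended


def mask_bounding_box(
    mask: Grid, threshold: float = 0.5
) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) of mask pixels above `threshold`.

    The max coordinates are exclusive; returns None for an empty mask.
    """
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, m in enumerate(row)
        if m > threshold
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Usage: a 3x3 flat image with only the center pixel blended.
image = [[0.5] * 3 for _ in range(3)]
mask = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
fake = self_blend(image, mask, brightness_shift=0.2)
region = mask_bounding_box(mask)
```

Because the mask is produced by the synthesis step itself, the location of the blended region comes for free, which is the property that makes this family of methods attractive when annotation is expensive.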

Problem

Research questions and friction points this paper is trying to address.

explainable deepfake detection

multimodal large language models

forgery attribution

annotation scarcity

high-fidelity deepfakes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Explainable Deepfake Detection

Reinforcement Learning

Self-Blended Images