Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

📅 2025-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address data scarcity and limited reasoning capability in multimodal video misinformation detection, this paper introduces FakeVV, the first large-scale, diverse benchmark for the task, comprising over 100,000 video-text pairs. It further proposes Fact-R1, a three-stage collaborative reinforcement learning framework that combines long chain-of-thought (CoT) instruction tuning, Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO). Fact-R1 uses verifiable reward functions and multimodal alignment modeling to improve both detection accuracy and interpretability. Experimental results show that Fact-R1 achieves a 12.7% absolute improvement over state-of-the-art methods on FakeVV. It also produces fine-grained attribution and human-verifiable reasoning traces, improving transparency and trustworthiness in multimodal misinformation detection.

📝 Abstract

The rapid spread of multimodal misinformation on social media has raised growing concerns, while research on video misinformation detection remains limited due to the lack of large-scale, diverse datasets. Existing methods often overfit to rigid templates and lack deep reasoning over deceptive content. To address these challenges, we introduce FakeVV, a large-scale benchmark comprising over 100,000 video-text pairs with fine-grained, interpretable annotations. We further propose Fact-R1, a novel framework that integrates deep reasoning with collaborative rule-based reinforcement learning. Fact-R1 is trained through a three-stage process: (1) misinformation long-Chain-of-Thought (CoT) instruction tuning, (2) preference alignment via Direct Preference Optimization (DPO), and (3) Group Relative Policy Optimization (GRPO) using a novel verifiable reward function. This enables Fact-R1 to exhibit emergent reasoning behaviors comparable to those observed in advanced text-based reinforcement learning systems, but in the more complex multimodal misinformation setting. Our work establishes a new paradigm for misinformation detection, bridging large-scale video understanding, reasoning-guided alignment, and interpretable verification.
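
For readers unfamiliar with stage (2), the following is a minimal sketch of the standard DPO objective for a single preference pair. It is not the paper's implementation: the function name `dpo_loss`, the `beta` default of 0.1, and the use of scalar sequence log-probabilities are assumptions made for illustration only.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All inputs are summed sequence log-probabilities (floats).
    `beta` is an illustrative default, not a value taken from the paper.
    """
    # How much more the policy favors each response than the frozen reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

The loss equals log 2 when the policy and the reference agree exactly, and it falls as the policy assigns relatively more probability to the preferred reasoning trace than the reference does.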

Problem

Research questions and friction points this paper is trying to address.

Lack of large-scale datasets for video misinformation detection

Existing methods overfit to rigid templates without deep reasoning

Need for interpretable multimodal misinformation detection frameworks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale FakeVV benchmark with interpretable annotations

Fact-R1 integrates deep reasoning and rule-based reinforcement learning

Three-stage training: CoT tuning, DPO alignment, GRPO optimization
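
The GRPO stage can be illustrated with a small sketch of how group-relative advantages are typically computed from rule-based rewards. Everything below is hypothetical: the `<answer>` tag format, the 0.5 and 1.0 reward weights, and the function names are assumptions for illustration. The summary does not specify the paper's actual verifiable reward function, which may differ substantially.

```python
import re
from statistics import mean, pstdev


def verifiable_reward(response: str, gold_label: str) -> float:
    """Hypothetical rule-based reward: a format check plus a label match."""
    match = re.search(r"<answer>\s*(real|fake)\s*</answer>", response, re.IGNORECASE)
    if match is None:
        return 0.0  # malformed output earns no reward
    format_reward = 0.5  # assumed weight for a well-formed answer tag
    accuracy_reward = 1.0 if match.group(1).lower() == gold_label else 0.0
    return format_reward + accuracy_reward


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of sampled responses for the same video-text pair (toy data).
group = [
    "The caption misdates the event. <answer>fake</answer>",
    "Nothing looks inconsistent. <answer>real</answer>",
    "I am not sure.",
]
rewards = [verifiable_reward(r, gold_label="fake") for r in group]
advantages = group_relative_advantages(rewards)
```

Because advantages are computed relative to the other samples in the same group, no separate value model is needed; responses that score above the group average are reinforced and those below it are discouraged.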
