Enhancing Fake News Video Detection via LLM-Driven Creative Process Simulation

📅 2025-10-05
🤖 AI Summary
Fake news video detection suffers from scarce and insufficiently diverse training data, as well as spurious pattern biases—rooted in the complex many-to-many mapping between video segments and fabricated events, which existing datasets fail to model realistically. To address this, we propose AgentAug, the first large language model–driven framework for simulating creative misinformation generation processes. AgentAug explicitly models the segment–event many-to-many relationship via a multi-path generative mechanism that emulates four canonical fabrication strategies, and integrates uncertainty-based active learning to enhance both the quality and efficiency of data augmentation. Experiments on two benchmark datasets demonstrate that AgentAug significantly improves the performance of state-of-the-art detection models, effectively mitigating overfitting induced by data scarcity and bias, while substantially boosting model generalization.

📝 Abstract
The emergence of fake news on short video platforms has become a significant societal concern, necessitating automatic detection methods tailored to news videos. Current detectors primarily rely on pattern-based features to separate fake news videos from real ones, but limited and insufficiently diverse training data lead to biased patterns and hinder their performance. This weakness stems from the complex many-to-many relationships between video material segments and fabricated news events in real-world scenarios: a single video clip can be used in multiple ways to create different fake narratives, while a single fabricated event often combines multiple distinct video segments. Existing datasets do not adequately reflect such relationships because large-scale real-world data are difficult to collect and annotate, which results in sparse coverage and incomplete learning of how fake news videos can be created. To address this issue, we propose AgentAug, a data augmentation framework that generates diverse fake news videos by simulating typical creative processes. AgentAug implements multiple LLM-driven pipelines covering four fabrication categories for news video creation, combined with an active learning strategy based on uncertainty sampling that selects potentially useful augmented samples during training. Experimental results on two benchmark datasets demonstrate that AgentAug consistently improves the performance of short video fake news detectors.
Problem

Research questions and friction points this paper is trying to address.

Detecting fake news videos on short video platforms automatically
Overcoming limited training data diversity causing biased detection patterns
Addressing complex many-to-many relationships in fake video creation
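The many-to-many relationship named in the last item can be pictured as a bipartite index between video segments and fabricated events. The sketch below is only an illustration of that structure, not the paper's data model: the segment and event identifiers, the `usages` list, and the `build_bipartite_index` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (segment_id, event_id) pairs; the identifiers are made up
# for illustration and do not come from the paper or its datasets.
usages = [
    ("clip_flood", "event_dam_collapse"),
    ("clip_flood", "event_city_evacuation"),
    ("clip_crowd", "event_city_evacuation"),
    ("clip_siren", "event_city_evacuation"),
]


def build_bipartite_index(pairs):
    """Index the same pairs in both directions of the many-to-many relation."""
    segment_to_events = defaultdict(set)
    event_to_segments = defaultdict(set)
    for segment, event in pairs:
        segment_to_events[segment].add(event)
        event_to_segments[event].add(segment)
    return dict(segment_to_events), dict(event_to_segments)


segment_to_events, event_to_segments = build_bipartite_index(usages)

# One clip reused across several fabricated events.
print(sorted(segment_to_events["clip_flood"]))
# One fabricated event assembled from several clips.
print(sorted(event_to_segments["event_city_evacuation"]))
```

A dataset with sparse coverage would show most segments mapped to a single event in such an index, which is the gap the augmentation is meant to fill.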
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven pipelines simulate creative fabrication processes
Generates diverse fake news videos for data augmentation
Active learning selects useful augmented samples during training
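The active-learning step in the last item can be sketched as uncertainty sampling over augmented samples. The abstract names uncertainty sampling but not the scoring rule, so the entropy criterion here is an assumption, and `binary_entropy`, `select_uncertain`, the sample IDs, and the probabilities are hypothetical stand-ins for a detector's outputs.

```python
import math


def binary_entropy(p_fake: float) -> float:
    """Entropy of a fake/real prediction; highest when p_fake is near 0.5."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p_fake, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_uncertain(sample_ids, fake_probs, k):
    """Return the k sample IDs whose predictions are most uncertain."""
    scored = sorted(
        zip(sample_ids, fake_probs),
        key=lambda pair: binary_entropy(pair[1]),
        reverse=True,
    )
    return [sample_id for sample_id, _ in scored[:k]]


# Hypothetical detector outputs for four augmented videos.
ids = ["aug_a", "aug_b", "aug_c", "aug_d"]
probs = [0.98, 0.55, 0.10, 0.47]

picked = select_uncertain(ids, probs, k=2)
print(picked)  # ['aug_d', 'aug_b']
```

Samples the detector already classifies confidently (here `aug_a` and `aug_c`) are skipped, so training effort goes to the augmented videos the model finds hardest to judge.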
Yuyan Bu
Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences
Qiang Sheng
Chinese Academy of Sciences
fake news detection, fact checking, LLM safety
Juan Cao
Professor of Mathematics, Xiamen University
Computer Aided Geometric Design, Computer Graphics
Shaofei Wang
Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences
Peng Qi
National University of Singapore
Yuhui Shi
Chair Professor, Computer Science and Engineering, Southern University of Science and Technology
Evolutionary Computation, Swarm Intelligence, Particle Swarm Optimization Algorithm, Brain Storm Optimization
Beizhe Hu
Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences