Detecting Manipulated Contents Using Knowledge-Grounded Inference

📅 2025-04-29

📈 Citations: 0

✨ Influential: 0

career value

170K/year

🤖 AI Summary

To address the challenge of detecting zero-day manipulated content—unfeasible for static models or manual contextual analysis—this paper proposes a real-time knowledge-injection framework for fake news detection. Our method dynamically retrieves contextual evidence via mainstream search engines and performs knowledge-grounded reasoning using a Retrieval-Augmented Generation (RAG)-enhanced large language model, enabling veracity assessment and interpretable explanations without requiring pre-trained embedded knowledge or human annotation. The core contribution is the first real-time retrieval–reasoning closed-loop mechanism specifically designed for zero-day manipulation. Evaluated on a curated dataset of 4,270 samples, our approach achieves an F1-score of 0.856 and outperforms the current state-of-the-art by 1.9× on established fact-checking benchmarks.

Technology Category

Application Category

📝 Abstract

The detection of manipulated content, a prevalent form of fake news, has been widely studied in recent years. While existing solutions have been proven effective in fact-checking and analyzing fake news based on historical events, the reliance on either intrinsic knowledge obtained during training or manually curated context hinders them from tackling zero-day manipulated content, which can only be recognized with real-time contextual information. In this work, we propose Manicod, a tool designed for detecting zero-day manipulated content. Manicod first sources contextual information about the input claim from mainstream search engines, and subsequently vectorizes the context for the large language model (LLM) through retrieval-augmented generation (RAG). The LLM-based inference can produce a"truthful"or"manipulated"decision and offer a textual explanation for the decision. To validate the effectiveness of Manicod, we also propose a dataset comprising 4270 pieces of manipulated fake news derived from 2500 recent real-world news headlines. Manicod achieves an overall F1 score of 0.856 on this dataset and outperforms existing methods by up to 1.9x in F1 score on their benchmarks on fact-checking and claim verification.

Problem

Research questions and friction points this paper is trying to address.

Detecting zero-day manipulated content in real-time

Overcoming reliance on historical data for fake news detection

Improving accuracy in identifying manipulated claims using LLMs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses search engines for real-time context

Applies RAG for LLM vectorization

LLM-based inference with explanations

🔎 Similar Papers

Exploring Saliency Bias in Manipulation Detection