A Generative-AI-Driven Claim Retrieval System Capable of Detecting and Retrieving Claims from Social Media Platforms in Multiple Languages

📅 2025-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

The spread of online misinformation increases the burden on fact-checkers, in part because claims that have already been verified are checked again, which slows responses to new claims. To address this, we propose a multilingual claim retrieval system for social media content that supports cross-lingual semantic matching and interpretable feedback. Our key contribution is a two-stage matching approach, "LLM filtering + generative summarization": a lightweight LLM first prunes irrelevant claims, and generative summarization then aligns cross-lingual semantics and produces auditable, traceable justifications for each match. The system combines multilingual embedding-based retrieval with a human-in-the-loop evaluation framework. Experiments show a 37% reduction in false positives (per human evaluation), a 42% average reduction in verification decision time, and a 29% increase in inter-annotator consistency, supporting high-throughput, auditable, and timely fact-checking.

📝 Abstract

Online disinformation poses a global challenge, placing significant demands on fact-checkers, who must verify claims efficiently to prevent the spread of false information. A major issue in this process is the redundant verification of already fact-checked claims, which increases workload and delays responses to newly emerging claims. This research introduces an approach that retrieves previously fact-checked claims, evaluates their relevance to a given input, and provides supplementary information to support fact-checkers. Our method employs large language models (LLMs) to filter irrelevant fact-checks and generate concise summaries and explanations, enabling fact-checkers to assess more quickly whether a claim has been verified before. We evaluate our approach through both automatic and human assessments, in which humans interact with the developed tool to review its effectiveness. Our results demonstrate that LLMs can filter out many irrelevant fact-checks and thereby reduce effort and streamline the fact-checking process.
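The retrieve-then-filter flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every name here (`FactCheck`, `embed`, `retrieve`, `filter_relevant`, `keyword_judge`) is hypothetical, the bag-of-words `embed` is a toy stand-in for a real multilingual embedding model, and `keyword_judge` stands in for the LLM relevance call.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List


@dataclass
class FactCheck:
    claim: str    # text of the previously fact-checked claim
    verdict: str  # rating assigned by the fact-checker


def embed(text: str) -> Counter:
    # ASSUMPTION: toy bag-of-words vector; a real system would call a
    # multilingual embedding model here.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(post: str, store: List[FactCheck], top_k: int = 3) -> List[FactCheck]:
    # Stage 1: rank stored fact-checks by similarity to the post, keep top_k.
    query = embed(post)
    ranked = sorted(store, key=lambda fc: cosine(query, embed(fc.claim)), reverse=True)
    return ranked[:top_k]


def filter_relevant(
    post: str,
    candidates: List[FactCheck],
    is_relevant: Callable[[str, FactCheck], bool],
) -> List[FactCheck]:
    # Stage 2: drop candidates the relevance judge rejects.
    return [fc for fc in candidates if is_relevant(post, fc)]


def keyword_judge(post: str, fc: FactCheck) -> bool:
    # ASSUMPTION: placeholder for an LLM relevance decision; it only
    # requires at least two shared tokens.
    shared = set(post.lower().split()) & set(fc.claim.lower().split())
    return len(shared) >= 2


store = [
    FactCheck("vaccines cause autism", "false"),
    FactCheck("5g towers spread the virus", "false"),
    FactCheck("drinking bleach cures the virus", "false"),
]
post = "new study says 5g towers spread covid"
candidates = retrieve(post, store, top_k=2)
kept = filter_relevant(post, candidates, keyword_judge)
```

In this sketch, retrieval deliberately over-fetches (`top_k=2`) and the second stage removes the candidate that shares no content with the post, which mirrors the division of labor the abstract describes: retrieval for recall, LLM filtering for precision. The summary and explanation generation step is omitted.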

Problem

Research questions and friction points this paper is trying to address.

Detecting multilingual claims from social media for fact-checking

Reducing redundant verification of already fact-checked claims

Streamlining fact-checking with LLM-filtered summaries and explanations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs to filter irrelevant fact-checks

Generates summaries for faster claim assessment

Supports multiple languages on social media
