🤖 AI Summary
The proliferation of online misinformation intensifies the burden on fact-checkers, particularly through redundant verification of already-validated claims, which severely degrades response timeliness. To address this, we propose a multilingual claim retrieval system tailored to social media content, supporting cross-lingual semantic matching and interpretable feedback. Our key contribution is a two-stage matching paradigm, “LLM filtering + generative summarization”: a lightweight LLM first prunes irrelevant claims, and generative summarization then aligns cross-lingual semantics and produces auditable, traceable matching justifications. The system integrates multilingual embedding-based retrieval with a human-in-the-loop evaluation framework. Experiments demonstrate significant improvements: a 37% reduction in false positives (per human evaluation), a 42% average reduction in verification decision time, and a 29% increase in inter-annotator consistency. Together, these enable high-throughput, auditable, and timely fact-checking.
📝 Abstract
Online disinformation poses a global challenge, placing significant demands on fact-checkers who must verify claims efficiently to prevent the spread of false information. A major issue in this process is the redundant verification of already fact-checked claims, which increases workload and delays responses to newly emerging claims. This research introduces an approach that retrieves previously fact-checked claims, evaluates their relevance to a given input, and provides supplementary information to support fact-checkers. Our method employs large language models (LLMs) to filter irrelevant fact-checks and generate concise summaries and explanations, enabling fact-checkers to assess more quickly whether a claim has been verified before. In addition, we evaluate our approach through both automatic and human assessments, in which humans interact with the developed tool to review its effectiveness. Our results demonstrate that LLMs are able to filter out many irrelevant fact-checks and therefore reduce effort and streamline the fact-checking process.
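The retrieve-then-filter flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a multilingual sentence encoder, and `keyword_judge` is a hypothetical placeholder for the LLM relevance judgment. All function names and the example data are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    # Toy bag-of-words vector (assumption); a real system would use
    # a multilingual sentence encoder here.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, fact_checks: List[str], k: int = 3) -> List[Tuple[str, float]]:
    # Stage 1: rank stored fact-checks by similarity to the input and keep the top k.
    q = embed(query)
    scored = [(fc, cosine(q, embed(fc))) for fc in fact_checks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def filter_candidates(
    query: str,
    candidates: List[Tuple[str, float]],
    is_relevant: Callable[[str, str], bool],
) -> List[str]:
    # Stage 2: drop candidates that the judge (an LLM in the paper) deems irrelevant.
    return [fc for fc, _ in candidates if is_relevant(query, fc)]


def keyword_judge(query: str, fact_check: str) -> bool:
    # Hypothetical stand-in for an LLM call: relevant if the texts share
    # at least two words longer than three characters.
    long_words = {w for w in query.lower().split() if len(w) > 3}
    return len(long_words & set(fact_check.lower().split())) >= 2


fact_checks = [
    "vaccines do not cause autism",
    "the moon landing was not staged",
    "drinking bleach does not cure covid",
]
post = "new study says vaccines cause autism in children"

candidates = retrieve(post, fact_checks, k=2)
kept = filter_candidates(post, candidates, keyword_judge)
print(kept)
```

In this toy run, retrieval returns two candidates and the filter keeps only the vaccine fact-check. Passing the judge as a callable keeps the sketch independent of any particular LLM API; a summarization step for the surviving candidates would follow the same pattern.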