🤖 AI Summary
To address the lack of verifiable evidence in large language model (LLM)-based fact-checking and the limited coverage and outdatedness of knowledge graphs (KGs), this paper proposes a hybrid fact-checking framework integrating DBpedia KG retrieval, prompt-driven LLM classification, rule-based logical reasoning, and dynamic invocation of web search agents. Its key contribution is a zero-shot fallback mechanism: when both KG and LLM fail to determine veracity, the system automatically triggers real-time web search to retrieve up-to-date evidence—significantly improving verification capability for “information-scarce” claims. Evaluated on the FEVER benchmark’s Supported/Refuted binary classification task, the system achieves an F1 score of 0.93. Further validation via re-annotation confirms its ability to uncover critical evidence missed in the original annotations. The framework thus delivers high accuracy, strong interpretability, and broad coverage without requiring model fine-tuning.
📝 Abstract
Large language models (LLMs) excel at generating fluent text but can lack reliable grounding in verified information. At the same time, knowledge-graph-based fact-checkers deliver precise and interpretable evidence, yet suffer from limited coverage and update latency. By integrating LLMs with knowledge graphs and real-time search agents, we introduce a hybrid fact-checking approach that leverages the individual strengths of each component. Our system comprises three autonomous steps: 1) Knowledge Graph (KG) retrieval for rapid one-hop lookups in DBpedia, 2) LLM-based classification guided by a task-specific labeling prompt, producing outputs with internal rule-based logic, and 3) a web search agent invoked only when KG coverage is insufficient. Our pipeline achieves an F1 score of 0.93 on the FEVER benchmark's Supported/Refuted split without task-specific fine-tuning. To address Not Enough Information (NEI) cases, we conduct a targeted re-annotation study showing that our approach frequently uncovers valid evidence for claims originally labeled NEI, as confirmed by both expert annotators and LLM reviewers. With this paper, we present a modular, open-source fact-checking pipeline with fallback strategies and generalization across datasets.
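The three-step pipeline with its zero-shot fallback can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `kg_lookup`, `llm_classify`, and `web_search` are hypothetical stand-ins for the DBpedia one-hop retrieval, the prompt-driven LLM labeling step, and the real-time search agent, respectively, and the KG is mocked as a list of triples.

```python
# Hypothetical sketch of the hybrid fact-checking pipeline described above.
# All three components are simplified stubs; real versions would query the
# DBpedia SPARQL endpoint, call an LLM with a labeling prompt, and invoke
# a live web search agent.

def kg_lookup(claim, kg):
    """Step 1 (stub): one-hop retrieval — return triples whose subject
    is mentioned in the claim."""
    return [t for t in kg if t[0].lower() in claim.lower()]

def llm_classify(claim, evidence):
    """Step 2 (stub): prompt-guided classification over retrieved
    evidence; returns a label, or None to signal abstention."""
    for subj, pred, obj in evidence:
        if obj.lower() in claim.lower():
            return "Supported"
    return None  # insufficient signal -> trigger the fallback

def web_search(claim):
    """Step 3 (stub): real-time web search agent, invoked only when the
    KG + LLM steps cannot decide."""
    return "Not Enough Information"

def check_claim(claim, kg):
    """Run the pipeline: KG retrieval, then LLM labeling, then the
    zero-shot web-search fallback."""
    evidence = kg_lookup(claim, kg)
    if evidence:
        label = llm_classify(claim, evidence)
        if label is not None:
            return label
    return web_search(claim)

kg = [("Paris", "capitalOf", "France")]
print(check_claim("Paris is the capital of France", kg))  # Supported
print(check_claim("Atlantis sank in one day", kg))        # falls back
```

The key design point mirrored here is that the search agent is only reached when both upstream components fail to commit to a label, keeping the fast, interpretable KG path as the default.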