🤖 AI Summary
Manual fact verification cannot keep pace with the volume of content published on the Internet. Method: This paper presents a systematic survey of automated claim verification techniques in the large language model (LLM) era. It organizes recent LLM-driven frameworks around a common retrieval–reasoning–verification pipeline, with emphasis on the roles of retrieval-augmented generation (RAG) and controllable generation in ensuring factual consistency. The surveyed approaches combine RAG, prompt engineering, supervised fine-tuning, and multi-stage verification pipelines. Contribution/Results: The survey details each component of the pipeline, compares common design choices, and catalogs 12+ publicly available English datasets for the task, providing a methodological roadmap for building trustworthy, fact-aware LLM systems.
📝 Abstract
The large and ever-increasing amount of data available on the Internet, coupled with the laborious task of manual claim and fact verification, has sparked interest in the development of automated claim verification systems. Several deep learning and transformer-based models have been proposed for this task over the years. With the introduction of Large Language Models (LLMs) and their superior performance on several NLP tasks, we have seen a surge of LLM-based approaches to claim verification, along with the use of novel methods such as Retrieval Augmented Generation (RAG). In this survey, we present a comprehensive account of recent claim verification frameworks using LLMs. We describe the different components of the claim verification pipeline used in these frameworks in detail, including common approaches to retrieval, prompting, and fine-tuning. Finally, we describe publicly available English datasets created for this task.
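To make the retrieve-then-verify pipeline described above concrete, the sketch below shows its skeleton in Python. Every component is a toy stand-in and an assumption on our part, not any specific framework from the survey: retrieval is approximated by keyword overlap over a tiny in-memory corpus, and the prompt template simply illustrates what a real system would send to an LLM for the final SUPPORTED/REFUTED/NOT ENOUGH INFO judgment.

```python
# Toy sketch of a retrieve-then-verify claim verification pipeline.
# The corpus, retriever, and prompt format are illustrative assumptions,
# not components of any particular system surveyed.

CORPUS = [
    "The Eiffel Tower is located in Paris, France.",
    "Water boils at 100 degrees Celsius at sea level.",
    "The Great Wall of China is not visible from the Moon with the naked eye.",
]

def retrieve(claim: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by word overlap with the claim (a crude stand-in
    for BM25 or dense retrieval used in real RAG pipelines)."""
    claim_words = set(claim.lower().split())
    scored = sorted(corpus,
                    key=lambda p: -len(claim_words & set(p.lower().split())))
    return scored[:k]

def build_prompt(claim: str, evidence: list[str]) -> str:
    """Assemble the verification prompt a real pipeline would pass to an LLM."""
    evidence_block = "\n".join(f"- {e}" for e in evidence)
    return (
        f"Evidence:\n{evidence_block}\n\n"
        f"Claim: {claim}\n"
        "Based only on the evidence, answer with one of: "
        "SUPPORTED, REFUTED, NOT ENOUGH INFO."
    )

claim = "The Eiffel Tower is in Paris."
evidence = retrieve(claim, CORPUS)
prompt = build_prompt(claim, evidence)
print(prompt)
```

In a full system, the `print` call would be replaced by an LLM call, and the retriever by a search index over web-scale evidence; fine-tuned or prompted models then map the prompt to one of the three verdict labels.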