🤖 AI Summary
To address the proliferation of multimodal misinformation and the inability of existing methods to model coordinated image-text deceptive patterns, this paper proposes Holmes, an end-to-end automated fact-checking framework. Holmes integrates LLM-driven claim summarization, a novel evidence quality metric and ranking algorithm, multi-stage prompt engineering, and chain-of-thought reasoning into a closed-loop "retrieve–verify–attribute" pipeline. Its improved evidence retrieval boosts fact-checking accuracy by 30.8% over existing retrieval methods, and the full framework achieves 88.3% accuracy on two mainstream open-source multimodal datasets and 90.2% on real-time fact-checking tasks, significantly outperforming state-of-the-art methods. Its core contribution is a trustworthy, interpretable, and robust paradigm for multimodal fact-checking.
📝 Abstract
The rise of Internet connectivity has accelerated the spread of disinformation, threatening societal trust, decision-making, and national security. Disinformation has evolved from simple text to complex multimodal forms combining images and text, challenging existing detection methods. Traditional deep learning models struggle to capture the complexity of multimodal disinformation. Inspired by advances in AI, this study explores using Large Language Models (LLMs) for automated disinformation detection. The empirical study shows that (1) LLMs alone cannot reliably assess the truthfulness of claims; (2) providing relevant evidence significantly improves their performance; (3) however, LLMs cannot autonomously search for accurate evidence. To address this, we propose Holmes, an end-to-end framework featuring a novel evidence retrieval method that assists LLMs in collecting high-quality evidence. Our approach uses (1) LLM-powered summarization to extract key information from open sources and (2) a new algorithm and metrics to evaluate evidence quality. Holmes enables LLMs to verify claims and generate justifications effectively. Experiments show Holmes achieves 88.3% accuracy on two open-source datasets and 90.2% in real-time verification tasks. Notably, our improved evidence retrieval boosts fact-checking accuracy by 30.8% over existing methods.
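The retrieve-then-verify pipeline described above can be sketched in outline. This is a minimal illustration, not the paper's implementation: the function names, the token-overlap quality score (a stand-in for the paper's evidence quality metric), and the stubbed-out LLM steps are all assumptions for clarity.

```python
# Hypothetical sketch of a Holmes-style pipeline: summarize the claim,
# score and rank candidate evidence, then hand the top evidence to an
# LLM verifier. The overlap-based quality score is a toy stand-in for
# the paper's actual metric; real LLM calls are stubbed as comments.
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str
    text: str

STOPWORDS = {"the", "a", "an", "is", "are", "was", "of", "in", "on", "by", "that"}

def summarize_claim(claim: str) -> set:
    # Stand-in for LLM-powered summarization: keep content words as key terms.
    return {w for w in claim.lower().split() if w not in STOPWORDS}

def evidence_quality(key_terms: set, ev: Evidence) -> float:
    # Toy quality metric: fraction of the claim's key terms the evidence covers.
    terms = set(ev.text.lower().split())
    return len(key_terms & terms) / max(len(key_terms), 1)

def rank_evidence(claim: str, pool: list, top_k: int = 3) -> list:
    # Rank candidate evidence by the quality metric and keep the top-k items.
    key_terms = summarize_claim(claim)
    ranked = sorted(pool, key=lambda ev: evidence_quality(key_terms, ev), reverse=True)
    return ranked[:top_k]

def build_verification_prompt(claim: str, pool: list) -> str:
    # In the full framework, this prompt (claim + ranked evidence) would be
    # sent to an LLM with chain-of-thought instructions to produce a verdict
    # and justification; here we only assemble it.
    lines = [f"Claim: {claim}", "Evidence:"]
    for ev in rank_evidence(claim, pool):
        lines.append(f"- ({ev.source}) {ev.text}")
    lines.append("Decide whether the claim is supported and explain why.")
    return "\n".join(lines)
```

The key design point the abstract emphasizes is the middle step: filtering retrieved material through an explicit quality metric before verification, rather than passing raw search results to the LLM.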