Debating Truth: Debate-driven Claim Verification with Multiple Large Language Model Agents

📅 2025-07-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To verify complex claims against heterogeneous, multi-source evidence, this paper proposes what it describes as the first claim verification framework based on debate among multiple LLM agents. Two debaters, a proponent and an opponent, argue over several rounds and produce an interpretable reasoning chain; a judge (the Moderator in the paper's terminology) then weighs the arguments and issues the final factual verdict. To overcome the scarcity of annotated debate data, the authors synthesize debate data with the zero-shot version of the framework and use it to post-train the judge, improving its discriminative ability. Evaluated across diverse evidence-quality scenarios, the approach outperforms existing state-of-the-art methods, with reported absolute accuracy improvements of 3.2–5.7 percentage points on benchmarks including FEVER and FEVEROUS. The source code and synthesized dataset are publicly released.

📝 Abstract

Claim verification is critical for enhancing digital literacy. However, state-of-the-art single-LLM methods struggle with complex claim verification that involves multi-faceted evidence. Inspired by real-world fact-checking practices, we propose DebateCV, the first claim verification framework that adopts a debate-driven methodology using multiple LLM agents. In our framework, two Debaters take opposing stances on a claim and engage in multi-round argumentation, while a Moderator evaluates the arguments and renders a verdict with justifications. To further improve the performance of the Moderator, we introduce a novel post-training strategy that leverages synthetic debate data generated by the zero-shot DebateCV, effectively addressing the scarcity of real-world debate-driven claim verification data. Experimental results show that our method outperforms existing claim verification methods under varying levels of evidence quality. Our code and dataset are publicly available at https://anonymous.4open.science/r/DebateCV-6781.
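The debate flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `run_debate`, the prompt wording, the "Affirmative"/"Negative" labels, and the default of two rounds are all assumptions made for illustration, and each agent is modeled as a plain callable from prompt to reply so that any LLM client could be plugged in.

```python
from typing import Callable, List, Tuple

# Hypothetical agent type: any function that maps a prompt to a model reply.
Agent = Callable[[str], str]


def run_debate(
    claim: str,
    evidence: List[str],
    affirmative: Agent,
    negative: Agent,
    moderator: Agent,
    rounds: int = 2,  # assumed default; the abstract only says "multi-round"
) -> Tuple[str, List[str]]:
    """Run a two-sided debate and return (moderator verdict, transcript)."""
    context = f"Claim: {claim}\nEvidence:\n" + "\n".join(evidence)
    transcript: List[str] = []

    for round_no in range(1, rounds + 1):
        # Each Debater sees the full debate so far and argues a fixed stance.
        for name, stance, agent in (
            ("Affirmative", "true", affirmative),
            ("Negative", "false", negative),
        ):
            history = "\n".join(transcript)
            prompt = (
                f"{context}\nDebate so far:\n{history}\n"
                f"Argue that the claim is {stance}."
            )
            transcript.append(f"[Round {round_no}] {name}: {agent(prompt)}")

    # The Moderator reads the whole debate and renders a justified verdict.
    verdict_prompt = (
        f"{context}\nDebate:\n" + "\n".join(transcript)
        + "\nGive a verdict with a justification."
    )
    return moderator(verdict_prompt), transcript
```

With stub agents such as `lambda prompt: "yes"`, the function can be exercised without any model access; in practice each callable would wrap a call to an LLM.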

Problem

Research questions and friction points this paper is trying to address.

Verifying complex claims with multi-faceted evidence

Addressing scarcity of real-world debate-driven verification data

Improving claim verification accuracy using multi-agent debates

Innovation

Methods, ideas, or system contributions that make the work stand out.

Debate-driven verification with multiple LLM agents

Multi-round argumentation between opposing debaters

Post-training with synthetic debate data
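As a rough illustration of the last point, the sketch below turns zero-shot debate transcripts into supervised records for post-training the Moderator. The recipe is an assumption, not the paper's documented procedure: the record fields (`prompt`, `target`), the input keys (`claim`, `evidence`, `label`), and the choice to pair each transcript with the gold label are all hypothetical, and `debate_fn` stands in for whatever produces a debate transcript.

```python
import json
from typing import Callable, Dict, List

# Hypothetical signature: (claim, evidence) -> list of debate turns.
DebateFn = Callable[[str, List[str]], List[str]]


def build_records(examples: List[Dict], debate_fn: DebateFn) -> List[Dict[str, str]]:
    """Pair each synthesized debate with its gold label (assumed recipe)."""
    records = []
    for ex in examples:
        transcript = debate_fn(ex["claim"], ex["evidence"])
        prompt = (
            f"Claim: {ex['claim']}\nEvidence:\n" + "\n".join(ex["evidence"])
            + "\nDebate:\n" + "\n".join(transcript)
        )
        # The training target is the gold verdict from the labeled dataset.
        records.append({"prompt": prompt, "target": ex["label"]})
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(r) for r in records)
```

Other designs are possible, for example keeping only debates whose zero-shot verdict agrees with the gold label; the summary above does not say which variant the authors use.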

🔎 Similar Papers

Adversarial Multi-Agent Evaluation of Large Language Models through Iterative Debates