🤖 AI Summary
Current misinformation detection research prioritizes accuracy over explainability and user-facing cognitive intervention, leaving the reasoning transcripts produced by multi-agent debate (MAD) underutilized for persuasive intervention. To address this, we propose ED2D, a framework that pioneers evidence-based multi-agent debate for interpretable refutation and cognitive recalibration against misinformation. ED2D integrates factual retrieval, controllable natural language generation, and persuasion mechanism design to realize end-to-end, evidence-driven debate. Methodologically, it orchestrates specialized agents to collaboratively construct logically grounded, human-aligned rebuttals. Empirically, ED2D achieves state-of-the-art detection performance across three benchmarks and generates expert-level explanatory rebuttals that improve users' future discernment. Crucially, it also reveals how flawed interpretability (e.g., misleading explanations attached to misclassifications) can exacerbate misconceptions, motivating a paradigm shift in misinformation governance from passive detection toward active intervention and persuasion. The code, models, and a public platform are fully open-sourced.
📝 Abstract
Multi-agent debate (MAD) frameworks have emerged as promising approaches for misinformation detection by simulating adversarial reasoning. While prior work has focused on detection accuracy, it overlooks the importance of helping users understand the reasoning behind factual judgments and develop resilience against future misinformation. The debate transcripts generated during MAD are a rich but underutilized resource for transparent reasoning. In this study, we introduce ED2D, an evidence-based MAD framework that extends previous approaches by incorporating factual evidence retrieval. More importantly, ED2D is designed not only as a detection framework but also as a persuasive multi-agent system aimed at correcting user beliefs and discouraging misinformation sharing. We compare the persuasive effects of ED2D-generated debunking transcripts with those authored by human experts. Results demonstrate that ED2D outperforms existing baselines across three misinformation detection benchmarks. When ED2D generates correct predictions, its debunking transcripts exhibit persuasive effects comparable to those of human experts. However, when ED2D misclassifies, its accompanying explanations may inadvertently reinforce users' misconceptions, even when presented alongside accurate human explanations. Our findings highlight both the promise and the potential risks of deploying MAD systems for misinformation intervention. We further develop a public community website that lets users explore ED2D, fostering transparency, critical thinking, and collaborative fact-checking.
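The abstract describes a pipeline of factual evidence retrieval followed by an adversarial debate whose transcript feeds a final judgment. A minimal toy sketch of such a loop is shown below; the function names, the in-memory corpus, and the judge's decision rule are illustrative assumptions for exposition, not the ED2D implementation (which uses LLM agents and real retrieval).

```python
from dataclasses import dataclass

# Toy evidence store standing in for a factual-retrieval component.
# Maps a normalized claim to (supported verdict, evidence snippet).
TOY_CORPUS = {
    "vaccines cause autism": ("false", "Large cohort studies report no association."),
    "water boils at 100 c at sea level": ("true", "Standard boiling point at 1 atm."),
}

@dataclass
class Turn:
    agent: str     # which debater produced this turn
    stance: str    # "true" or "false"
    argument: str  # the turn's text, kept for the debunking transcript

def retrieve_evidence(claim: str):
    """Return (verdict, snippet) for a claim, or None if nothing is retrieved."""
    return TOY_CORPUS.get(claim.lower().rstrip("."))

def debate(claim: str, rounds: int = 2):
    """Run an affirmative/negative debate, then let a judge weigh the evidence."""
    transcript = []
    hit = retrieve_evidence(claim)
    for r in range(1, rounds + 1):
        transcript.append(Turn("affirmative", "true",
                               f"Round {r}: the claim '{claim}' should be accepted."))
        rebuttal = hit[1] if hit else "no supporting evidence was found"
        transcript.append(Turn("negative", "false",
                               f"Round {r}: rebuttal citing evidence: {rebuttal}"))
    # Judge agent: here a trivial rule sides with whichever stance the
    # retrieved evidence supports, and abstains when retrieval is empty;
    # in a real MAD system an LLM judge would read the full transcript.
    verdict = hit[0] if hit else "unverified"
    return verdict, transcript
```

The transcript, not just the verdict, is the output of interest: it is the artifact that would be shown to users as an explanatory rebuttal, which is where the abstract's persuasion findings (and misclassification risks) apply.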