🤖 AI Summary
Existing misinformation detection methods predominantly rely on black-box sequence classification, offering no interpretable way to distinguish factual inaccuracies from flaws in reasoning. This paper introduces the first explainable misinformation detection framework that integrates the theory of Argumentation Schemes with Critical Questions (CQs), jointly modelling classification and question answering to detect both factual and rational misinformation while generating natural-language, question-style explanations. Key contributions include: (1) an explainable detection paradigm grounded in argumentation structure; (2) the release of NLAS-CQ, the first large-scale corpus of natural-language argumentation scheme instances paired with CQs (3,566 argument instances and 4,687 annotated CQ answers); and (3) statistically significant improvements over black-box baselines across multiple metrics, demonstrating both high accuracy and strong interpretability, thereby enhancing user trust and system transparency.
📝 Abstract
Natural language misinformation detection approaches have, to date, been largely dependent on sequence classification methods, producing opaque systems in which the reasons behind a classification as misinformation are unclear. While efforts have been made in automated fact-checking to propose explainable approaches to the problem, this is not the case for automated reason-checking systems. In this paper, we propose a new explainable framework for both factual and rational misinformation detection based on the theory of Argumentation Schemes and Critical Questions. For that purpose, we create and release NLAS-CQ, the first corpus combining 3,566 textbook-like natural language argumentation scheme instances with 4,687 corresponding answers to the critical questions associated with these arguments. On the basis of this corpus, we implement and validate our new framework, which combines classification with question answering to analyse arguments in search of misinformation and provides explanations to the human user in the form of critical questions.
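The abstract only describes the framework at a high level, so the sketch below is a minimal, hypothetical illustration of how such a pipeline could be wired together: a scheme classifier selects the Critical Questions relevant to an argument, a question-answering step evaluates each CQ, and unfavourable answers are surfaced as the natural-language explanation. All names here (`classify_scheme`, `answer_cq`, the example CQ templates) are illustrative assumptions, not the authors' actual implementation or the contents of NLAS-CQ.

```python
from dataclasses import dataclass

# Hypothetical Critical Questions per argumentation scheme (illustrative only;
# the real NLAS-CQ corpus covers many schemes with annotated CQ answers).
CRITICAL_QUESTIONS = {
    "expert_opinion": [
        "Is the cited source a genuine expert in the relevant domain?",
        "Is the assertion consistent with what other experts say?",
    ],
    "cause_to_effect": [
        "Is there solid evidence that the stated cause actually occurs?",
        "Could some other factor account for the stated effect?",
    ],
}

@dataclass
class AnalysisResult:
    scheme: str
    misinformation: bool
    explanations: list  # CQs whose answers attack the argument

def classify_scheme(argument: str) -> str:
    """Placeholder for a trained argumentation-scheme classifier."""
    # A real system would call a fine-tuned sequence classifier here.
    return "expert_opinion"

def answer_cq(argument: str, question: str) -> bool:
    """Placeholder QA step: True if the CQ is satisfied, False if its answer
    attacks the argument. A real system would use a QA/NLI model."""
    return "expert" in argument.lower()

def analyse(argument: str) -> AnalysisResult:
    scheme = classify_scheme(argument)
    failed = [q for q in CRITICAL_QUESTIONS.get(scheme, [])
              if not answer_cq(argument, q)]
    # Flag potential misinformation when any critical question is unsatisfied,
    # and return those questions as the human-readable explanation.
    return AnalysisResult(scheme=scheme,
                          misinformation=bool(failed),
                          explanations=failed)

if __name__ == "__main__":
    result = analyse("According to a celebrity, vaccines are unsafe.")
    print(result.scheme, result.misinformation)
    for q in result.explanations:
        print("-", q)
```

In this toy setup the celebrity-endorsement argument fails both expert-opinion CQs, so it is flagged and the unanswered questions double as the explanation, which mirrors the classification-plus-question-answering design the paper proposes without reproducing its models.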