AI Summary
Problem: Large language models (LLMs) exhibit unreliable reasoning, hallucination-prone outputs, and low accuracy in logical fallacy classification. Method: We propose a lightweight instruction intervention framework inspired by deliberate System 2 reasoning. It constructs a fine-grained, stepwise reasoning instruction dataset and integrates a relation-aware, knowledge graph-based verification mechanism to impose structural constraints on the reasoning process and make it more interpretable. Our approach decomposes fallacy identification into binary subproblems, employs instruction tuning, and tightly couples knowledge graph grounding with inference. Contribution/Results: The method significantly improves accuracy on logical fallacy detection while substantially mitigating hallucinations. It offers a low-cost, highly transparent pathway for neuro-symbolic integration, advancing the trustworthy deployment of LLMs in critical reasoning tasks.
Abstract
Large Language Models (LLMs) suffer from critical reasoning gaps, including a tendency to hallucinate and poor accuracy in classifying logical fallacies. This limitation stems from their default System 1 processing, which is fast and intuitive, whereas reliable reasoning requires the deliberate, effortful System 2 approach (Kahneman, 2011; Li et al., 2025). Since full System 2 training is often prohibitively expensive, we explore a low-cost, instruction-based intervention to bridge this gap. Our methodology introduces a novel stepwise instruction dataset that decomposes fallacy classification into a series of atomic procedural steps, each a simple binary question. We further augment this with a final verification step in which models consult a relational knowledge graph linking related fallacies. This procedural, rule-based intervention yields a significant improvement in LLM logical fallacy classification. Crucially, the approach also provides enhanced transparency into the LLMs' decision-making, highlighting a practical pathway for neuro-symbolic architectures to address LLM reasoning deficits.
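As a rough illustration of the procedure described in the abstract, the following Python sketch walks a text through binary questions for each fallacy and then re-checks the fallacies that a small relation graph marks as related. It is a minimal sketch under stated assumptions, not the authors' implementation: the question wording, the two fallacy labels, the `RELATED` graph, and the `canned_answerer` stand-in for a model call are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical binary questions (invented wording, not taken from the dataset).
Q_PERSON = "Does the argument refer to a person rather than only to a claim?"
Q_CHARACTER = "Is the person's character given as the reason to reject the claim?"
Q_STATUS = "Is the person's status given as the reason to accept the claim?"

# Hypothetical stepwise instructions: one ordered question list per fallacy.
STEPS: Dict[str, List[str]] = {
    "ad hominem": [Q_PERSON, Q_CHARACTER],
    "appeal to authority": [Q_PERSON, Q_STATUS],
}

# Hypothetical relation graph: edges link fallacies that are easy to confuse.
RELATED: Dict[str, List[str]] = {
    "ad hominem": ["appeal to authority"],
    "appeal to authority": ["ad hominem"],
}

Answerer = Callable[[str, str], bool]  # (text, question) -> yes/no
StepTrace = List[Tuple[str, bool]]


def run_steps(text: str, fallacy: str, ask: Answerer) -> Tuple[bool, StepTrace]:
    """Ask each binary question in order and stop at the first 'no'."""
    trace: StepTrace = []
    for question in STEPS[fallacy]:
        answer = ask(text, question)
        trace.append((question, answer))
        if not answer:
            return False, trace
    return True, trace


def classify(
    text: str, ask: Answerer
) -> Tuple[Optional[str], List[str], Dict[str, StepTrace]]:
    """Return (label, rival labels found during verification, per-fallacy traces)."""
    traces: Dict[str, StepTrace] = {}
    label: Optional[str] = None
    for fallacy in STEPS:
        matched, traces[fallacy] = run_steps(text, fallacy, ask)
        if matched:
            label = fallacy
            break
    # Verification step: re-check the neighbours of the chosen label in the graph.
    rivals: List[str] = []
    if label is not None:
        for neighbour in RELATED.get(label, []):
            matched, traces[neighbour] = run_steps(text, neighbour, ask)
            if matched:
                rivals.append(neighbour)
    return label, rivals, traces


def canned_answerer(answers: Dict[str, bool]) -> Answerer:
    """Toy stand-in for a model call: looks answers up, defaulting to 'no'."""
    def ask(text: str, question: str) -> bool:
        return answers.get(question, False)
    return ask


if __name__ == "__main__":
    ask = canned_answerer({Q_PERSON: True, Q_CHARACTER: True})
    label, rivals, traces = classify("His data is wrong because he is a liar.", ask)
    print(label, rivals)
    for fallacy, steps in traces.items():
        print(fallacy, steps)
```

In this sketch the per-fallacy traces play the role of the transparency the abstract mentions: every yes/no answer that led to the label, or ruled out a related fallacy, stays inspectable. A non-empty rival list would signal that the graph check found a competing label, which a real system might resolve with further questions.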