Can Hallucinations Be Useful? Solving Multi-Hop Questions With SLMs By Chaining System-I/II Reasoning

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the performance degradation of small language models (SLMs) in multi-hop reasoning, which is caused by early hallucinations and by the limitations of the conventional "think-then-answer" paradigm. Inspired by dual-process theories in cognitive science, the authors propose an "answer-then-reason" framework: System-I rapidly generates an initial answer, which is then treated as a hypothesis to guide the retrieval of relevant evidence; System-II then reasons over that evidence to refine and correct the hypothesis. The study reports that hallucinations, under specific conditions, can serve as useful anchors for approaching the correct answer. According to the authors, the framework outperforms prior think-first methods on several multi-hop question-answering benchmarks, improving the complex reasoning capabilities of SLMs.

📝 Abstract

Recently, there has been increased interest in Small Language Models (SLMs), which are fast, perform well, and have lower hardware demands than large language models (LLMs). However, SLMs hallucinate more frequently than LLMs, which hurts their ability to solve complex multi-step reasoning problems because early mistakes cascade to the final response. To address this, existing works adopt a think-first strategy followed by iterative retrieval to reduce hallucination. We argue that the think-first strategy is not always necessary, as we find that (i) SLMs are often accurately confident in their initial answer, and (ii) hallucinations can actually be beneficial for homing in on the true answer. As such, we position our work as an inversion of this strategy, i.e., answer first, reason later. We propose a cognitively inspired framework in which the model is first allowed to answer the question quickly (System-I, zero-shot) and then resorts to deeper thinking (System-II) based on evidence retrieved from a knowledge source using the initial hypothesis. By combining System-I and System-II style thinking, we show that our method can outperform prior work that takes the traditional think-first route on various multi-step question-answering benchmarks.
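
The control flow described in the abstract (answer first, retrieve with the initial hypothesis, then reason over the evidence) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy corpus, the word-overlap retriever, and the rule-based "System-II" step are all hypothetical stand-ins chosen so the example runs without a model. In practice, `fast_answer` and `deliberate` would presumably be calls to an SLM and `retrieve` a real retriever.

```python
import re
from typing import Callable, List


def answer_then_reason(
    question: str,
    fast_answer: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    deliberate: Callable[[str, str, List[str]], str],
) -> str:
    """Hypothetical answer-first, reason-later loop (single pass)."""
    hypothesis = fast_answer(question)           # System-I: quick guess, may be wrong
    evidence = retrieve(f"{question} {hypothesis}")  # guess steers retrieval
    return deliberate(question, hypothesis, evidence)  # System-II: refine or correct


# --- Toy stand-ins (assumptions, for illustration only) ---
CORPUS = [
    "Alice Example wrote the novel Blue Harbor.",
    "Carol Demo directed the film Red Valley.",
    "Blue Harbor was adapted into a film directed by Bob Sample.",
]
QUESTION = "Who directed the film adaptation of the novel by Alice Example?"


def tokens(text: str) -> set:
    """Lowercased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_fast_answer(question: str) -> str:
    # Pretend the model blurts out the novel's title instead of the director:
    # a wrong answer, but one that points at the right entity.
    return "Blue Harbor"


def toy_retrieve(query: str, k: int = 2) -> List[str]:
    # Rank passages by word overlap with the query; keep the top k.
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda p: len(q & tokens(p)), reverse=True)
    return ranked[:k]


def toy_deliberate(question: str, hypothesis: str, evidence: List[str]) -> str:
    # Rule-based placeholder for deeper reasoning: extract a director's name
    # from the evidence, otherwise fall back to the initial hypothesis.
    for passage in evidence:
        match = re.search(r"directed by ([A-Z]\w+ [A-Z]\w+)", passage)
        if match:
            return match.group(1)
    return hypothesis


final = answer_then_reason(QUESTION, toy_fast_answer, toy_retrieve, toy_deliberate)
```

In this toy setup, retrieving with the question alone ranks the unrelated "Red Valley" passage above the passage that names the director, whereas appending the incorrect initial guess pulls the relevant passage into the top two. That is one way to picture how a wrong first answer could still act as a useful anchor for retrieval.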

Problem

Research questions and friction points this paper is trying to address.

Small Language Models

Hallucinations

Multi-hop Reasoning

Question Answering

System-I/System-II Reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Small Language Models

Hallucination Utilization

System-I/System-II Reasoning