🤖 AI Summary
Large language models (LLMs) show limited performance on complex mathematical reasoning, largely because human implicit cognitive processes (intent, method selection, and key insights) are absent from their training data, leaving reasoning steps semantically incoherent.
Method: We propose *Thinking Before You Speak* (TBYS), a reasoning framework that proactively generates *insights* and injects them between consecutive reasoning steps, replacing static prompting schemes. TBYS includes an automated pipeline for collecting and filtering context-aware in-context examples used to generate these insights, significantly reducing reliance on manual annotation and fine-tuning.
Contribution: TBYS achieves substantial improvements on high-difficulty mathematical benchmarks, including MATH and AMC23, with an average accuracy gain of +12.7%. By modeling progressive thought through explicit, insight-augmented transitions, it makes stepwise reasoning *interpretable and intervenable*. This work offers a new paradigm for enhancing LLMs' complex deductive reasoning capabilities.
📝 Abstract
Large Language Models (LLMs) often exhibit deficiencies in complex reasoning tasks, such as math, which we attribute to the discrepancy between human reasoning patterns and those presented in the LLMs' training data. When dealing with complex problems, humans tend to think carefully before expressing solutions. However, they often do not articulate their inner thoughts, including their intentions and chosen methodologies. Consequently, critical insights essential for bridging reasoning steps may be absent from training data collected from human sources. To bridge this gap, we propose inserting *insights* between consecutive reasoning steps, which review the current status and initiate the next reasoning step. Unlike prior prompting strategies that rely on a single static prompt or a workflow of static prompts to facilitate reasoning, *insights* are *proactively* generated to guide the reasoning process. We implement our idea as a reasoning framework, named *Thinking Before You Speak* (TBYS), and design a pipeline for automatically collecting and filtering in-context examples for the generation of *insights*, which alleviates human labeling efforts and fine-tuning overheads. Experiments on challenging mathematical datasets verify the effectiveness of TBYS. Project website: https://gitee.com/jswrt/TBYS
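The insight-then-step loop described above can be sketched as follows. This is a minimal illustration under assumptions of ours: the function names, the stubbed model calls, and the fixed step budget are hypothetical and do not reflect the authors' implementation.

```python
# Hypothetical sketch of a TBYS-style loop: before each reasoning step,
# an insight reviews progress and initiates what to do next.

def generate_insight(problem, steps, examples):
    """Review the current status and plan the next move (stubbed model call).

    A real system would prompt an LLM here, conditioned on in-context
    examples gathered by the collection-and-filtering pipeline.
    """
    return f"Insight {len(steps) + 1}: given {len(steps)} step(s) so far, plan the next move."

def generate_step(problem, steps, insight):
    """Produce the next reasoning step, conditioned on the insight (stubbed)."""
    return f"Step {len(steps) + 1}: act on '{insight}'."

def solve(problem, examples=(), max_steps=3):
    """Interleave proactively generated insights with reasoning steps."""
    steps, trace = [], []
    for _ in range(max_steps):
        insight = generate_insight(problem, steps, examples)  # think first...
        step = generate_step(problem, steps, insight)         # ...then speak
        trace.append((insight, step))
        steps.append(step)
    return trace

trace = solve("What is 2 + 2?")
```

Each trace entry pairs an insight with the step it initiated, which is what makes the chain inspectable and open to intervention between steps.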