Do LLMs Really Think Step-by-step In Implicit Reasoning?

📅 2024-11-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether large language models (LLMs) genuinely perform stepwise reasoning under "implicit chain-of-thought" (CoT), i.e., without explicitly generating intermediate reasoning steps. Method: Using hidden-state probing and intermediate-representation analysis, the authors systematically compare implicit CoT under prompting versus fine-tuning across diverse input formats. Contribution/Results: The paper provides empirical evidence that prompting-based implicit CoT fails to activate identifiable intermediate reasoning pathways and instead relies predominantly on experience-driven pattern matching; in contrast, fine-tuned implicit CoT does encode genuine stepwise computation, making the model's reasoning structure far more interpretable. Furthermore, both paradigms are highly sensitive to input formatting, exposing a fundamental limitation of implicit CoT: its fragility and lack of robust internal reasoning scaffolding. These findings offer empirical grounding and theoretical insight for understanding implicit reasoning mechanisms and for designing more robust reasoning-alignment methods.

📝 Abstract
It is well known that Chain-of-Thought (CoT) can remarkably enhance LLMs' performance on complex tasks. However, because it also introduces slower inference and higher computational costs, many studies have attempted to use implicit CoT, which does not require LLMs to explicitly generate the intermediate steps. The invisible reasoning process, however, raises a doubt: can implicit CoT really be equivalent to explicit CoT? In this study, we address this question through experiments. We probe information about the intermediate steps from the model's hidden states when it is either trained or prompted to perform implicit CoT. The results surprisingly indicate that when prompted, LLMs hardly think about intermediate steps, suggesting they may rely on experience rather than strict step-by-step reasoning. When trained, however, they do calculate intermediate steps. Moreover, in both settings, we find that the effect of implicit CoT is susceptible to the format of the problem, reaffirming the current deficiency of implicit CoT.
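As a rough illustration of the probing idea described in the abstract, the sketch below fits a ridge-regression probe on synthetic "hidden states" for two-step problems of the form (a+b)+c. When the intermediate value a+b is linearly encoded in the states, the probe recovers it on held-out data; when the states carry no trace of it, the probe fails. This is a minimal toy sketch of linear probing in general: the data-generating setup, dimensions, and function names are illustrative assumptions, not the paper's actual models, datasets, or probe.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_hidden_states(n=500, d=64, encode_intermediate=True):
    # Toy stand-in for LLM hidden states on (a + b) + c problems
    # (purely synthetic; not real model activations).
    a, b, c = (rng.uniform(-1, 1, n) for _ in range(3))
    inter = a + b  # the intermediate step we try to probe for
    if encode_intermediate:
        # "Stepwise" regime: states are a noisy random linear mix of the
        # inputs AND the intermediate value.
        feats = np.stack([a, b, c, inter], axis=1)
        H = feats @ rng.normal(size=(4, d)) + 0.05 * rng.normal(size=(n, d))
    else:
        # "No-reasoning" regime: states carry no trace of the computation.
        H = rng.normal(size=(n, d))
    return H, inter

def probe_r2(H, y, n_train=400):
    # Closed-form ridge regression fit on a train split,
    # scored by R^2 on the held-out split.
    Htr, ytr = H[:n_train], y[:n_train]
    Hte, yte = H[n_train:], y[n_train:]
    lam = 1e-3
    w = np.linalg.solve(Htr.T @ Htr + lam * np.eye(H.shape[1]), Htr.T @ ytr)
    pred = Hte @ w
    return 1 - np.sum((yte - pred) ** 2) / np.sum((yte - yte.mean()) ** 2)

H_enc, y_enc = make_hidden_states(encode_intermediate=True)
H_none, y_none = make_hidden_states(encode_intermediate=False)
print(f"R^2, intermediate encoded:     {probe_r2(H_enc, y_enc):.2f}")
print(f"R^2, intermediate not encoded: {probe_r2(H_none, y_none):.2f}")
```

A high held-out R² in the first regime and a near-zero one in the second mirrors, in miniature, how the paper's probes distinguish genuine stepwise computation (fine-tuned implicit CoT) from states that merely pattern-match (prompted implicit CoT).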
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Implicit Chain of Thought
Problem Solving Effectiveness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Implicit Chain of Thought
Problem Formulation Influence