🤖 AI Summary
To address the high computational overhead and inefficiency caused by redundant reasoning steps in Chain-of-Thought (CoT) inference, this paper proposes a layer-wise perplexity-change-based method for identifying critical reasoning steps—introducing, for the first time, dynamic perplexity change as an interpretable, quantitative metric for step importance. The method enables fine-grained step pruning and jointly optimizes few-shot example refinement and critical-step-driven supervised fine-tuning. Evaluated on multiple complex reasoning benchmarks, it compresses CoT inference while maintaining or improving accuracy, achieving average speedups of 37%–52%. Key contributions include: (1) a perplexity-change-driven criterion for critical step identification; (2) a dual-path optimization framework integrating example selection and step-aware fine-tuning; and (3) a lightweight, annotation-free inference compression mechanism. This approach significantly improves the accuracy-efficiency trade-off in CoT reasoning.
📝 Abstract
Chain-of-Thought (CoT) reasoning, which breaks down complex tasks into intermediate reasoning steps, has significantly enhanced the performance of large language models (LLMs) on challenging tasks. However, the detailed reasoning process in CoT often incurs long generation times and high computational costs, partly due to the inclusion of unnecessary steps. To address this, we propose a method to identify critical reasoning steps using perplexity as a measure of their importance: a step is deemed critical if its removal causes a significant increase in perplexity. Our method enables models to focus solely on generating these critical steps. This can be achieved through two approaches: refining demonstration examples in few-shot CoT or fine-tuning the model on selected examples that include only critical steps. Comprehensive experiments validate the effectiveness of our method, which achieves a better balance between the accuracy and efficiency of CoT reasoning.
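The removal-based criterion can be sketched in a few lines. Below is a minimal, illustrative implementation assuming you already have per-token log-probabilities of the remaining reasoning/answer, scored once with the full CoT and once with the candidate step ablated; the helper names and the `threshold` cutoff are hypothetical, not the paper's exact formulation.

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a token sequence from its per-token log-probabilities:
    exp of the negative mean log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def is_critical(logprobs_with_step, logprobs_without_step, threshold=1.5):
    """Hypothetical sketch of the paper's criterion: a reasoning step is
    deemed critical if ablating it raises the perplexity of the rest of
    the generation by more than `threshold` (an illustrative cutoff)."""
    ppl_full = perplexity(logprobs_with_step)
    ppl_ablated = perplexity(logprobs_without_step)
    return ppl_ablated - ppl_full > threshold

# Toy usage: removing the step makes the model much less certain,
# so the step is flagged as critical.
print(is_critical([-0.1] * 5, [-1.5] * 5))
```

In practice the two log-probability lists would come from scoring the same continuation under an LLM, conditioned on the full CoT versus the CoT with one step deleted; steps that fail the test are pruned.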