A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics

๐Ÿ“… 2025-02-20
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This paper addresses the insufficient logical coherence across multi-step reasoning and the inadequate final-answer accuracy of large language models (LLMs) on mathematical problems. It presents the first systematic, feedback-driven study of multi-step reasoning tailored to mathematics. Methodologically, it introduces a taxonomy of step-level and result-level feedback mechanisms, encompassing both training-augmented approaches (e.g., process- or outcome-based reward modeling) and training-free strategies (e.g., frozen-model self-verification and external tool-assisted verification). Its contributions are threefold: (1) it characterizes the applicability boundaries and synergistic combinations of step- and result-level feedback; (2) it establishes a unified evaluation framework that quantitatively analyzes trade-offs among performance, computational cost, and scalability across strategies; and (3) it lays theoretical and practical foundations for building robust, efficient, and scalable mathematical reasoning systems.
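The step-level vs. result-level distinction above can be illustrated with a minimal sketch. The `score_step` and `score_result` functions below are toy placeholders standing in for trained process-reward and outcome-reward models (PRM/ORM); the keyword check, the `min` aggregation, and the example chain are illustrative assumptions, not the survey's concrete method.

```python
# Toy contrast between process (step-level) and outcome (result-level) feedback.
# score_step / score_result are hypothetical stubs, not real reward models.

def score_step(step: str) -> float:
    """Process-reward stub: penalize steps flagged as inconsistent."""
    return 0.2 if "contradiction" in step else 0.9

def score_result(answer: str, reference: str) -> float:
    """Outcome-reward stub: exact-match check on the final answer only."""
    return 1.0 if answer == reference else 0.0

def process_reward(steps: list[str]) -> float:
    """Step-level feedback: aggregate per-step scores (min is one common choice)."""
    return min(score_step(s) for s in steps)

def outcome_reward(steps: list[str], reference: str) -> float:
    """Result-level feedback: only the last step (the answer) is scored."""
    return score_result(steps[-1], reference)

chain = ["x + 2 = 5", "x = 3 (contradiction with step 1?)", "3"]
# The process reward flags the flawed intermediate step even though the
# final answer matches the reference, so it scores lower than the outcome reward.
```

This is the trade-off the taxonomy organizes: outcome rewards are cheap to annotate but blind to flawed intermediate steps, while process rewards localize errors at the cost of step-level labels.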

๐Ÿ“ Abstract
Recent progress on large language models (LLMs) has found that chain-of-thought prompting improves their reasoning ability by encouraging problem solving through multiple steps. Subsequent research therefore aimed to integrate the multi-step reasoning process into the LLM itself through process rewards as feedback, achieving improvements over prompting strategies. Due to the cost of step-level annotation, some work turns to outcome rewards as feedback instead. Aside from these training-based approaches, training-free techniques leverage frozen LLMs or external tools to provide feedback at each step and enhance the reasoning process. Given the abundance of work on mathematics, owing to its logical nature, we present a survey of strategies that utilize feedback at the step and outcome levels to enhance multi-step math reasoning in LLMs. As multi-step reasoning emerges as a crucial component in scaling LLMs, we hope to establish a foundation that eases understanding and empowers further research.
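The training-free, verifier-guided setting described in the abstract can be sketched as a best-of-N selection loop: a frozen model proposes several reasoning chains, each step is re-checked by a verifier (here a toy arithmetic checker standing in for a frozen-LLM or external-tool verifier), and the highest-scoring chain is returned. `generate_candidates`, the hard-coded chains, and the `eval`-based check are hypothetical placeholders, not a real LLM API.

```python
# Sketch of training-free, verifier-guided best-of-N selection.
# All functions are stubs for illustration; no model is actually called.

def generate_candidates(question: str) -> list[list[str]]:
    """Stand-in for sampling N reasoning chains from a frozen LLM."""
    return [
        ["2 * 3 = 6", "6 + 1 = 7", "7"],
        ["2 * 3 = 5", "5 + 1 = 6", "6"],  # arithmetic slip in the first step
    ]

def verify(step: str) -> float:
    """Stand-in for step-level verification, e.g. tool-assisted
    re-checking of the arithmetic in a single step."""
    lhs, rhs = step.split("=") if "=" in step else (step, step)
    return 1.0 if eval(lhs) == eval(rhs) else 0.0

def best_of_n(question: str) -> list[str]:
    """Return the candidate chain whose steps verify best."""
    candidates = generate_candidates(question)
    return max(candidates, key=lambda c: sum(verify(s) for s in c))
```

Because the verifier is frozen and external, no training data or reward annotation is needed; the cost shifts to inference-time sampling and verification, which is the performance/compute trade-off the survey's evaluation framework quantifies.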
Problem

Research questions and friction points this paper is trying to address.

Enhancing multi-step reasoning in LLMs
Feedback-based strategies for math reasoning
Survey of training and training-free techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chain-of-thought prompting strategies
Process and outcome rewards
Training-free techniques with frozen LLMs
๐Ÿ”Ž Similar Papers
No similar papers found.