A Survey of Scaling in Large Language Model Reasoning

📅 2025-04-02
🤖 AI Summary
Blind scaling of large language models (LLMs)—through increased model size or training data—does not necessarily improve reasoning performance and may degrade logical consistency, robustness, and alignment. Method: The authors propose the first five-dimensional scaling taxonomy for LLM reasoning capabilities, encompassing input context length, reasoning step depth, interaction turn count, training-driven reasoning, and task complexity. They conduct systematic cross-paradigm comparisons and multi-dimensional attribution analysis, integrating empirical evaluations under a unified assessment framework. Contribution/Results: Empirical findings reveal substantial returns from scaling input context and interaction turns, whereas increasing single-step reasoning depth consistently exacerbates hallucination. The study establishes principled, scale–task co-design guidelines and an evolutionary roadmap for building trustworthy AI reasoning systems, grounded in empirically validated trade-offs across scaling dimensions.

📝 Abstract
The rapid advancements in large language models (LLMs) have significantly enhanced their reasoning capabilities, driven by various strategies such as multi-agent collaboration. However, unlike the well-established performance improvements achieved through scaling data and model size, the scaling of reasoning in LLMs is more complex and can even negatively impact reasoning performance, introducing new challenges in model alignment and robustness. In this survey, we provide a comprehensive examination of scaling in LLM reasoning, categorizing it into multiple dimensions and analyzing how and to what extent different scaling strategies contribute to improving reasoning capabilities. We begin by exploring scaling in input size, which enables LLMs to process and utilize more extensive context for improved reasoning. Next, we analyze scaling in reasoning steps, which improves multi-step inference and logical consistency. We then examine scaling in reasoning rounds, where iterative interactions refine reasoning outcomes. Furthermore, we discuss scaling in training-enabled reasoning, focusing on optimization through iterative model improvement. Finally, we review applications of scaling across domains and outline future directions for further advancing LLM reasoning. By synthesizing these diverse perspectives, this survey aims to provide insights into how scaling strategies fundamentally enhance the reasoning capabilities of LLMs and to guide the development of next-generation AI systems.
Problem

Research questions and friction points this paper is trying to address.

Examining scaling impact on LLM reasoning performance
Analyzing multi-dimensional scaling strategies for reasoning
Exploring alignment challenges in scaled reasoning models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scaling input size for better context processing
Increasing reasoning steps for logical consistency
Iterative reasoning rounds to refine outcomes
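The third strategy, scaling reasoning rounds, can be sketched as a generic critique-and-revise loop. This is an illustrative sketch, not the survey's own method: `ask` is a hypothetical stand-in for any LLM call, and the prompt templates are assumptions.

```python
from typing import Callable

def iterative_refinement(ask: Callable[[str], str], question: str, rounds: int = 3) -> str:
    """Refine an answer over several interaction rounds (scaling reasoning rounds).

    `ask` is any text-in/text-out LLM interface; each extra round spends one
    critique call and one revision call to improve the previous draft.
    """
    # Round 1: produce an initial draft answer.
    answer = ask(f"Question: {question}\nAnswer step by step.")
    # Subsequent rounds: critique the draft, then revise it.
    for _ in range(rounds - 1):
        critique = ask(
            f"Question: {question}\nDraft answer: {answer}\n"
            "List any factual or logical flaws in the draft."
        )
        answer = ask(
            f"Question: {question}\nDraft: {answer}\nCritique: {critique}\n"
            "Write an improved answer that fixes the flaws."
        )
    return answer
```

Because the model is passed in as a callable, the same loop works with any backend; the trade-off the survey highlights is that each additional round costs two extra model calls.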
Authors

Zihan Chen — University of Virginia, Charlottesville, VA, USA
Song Wang — University of Virginia, Charlottesville, VA, USA
Zhen Tan — Ph.D. at Arizona State University (Data Mining, Machine Learning, AI for Science, User-centric Explanation, Responsible AI)
Xingbo Fu — University of Virginia (Graph Mining, Distributed Machine Learning)
Zhenyu Lei — University of Virginia, Charlottesville, VA, USA
Peng Wang — University of Virginia, Charlottesville, VA, USA
Huan Liu — Arizona State University, Tempe, AZ, USA
Cong Shen — University of Virginia, Charlottesville, VA, USA
Jundong Li — Associate Professor, University of Virginia (AI, Machine Learning, Data Mining, Graph Learning)