A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems

📅 2025-04-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the evolving reasoning capabilities of large language models (LLMs), aiming to rigorously delineate their fundamental competencies beyond conventional chatbot functionality. We propose the first orthogonal two-dimensional taxonomy: the horizontal axis distinguishes *when* reasoning occurs—*inference-time expansion* versus *training-time acquisition*; the vertical axis differentiates *system architecture*—*monolithic LLMs* versus *tool-augmented or multi-agent systems*—yielding a four-quadrant analytical framework. This taxonomy unifies diverse techniques—including prompt engineering, candidate sampling optimization, supervised fine-tuning (SFT), PPO/GRPO-based reinforcement learning, reasoner-verifier collaboration, and LLM-based debate—and precisely situates landmark works such as DeepSeek-R1, OpenAI Deep Research, and Manus Agent. Our analysis reveals two paradigm shifts: from *inference-time expansion* to *learning-to-reason*, and from single-model reasoning to *agentic workflows*. The framework provides a structured theoretical foundation and technical roadmap for advancing LLM reasoning research.
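To make the four-quadrant framework concrete, here is a minimal illustrative sketch (not from the paper) that maps the two axes to the example techniques the summary names. The placement of each technique in a quadrant is this sketch's own reading of the summary, not an authoritative reproduction of the survey's figures.

```python
# Illustrative lookup from (regime, architecture) to example
# techniques named in the summary. Quadrant assignments are an
# interpretation, not taken verbatim from the survey.
TAXONOMY = {
    ("inference-time", "monolithic"): ["prompt engineering", "candidate sampling optimization"],
    ("inference-time", "agentic"): ["generator-evaluator", "LLM-based debate"],
    ("training-time", "monolithic"): ["SFT", "PPO/GRPO reinforcement learning"],
    ("training-time", "agentic"): ["reasoner-verifier collaboration"],
}

def quadrant(regime: str, architecture: str) -> list[str]:
    """Return the example techniques filed under one quadrant."""
    return TAXONOMY[(regime, architecture)]
```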

📝 Abstract
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making. With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems from conventional models that empower chatbots. In this survey, we categorize existing methods along two orthogonal dimensions: (1) Regimes, which define the stage at which reasoning is achieved (either at inference time or through dedicated training); and (2) Architectures, which determine the components involved in the reasoning process, distinguishing between standalone LLMs and agentic compound systems that incorporate external tools and multi-agent collaborations. Within each dimension, we analyze two key perspectives: (1) Input level, which focuses on techniques that construct high-quality prompts that the LLM conditions on; and (2) Output level, which covers methods that refine multiple sampled candidates to enhance reasoning quality. This categorization provides a systematic understanding of the evolving landscape of LLM reasoning, highlighting emerging trends such as the shift from inference-scaling to learning-to-reason (e.g., DeepSeek-R1), and the transition to agentic workflows (e.g., OpenAI Deep Research, Manus Agent). Additionally, we cover a broad spectrum of learning algorithms, from supervised fine-tuning to reinforcement learning such as PPO and GRPO, and the training of reasoners and verifiers. We also examine key designs of agentic workflows, from established patterns like generator-evaluator and LLM debate to recent innovations. ...
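The "Output level" techniques the abstract mentions can be illustrated with a best-of-N sketch: sample several candidate answers, score each with a verifier, and keep the highest-scoring one. `generate` and `score` below are hypothetical stand-ins for an LLM sampler and a verifier/reward model, so this is a shape-of-the-technique sketch rather than any specific method from the survey.

```python
# Best-of-N candidate refinement: sample N answers, keep the one
# the verifier scores highest. Both helpers are hypothetical
# placeholders for real model calls.
import random

def generate(prompt: str, seed: int) -> str:
    """Stand-in LLM sampler: returns one candidate answer."""
    rng = random.Random(seed)
    return f"candidate-{rng.randint(0, 999)}"

def score(prompt: str, candidate: str) -> float:
    """Stand-in verifier: higher means more likely correct."""
    return (sum(map(ord, candidate)) % 100) / 100.0

def best_of_n(prompt: str, n: int = 8) -> str:
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))
```

In practice the verifier may be a trained reward model or a rule-based checker; the selection rule (argmax, majority vote, weighted vote) is a design choice the survey's output-level category spans.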
Problem

Research questions and friction points this paper is trying to address.

Surveying LLM reasoning methods in inference and training regimes
Analyzing standalone LLMs vs. agentic systems with external tools
Exploring learning algorithms and agentic workflow designs for reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Categorizes methods by inference and training regimes
Analyzes input and output level techniques
Explores agentic workflows and learning algorithms
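The generator-evaluator pattern listed among the agentic workflow designs can be sketched as a simple revise-until-accepted loop. `draft` and `critique` are hypothetical stand-ins for two LLM roles; the toy acceptance rule (upper-case output) only exists to make the loop runnable.

```python
# Generator-evaluator loop: the generator drafts an answer, the
# evaluator critiques it, and the loop repeats until acceptance
# or a round limit. Both roles are hypothetical placeholders.

def draft(task: str, feedback: str) -> str:
    """Stand-in generator: revise the answer using feedback."""
    return task.upper() if feedback else task

def critique(answer: str) -> tuple[bool, str]:
    """Stand-in evaluator: toy rule accepts upper-case answers."""
    ok = answer.isupper()
    return ok, "" if ok else "please rewrite in upper case"

def generator_evaluator(task: str, max_rounds: int = 3) -> str:
    feedback = ""
    answer = task
    for _ in range(max_rounds):
        answer = draft(task, feedback)
        ok, feedback = critique(answer)
        if ok:
            break
    return answer
```

LLM debate follows the same loop shape but with two generators critiquing each other instead of a single evaluator gatekeeping one generator.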