Reasoning with Large Language Models, a Survey

📅 2024-07-16
🏛️ arXiv.org
📈 Citations: 95
Influential: 2
🤖 AI Summary
Large language models (LLMs) exhibit limited multi-step reasoning capabilities, for example in elementary mathematics, logical deduction, combinatorial games, and robotic planning, when deployed without fine-tuning. Method: We systematically survey prompt-driven reasoning mechanisms, introducing the first structured taxonomy for LLM reasoning; empirically demonstrate that prompts can elicit metacognitive behaviors such as self-reflection and self-correction; formally define "reasoning by LLMs" as a distinct challenge beyond pattern matching; and integrate chain-of-thought prompting, self-consistency decoding, reasoning-path evaluation, and reinforcement-learning-inspired controllable reasoning frameworks. Contribution/Results: The work clarifies the fundamental boundaries of LLM reasoning, identifies key open challenges, establishes a unified research paradigm, and proposes a verifiable, controllable, and systematic research agenda for advancing reasoning in foundation models.

📝 Abstract
Scaling up language models to billions of parameters has opened up possibilities for in-context learning, allowing instruction tuning and few-shot learning on tasks that the model was not specifically trained for. This has achieved breakthrough performance on language tasks such as translation, summarization, and question answering. Furthermore, in addition to these associative "System 1" tasks, recent advances in chain-of-thought prompt learning have demonstrated strong "System 2" reasoning abilities, addressing a central question in artificial general intelligence: whether LLMs can reason. The field started with the question of whether LLMs can solve grade-school math word problems. This paper reviews the rapidly expanding field of prompt-based reasoning with LLMs. Our taxonomy identifies different ways to generate, evaluate, and control multi-step reasoning. We provide in-depth coverage of core approaches and open problems, and we propose a research agenda for the near future. Finally, we highlight the relation between reasoning and prompt-based learning, and we discuss the relation between reasoning, sequential decision processes, and reinforcement learning. We find that self-improvement, self-reflection, and some metacognitive abilities of the reasoning processes are possible through the judicious use of prompts. True self-improvement and self-reasoning, going from reasoning with LLMs to reasoning by LLMs, remains future work.
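The self-consistency decoding the abstract mentions can be sketched in a few lines: sample several chain-of-thought reasoning paths for the same question and keep the answer the most paths agree on. The sketch below is an illustration only; `sample_reasoning_path` and its canned paths are hypothetical stand-ins for a real LLM sampler, which would draw paths stochastically at temperature > 0.

```python
from collections import Counter
from itertools import cycle

# Hypothetical stand-in for an LLM sampler: each call yields one
# chain-of-thought path ending in a final answer. We cycle through
# canned paths so the example is deterministic.
_MOCK_PATHS = cycle([
    ("3 boxes * 4 apples/box = 12 apples.", 12),
    ("4 + 4 + 4 = 12.", 12),
    ("3 + 4 = 7.", 7),  # one faulty path, as sampling sometimes produces
])

def sample_reasoning_path(question):
    return next(_MOCK_PATHS)

def self_consistency(question, n_samples=9):
    """Self-consistency decoding: sample several reasoning paths and
    return the answer that the most paths agree on (majority vote)."""
    answers = [sample_reasoning_path(question)[1] for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("If I have 3 boxes of 4 apples, how many apples in total?"))
# → 12 (the faulty path is outvoted 6 to 3)
```

The design intuition, per the survey's framing, is that many independent reasoning paths are unlikely to agree on the same wrong answer, so majority voting filters out faulty chains.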
Problem

Research questions and friction points this paper is trying to address.

Enhancing multi-step reasoning in large language models
Evaluating reasoning performance on diverse benchmarks
Developing methods to control and optimize reasoning processes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chain-of-thought enables multi-step reasoning
Reinforcement learning fine-tunes reasoning methods
External tools execute generated code solutions
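The last innovation listed, having external tools execute generated code, is often called program-aided reasoning: the model writes a program instead of an answer, and a runtime computes the result. A minimal sketch under assumed names follows; `generated_code` is a hypothetical model output, and a real pipeline would run it in a proper sandbox rather than via in-process `exec()`.

```python
# Hypothetical model output: instead of stating an answer, the model
# emits a program that computes it.
generated_code = (
    "def solve():\n"
    "    boxes, apples_per_box = 3, 4\n"
    "    return boxes * apples_per_box\n"
)

def run_generated_solution(code):
    """Execute model-generated code and return its solve() result."""
    namespace = {}
    exec(code, namespace)  # caution: illustration only, not a safe sandbox
    return namespace["solve"]()

print(run_generated_solution(generated_code))  # → 12
```

Offloading arithmetic to an interpreter sidesteps the model's unreliable in-context calculation, which is why the survey groups tool execution with the other control mechanisms for multi-step reasoning.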