Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

📅 2025-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This survey addresses the fundamental trade-off between reasoning performance and computational cost in large language models (LLMs). To this end, it formalizes the cost–benefit relationship between reasoning overhead and task utility under the paradigm of "reasoning economy." Methodologically, it draws on dual-process theory from cognitive science (System 1 / System 2) to build a framework spanning three dimensions: causes of reasoning inefficiency, analysis of reasoning behaviors, and optimization pathways. Techniques such as reasoning-trajectory monitoring, adaptive token skipping, and compression enable efficiency gains in both post-training optimization and test-time inference. The survey further provides a public, continually updated repository curating efficient-reasoning methods, clarifying core challenges, and outlining evaluation benchmarks, offering a practical roadmap toward low-cost, high-quality LLM reasoning.
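The "adaptive token skipping" idea mentioned above can be illustrated with a minimal sketch: cap the reasoning-token budget and stop generating chain-of-thought early once answer confidence is high enough. The `generate_step` callback, `BudgetConfig` fields, and thresholds here are all illustrative assumptions, not an interface from the paper.

```python
# Hypothetical sketch of budget-aware reasoning (not the paper's actual API):
# spend reasoning tokens only while they are still buying confidence.
from dataclasses import dataclass


@dataclass
class BudgetConfig:
    max_reasoning_tokens: int = 512   # hard cap on "thinking" tokens (assumed)
    confidence_threshold: float = 0.9  # stop early once this confident (assumed)


def reason_with_budget(generate_step, config: BudgetConfig):
    """generate_step() -> (token, answer_confidence); a stand-in for one
    decoding step of a reasoning model, monitored after each token."""
    trace = []
    for _ in range(config.max_reasoning_tokens):
        token, confidence = generate_step()
        trace.append(token)
        if confidence >= config.confidence_threshold:
            break  # confident enough: skip the remaining reasoning tokens
    return trace
```

In practice the confidence signal might come from the probability the model assigns to a candidate final answer; the sketch only shows the control flow of trading tokens (budget) against confidence (benefit).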

📝 Abstract
Recent advancements in Large Language Models (LLMs) have significantly enhanced their ability to perform complex reasoning tasks, transitioning from fast and intuitive thinking (System 1) to slow and deep reasoning (System 2). While System 2 reasoning improves task accuracy, it often incurs substantial computational costs due to its slow thinking nature and inefficient or unnecessary reasoning behaviors. In contrast, System 1 reasoning is computationally efficient but leads to suboptimal performance. Consequently, it is critical to balance the trade-off between performance (benefits) and computational costs (budgets), giving rise to the concept of reasoning economy. In this survey, we provide a comprehensive analysis of reasoning economy in both the post-training and test-time inference stages of LLMs, encompassing i) the cause of reasoning inefficiency, ii) behavior analysis of different reasoning patterns, and iii) potential solutions to achieve reasoning economy. By offering actionable insights and highlighting open challenges, we aim to shed light on strategies for improving the reasoning economy of LLMs, thereby serving as a valuable resource for advancing research in this evolving area. We also provide a public repository to continually track developments in this fast-evolving field.
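The System 1 / System 2 trade-off described in the abstract is, at its core, a routing decision: answer easy queries with a cheap fast path and reserve slow, deliberate reasoning for hard ones. The sketch below is a hypothetical illustration of that idea; the difficulty estimator, models, and threshold are all assumed, not taken from the survey.

```python
# Hypothetical System 1 / System 2 router: cheap model for easy queries,
# expensive deliberate reasoning only when estimated difficulty warrants it.
from typing import Callable


def route(query: str,
          difficulty_estimator: Callable[[str], float],
          fast_model: Callable[[str], str],   # System 1: fast, intuitive
          slow_model: Callable[[str], str],   # System 2: slow, deliberate
          threshold: float = 0.5) -> str:
    difficulty = difficulty_estimator(query)  # e.g. a small classifier's score
    if difficulty < threshold:
        return fast_model(query)   # save compute on easy inputs
    return slow_model(query)       # pay for deep reasoning only when needed
```

The interesting part of such systems is the estimator: it must be far cheaper than System 2 reasoning itself, or the routing overhead erases the savings.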
Problem

Research questions and friction points this paper is trying to address.

Balancing performance and computational costs in LLMs
Analyzing inefficiency causes in System 2 reasoning
Exploring solutions for efficient reasoning economy
Innovation

Methods, ideas, or system contributions that make the work stand out.

A unified "reasoning economy" framing of the performance–cost trade-off
A systematic analysis of inefficiency causes and reasoning-behavior patterns across post-training and test-time stages
A survey of solutions for reasoning economy, with a continually updated public repository tracking the field