🤖 AI Summary
This study investigates how “reasoning budget”—i.e., the number of reasoning steps—affects large language models’ (LLMs) inference performance, challenging the implicit assumption that more budget invariably improves performance. Method: We systematically evaluate diverse configurations of chain-of-thought (CoT), self-consistency (SC), and self-reflection (SR) on standardized reasoning benchmarks. Contribution/Results: We find diminishing marginal returns from extending CoT length, accompanied by substantial computational overhead. Crucially, we propose a “strategy synergy over budget stacking” paradigm: integrating SC and SR under low reasoning budgets consistently outperforms high-budget baselines across multiple tasks—achieving Pareto improvements in both accuracy and computational efficiency. This yields a reproducible, low-cost pathway for efficient LLM inference.
📝 Abstract
Recently, a new wave of thinking-capable Large Language Models has emerged, demonstrating exceptional capabilities across a wide range of reasoning benchmarks. Early studies have begun to explore how the amount of compute, measured as the length of the reasoning process (the so-called thinking budget), impacts model performance. In this work, we propose a systematic investigation of the thinking budget as a key parameter, examining its interaction with various inference configurations such as self-consistency and self-reflection. Our goal is to provide an informative, balanced comparison framework that considers both performance outcomes and computational cost. Among our findings, we show that simply increasing the thinking budget is not the most effective use of compute: more accurate responses can instead be achieved through alternative configurations such as self-consistency and self-reflection.
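The self-consistency strategy the abstract refers to amounts to sampling several independent reasoning chains and majority-voting over their final answers. A minimal sketch of that voting loop, where the hypothetical `sample_answer` stub stands in for an actual LLM call returning one chain's final answer:

```python
from collections import Counter
import random


def sample_answer(question: str, rng: random.Random) -> str:
    # Stub for an LLM call: in practice this would sample one
    # chain-of-thought and extract its final answer. Here a biased
    # coin stands in for a model that is right ~70% of the time.
    return "42" if rng.random() < 0.7 else "41"


def self_consistency(question: str, n_samples: int = 11, seed: int = 0) -> str:
    # Sample n_samples independent reasoning chains and return the
    # most frequent final answer (majority vote).
    rng = random.Random(seed)
    answers = [sample_answer(question, rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


print(self_consistency("What is 6 * 7?"))  # → 42
```

Note that `n_samples` is one axis of the compute budget the paper studies: each extra vote costs a full reasoning chain, which is why the accuracy-versus-cost trade-off matters.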