🤖 AI Summary
This work addresses the challenge that large language models struggle to anticipate the computational demands of tasks under strict global token budgets, leading to inefficient allocation of inference resources. To tackle this, the authors propose ROI-Reasoning, a novel framework that formulates budget-constrained reasoning as an ordered stochastic multiple-choice knapsack problem. By integrating an intrinsic metacognitive mechanism—trained via metacognitive fine-tuning and rationality-aware reinforcement learning—the model learns to estimate task difficulty and expected utility prior to inference, enabling strategic computation allocation. Empirical evaluations demonstrate that this approach significantly enhances overall performance across multiple mathematical reasoning benchmarks and substantially reduces decision regret under tight computational budgets.
📝 Abstract
Large language models (LLMs) can achieve strong reasoning performance with sufficient computation, but they do not inherently know how much computation a task requires. We study budgeted inference-time reasoning for multiple tasks under a strict global token constraint and formalize it as an Ordered Stochastic Multiple-Choice Knapsack Problem (OS-MCKP). This perspective highlights a meta-cognitive requirement -- anticipating task difficulty, estimating return on investment (ROI), and allocating computation strategically. We propose ROI-Reasoning, a two-stage framework that endows LLMs with intrinsic, budget-aware rationality. In the first stage, Meta-Cognitive Fine-Tuning teaches models to predict reasoning cost and expected utility before generation, enabling explicit solve-or-skip decisions. Next, Rationality-Aware Reinforcement Learning optimizes sequential decision making under a hard token budget, allowing models to learn long-horizon allocation strategies. Across budgeted mathematical reasoning benchmarks, ROI-Reasoning consistently improves overall score while substantially reducing regret under tight computation budgets.
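To make the solve-or-skip mechanism concrete, here is a minimal sketch of greedy ROI-based allocation over an ordered task stream under a hard token budget. The predicted costs, utilities, and the fixed ROI threshold are illustrative assumptions for exposition; the paper's actual estimates are produced by the fine-tuned model, not hand-coded.

```python
def allocate(tasks, budget, roi_threshold=1.0):
    """Greedy solve-or-skip over tasks arriving in a fixed order.

    tasks: list of (predicted_cost_tokens, predicted_utility) pairs,
           standing in for the model's meta-cognitive estimates.
    budget: hard global token budget shared across all tasks.
    Returns the per-task decisions and the unspent budget.
    """
    remaining = budget
    decisions = []
    for cost, utility in tasks:
        # ROI = expected utility per token of reasoning spent.
        roi = utility / cost if cost > 0 else float("inf")
        # Solve only if the task fits the remaining budget and clears
        # the ROI threshold; otherwise skip and save tokens.
        if cost <= remaining and roi >= roi_threshold:
            decisions.append("solve")
            remaining -= cost
        else:
            decisions.append("skip")
    return decisions, remaining


# Example: a mid-stream low-ROI task is skipped to preserve budget
# for a later high-ROI task.
decisions, left = allocate(
    [(100, 150), (500, 100), (200, 300)], budget=400
)
```

Note that this greedy rule is myopic; the paper's Rationality-Aware Reinforcement Learning stage exists precisely because long-horizon allocation under an ordered stochastic stream can beat such per-task thresholding.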