When Should a Robot Think? Resource-Aware Reasoning via Reinforcement Learning for Embodied Robotic Decision-Making

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of balancing necessity and efficiency in large language model (LLM) reasoning for embodied agents, where high computational overhead and latency often hinder real-time performance. To this end, the authors propose RARRL, a novel framework that introduces resource-aware adaptive reasoning control into embodied intelligence for the first time. Leveraging hierarchical reinforcement learning, RARRL dynamically determines at the decision level whether to reason, which reasoning role to adopt, and how much computational resource to allocate, thereby enabling efficient coordination between reasoning and action. The approach integrates environmental observations, execution history, and resource states to learn high-level policies, significantly improving task success rates on the ALFRED benchmark while reducing latency and enhancing system robustness.

📝 Abstract
Embodied robotic systems increasingly rely on large language model (LLM)-based agents to support high-level reasoning, planning, and decision-making during interactions with the environment. However, invoking LLM reasoning introduces substantial computational latency and resource overhead, which can interrupt action execution and reduce system reliability. Excessive reasoning may delay actions, while insufficient reasoning often leads to incorrect decisions and task failures. This raises a fundamental question for embodied agents: when should the agent reason, and when should it act? In this work, we propose RARRL (Resource-Aware Reasoning via Reinforcement Learning), a hierarchical framework for resource-aware orchestration of embodied agents. Rather than learning low-level control policies, RARRL learns a high-level orchestration policy that operates at the agent's decision-making layer. This policy enables the agent to adaptively determine whether to invoke reasoning, which reasoning role to employ, and how much computational budget to allocate based on current observations, execution history, and remaining resources. Extensive experiments, including evaluations with empirical latency profiles derived from the ALFRED benchmark, show that RARRL consistently improves task success rates while reducing execution latency and enhancing robustness compared with fixed or heuristic reasoning strategies. These results demonstrate that adaptive reasoning control is essential for building reliable and efficient embodied robotic agents.
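The orchestration idea in the abstract, choosing whether to reason, which role to use, and how much budget to spend, can be pictured as a small discrete action space with a resource-masked policy. The sketch below is a minimal illustration, not the paper's implementation: the role names, budget levels, and epsilon-greedy selection are all assumptions for exposition.

```python
import random
from dataclasses import dataclass
from typing import Optional

# Hypothetical role set and token-budget levels; the paper does not
# specify these, so they stand in for RARRL's actual action space.
REASONING_ROLES = ["planner", "verifier"]
BUDGET_LEVELS = [0, 256, 1024]  # 0 = act immediately without reasoning

@dataclass(frozen=True)
class OrchestrationAction:
    reason: bool
    role: Optional[str]
    budget: int

def action_space():
    """Enumerate the high-level actions: skip reasoning, or
    pick a (reasoning role, compute budget) pair."""
    actions = [OrchestrationAction(False, None, 0)]
    for role in REASONING_ROLES:
        for budget in BUDGET_LEVELS[1:]:
            actions.append(OrchestrationAction(True, role, budget))
    return actions

def select_action(q_values, remaining_budget, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over actions the agent can still afford;
    the mask is what makes the policy resource-aware."""
    feasible = [a for a in action_space() if a.budget <= remaining_budget]
    if rng.random() < epsilon:
        return rng.choice(feasible)
    return max(feasible, key=lambda a: q_values.get(a, 0.0))
```

When the remaining budget is exhausted, the mask leaves only the act-without-reasoning option, so the agent degrades gracefully to direct action instead of stalling on an unaffordable LLM call.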
Problem

Research questions and friction points this paper is trying to address.

embodied robotics
reasoning
resource-aware
decision-making
latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

resource-aware reasoning
reinforcement learning
embodied agents
adaptive orchestration
large language models