AI Summary
Existing embodied agents mostly respond passively to instructions or immediate needs, and lack a high-level value mechanism that could sustain autonomous behavior and resolve motivational conflicts. To bridge this gap, the authors propose ValuePlanner, a hierarchical cognitive architecture in which a large language model reasons about abstract value trade-offs and generates symbolic subgoals, which a classical PDDL planner then translates into executable action plans. A closed-loop feedback mechanism refines this process, enabling self-driven, coherent long-horizon behavior. The study offers a structured approach to linking intrinsic values to embodied actions and introduces an evaluation protocol centered on cumulative value gain, preference alignment, and behavioral diversity. In experiments in the TongSim household environment, the system arbitrates competing values and produces coherent, self-directed behavior that instruction-following and needs-driven baselines do not exhibit.
Abstract
Current embodied agents are often limited to passive instruction-following or reactive need-satisfaction, lacking a stable, high-order value framework essential for long-term, self-directed behavior and resolving motivational conflicts. We introduce \textit{ValuePlanner}, a hierarchical cognitive architecture that decouples high-level value scheduling from low-level action execution. \textit{ValuePlanner} employs an LLM-based cognitive module to generate symbolic subgoals by reasoning through abstract value trade-offs, which are then translated into executable action plans by a classical PDDL planner. This process is refined via a closed-loop feedback mechanism. Evaluating such autonomy requires methods beyond task-success rates, and we therefore propose a value-centric evaluation suite measuring cumulative value gain, preference alignment, and behavioral diversity. Experiments in the TongSim household environment demonstrate that \textit{ValuePlanner} arbitrates competing values to generate coherent, long-horizon, self-directed behavior absent from instruction-following and needs-driven baselines. Our work offers a structured approach to bridging intrinsic values and grounded behavior for autonomous agents.
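The two-level loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the value names, subgoals, action strings, decay rule, and every function name below are hypothetical. A greedy weighted-gain rule stands in for the LLM-based cognitive module, a lookup table stands in for the PDDL planner, and Shannon entropy over chosen subgoals is only one possible reading of "behavioral diversity".

```python
import math
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical subgoals and the satisfaction each adds to a value dimension.
SUBGOALS: Dict[str, Dict[str, float]] = {
    "eat_meal": {"health": 0.5},
    "chat_with_friend": {"social": 0.4},
    "read_book": {"curiosity": 0.3, "health": 0.05},
}

# Hypothetical fixed plans; a real system would call a PDDL planner here.
PLANS: Dict[str, List[str]] = {
    "eat_meal": ["walk_to kitchen", "grab food", "eat food"],
    "chat_with_friend": ["walk_to living_room", "talk_to friend"],
    "read_book": ["walk_to shelf", "grab book", "read book"],
}


@dataclass
class AgentState:
    satisfaction: Dict[str, float]  # per-value satisfaction in [0, 1]
    history: List[str] = field(default_factory=list)


def choose_subgoal(state: AgentState, weights: Dict[str, float]) -> str:
    """Stand-in for the LLM value scheduler: greedy weighted expected gain."""
    def score(name: str) -> float:
        return sum(
            weights[v] * min(gain, 1.0 - state.satisfaction[v])
            for v, gain in SUBGOALS[name].items()
        )
    # Sorting makes tie-breaking deterministic.
    return max(sorted(SUBGOALS), key=score)


def plan(subgoal: str) -> List[str]:
    """Stand-in for the classical planner: symbolic subgoal -> action list."""
    return PLANS[subgoal]


def run(state: AgentState, weights: Dict[str, float],
        steps: int, decay: float = 0.1) -> float:
    """Run the closed loop and return the cumulative weighted value gain."""
    total_gain = 0.0
    for _ in range(steps):
        goal = choose_subgoal(state, weights)
        _actions = plan(goal)  # a simulator would execute these actions
        for v, gain in SUBGOALS[goal].items():
            before = state.satisfaction[v]
            state.satisfaction[v] = min(1.0, before + gain)
            total_gain += weights[v] * (state.satisfaction[v] - before)
        state.history.append(goal)
        # Feedback: satisfaction decays, so the next choice sees a new state.
        for v in state.satisfaction:
            state.satisfaction[v] = max(0.0, state.satisfaction[v] - decay)
    return total_gain


def diversity(history: List[str]) -> float:
    """Shannon entropy (in nats) of the subgoal distribution."""
    n = len(history)
    return -sum((c / n) * math.log(c / n) for c in Counter(history).values())


if __name__ == "__main__":
    weights = {"health": 0.5, "social": 0.3, "curiosity": 0.2}
    state = AgentState({"health": 0.5, "social": 0.5, "curiosity": 0.5})
    print(run(state, weights, steps=6), state.history, diversity(state.history))
```

In this sketch the decay step is what makes the loop closed: once a value is saturated, its marginal gain shrinks and the scheduler turns to a competing value, so the agent does not repeat one subgoal indefinitely. Changing the hypothetical `weights` changes which values win those conflicts, which is the kind of behavior a preference-alignment measure would need to capture.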