AI Summary
Natural language instructions for kitchen-assistant robots are often abstract, ambiguous, or non-executable, leading to planning failures. Method: We propose the first test-driven BT-LLM co-design framework that tightly integrates the linguistic reasoning of large language models (LLMs) with the interpretability, formal verifiability, and interruptibility of behavior trees (BTs). Guided by test-driven development (TDD), the LLM generates structured, verifiable subtask sequences, enabling a closed-loop pipeline of instruction understanding, action generation, and execution feedback. Contribution/Results: In a 45-participant user study, our approach significantly reduces execution error rates and improves user trust and preference. The implementation is open-sourced; empirical evaluation on real robotic hardware demonstrates strong effectiveness and robustness in practical kitchen scenarios.
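The closed loop of generation, verification, and feedback described above can be illustrated with a small sketch. It is not taken from the BT-ACTION code base: the action vocabulary `KNOWN_ACTIONS`, the functions `check_plan` and `plan_with_feedback`, and the stub generator `stub_llm` are all hypothetical names chosen for illustration, and the stub stands in for a real LLM call.

```python
from typing import Callable, List

# Hypothetical action vocabulary; the source does not list the robot's real actions.
KNOWN_ACTIONS = {"fetch", "slice", "toast", "spread", "serve"}

Generator = Callable[[str, List[str]], List[str]]


def check_plan(plan: List[str]) -> List[str]:
    """Return human-readable failures; an empty list means the plan passes."""
    failures = []
    if not plan:
        failures.append("plan is empty")
    for i, step in enumerate(plan):
        parts = step.split()
        action = parts[0] if parts else ""
        if action not in KNOWN_ACTIONS:
            failures.append(f"step {i}: unknown action '{action}'")
    return failures


def plan_with_feedback(instruction: str, generate: Generator,
                       max_attempts: int = 3) -> List[str]:
    """Ask the generator for a plan, re-asking with the test failures as feedback."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        plan = generate(instruction, feedback)
        feedback = check_plan(plan)
        if not feedback:
            return plan
    raise ValueError(f"no valid plan after {max_attempts} attempts: {feedback}")


def stub_llm(instruction: str, feedback: List[str]) -> List[str]:
    """Stand-in for an LLM: the first answer is invalid, the retry is valid."""
    if not feedback:
        return ["fetch bread", "juggle avocado", "serve toast"]
    return ["fetch bread", "toast bread", "spread avocado", "serve toast"]


plan = plan_with_feedback("prepare avocado toast", stub_llm)
print(plan)
```

In this sketch the checks play the role of tests written before generation: a plan is only accepted once every step names an action the robot is assumed to support.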
Abstract
Natural language instructions are often abstract and complex, requiring robots to execute multiple subtasks even for seemingly simple queries. For example, when a user asks a robot to prepare avocado toast, the task involves several sequential steps. Moreover, such instructions can be ambiguous, infeasible for the robot, or beyond the robot's existing knowledge. While Large Language Models (LLMs) offer strong language reasoning capabilities to handle these challenges, integrating them effectively into robotic systems remains difficult. To address this, we propose BT-ACTION, a test-driven approach that combines the modular structure of Behavior Trees (BTs) with LLMs to generate coherent sequences of robot actions for following complex user instructions, specifically in the context of preparing recipes in a kitchen-assistance setting. We evaluated BT-ACTION in a comprehensive user study with 45 participants, comparing its performance to direct LLM prompting. Results demonstrate that the modular design of BT-ACTION helped the robot make fewer mistakes and increased user trust, and participants showed a significant preference for the robot leveraging BT-ACTION. The code is publicly available at https://github.com/1Eggbert7/BT_LLM.
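To make the "modular structure of Behavior Trees" concrete, here is a minimal, generic behavior-tree sketch. It illustrates the standard Sequence and Fallback node semantics only and is not the BT-ACTION implementation. The node names (`check_recipe_known`, `execute_recipe`, `ask_user_to_clarify`) and the `step` helper are assumptions made for this example.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class Action:
    """Leaf node: runs a callable and maps its boolean result to a Status."""

    def __init__(self, name: str, run: Callable[[], bool]):
        self.name = name
        self.run = run

    def tick(self) -> Status:
        return Status.SUCCESS if self.run() else Status.FAILURE


class Sequence:
    """Succeeds only if every child succeeds, ticking children in order."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Fallback:
    """Succeeds as soon as one child succeeds, trying children in order."""

    def __init__(self, children: List):
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            if child.tick() is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


log: List[str] = []


def step(name: str, ok: bool = True) -> Action:
    """Hypothetical helper: an action that records its name and returns `ok`."""
    def run() -> bool:
        log.append(name)
        return ok
    return Action(name, run)


# If the recipe is not known, skip execution and fall back to asking the user.
tree = Fallback([
    Sequence([step("check_recipe_known", ok=False), step("execute_recipe")]),
    step("ask_user_to_clarify"),
])
result = tree.tick()
print(result, log)
```

Because each branch is a self-contained subtree, a failing check stops that branch early and hands control to the next one, which is one way a tree can handle ambiguous or infeasible requests without executing a faulty plan.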