🤖 AI Summary
Task and Motion Planning (TAMP) must reason jointly over high-level symbolic plans and low-level continuous control. Existing approaches either chain oversimplified skill primitives or directly regress joint angles, and so fail to balance abstraction with physical feasibility. Method: We propose a two-level interface that combines program search with meta-optimization. Executable programs bridge the semantic and numerical domains: a large language model synthesizes symbolic programs that define trajectory optimization problems, and a zeroth-order optimizer calibrates their numerical parameters. Results: On challenging object manipulation and drawing tasks, the method improves over prior TAMP approaches.
📝 Abstract
Intelligent interaction with the real world requires robotic agents to jointly reason over high-level plans and low-level controls. Task and motion planning (TAMP) addresses this by combining symbolic planning and continuous trajectory generation. Recently, foundation model approaches to TAMP have presented impressive results, including fast planning times and the execution of natural language instructions. Yet the optimal interface between high-level planning and low-level motion generation remains an open question: prior approaches are limited either by too much abstraction (e.g., chaining simplified skill primitives) or by too little (e.g., direct joint angle prediction). We address these issues with a form of meta-optimization that (i) uses program search over trajectory optimization problems as the interface between a foundation model and robot control, and (ii) leverages a zeroth-order method to optimize the numerical parameters in the foundation model output. Results on challenging object manipulation and drawing tasks confirm that our proposed method improves over prior TAMP approaches.
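The two-level idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D trajectory, the quadratic cost, the two candidate programs (`reach_only`, `reach_via`), the task score, and the random-search hill climber are all assumptions chosen for illustration. The candidate programs stand in for foundation-model proposals, each program defines a small trajectory optimization problem, and a gradient-free outer loop tunes the program's numerical parameters against a black-box task score.

```python
import random

T = 10    # number of free waypoints; x[0] is the fixed start (assumed setup)
W = 10.0  # weight of each soft waypoint term (assumed value)


def solve_trajectory(terms, start=0.0):
    """Exactly minimise sum_t (x[t+1]-x[t])^2 + sum_k w_k*(x[k]-r_k)^2.

    `terms` maps a waypoint index k in 1..T to a (weight, target) pair.
    Setting the gradient to zero gives a tridiagonal linear system, which
    is solved here with the Thomas algorithm (off-diagonals are all -1).
    """
    diag = [2.0] * T
    diag[-1] = 1.0
    rhs = [0.0] * T
    rhs[0] = start
    for k, (w, r) in terms.items():
        diag[k - 1] += w
        rhs[k - 1] += w * r
    for i in range(1, T):                      # forward elimination
        rhs[i] += rhs[i - 1] / diag[i - 1]
        diag[i] -= 1.0 / diag[i - 1]
    x = [0.0] * T
    x[-1] = rhs[-1] / diag[-1]
    for i in range(T - 2, -1, -1):             # back substitution
        x[i] = (rhs[i] + x[i + 1]) / diag[i]
    return [start] + x


# Hypothetical candidate programs with initial parameter guesses. Each maps
# numerical parameters to the cost terms of a trajectory optimization problem.
PROGRAMS = {
    "reach_only": (lambda p: {T: (W, p[0])}, [1.0]),
    "reach_via": (lambda p: {T: (W, p[0]), T // 2: (W, p[1])}, [1.0, 0.5]),
}


def task_cost(x):
    """Black-box task score (assumed): end near 1.0, midpoint at least 0.8."""
    return abs(x[T] - 1.0) + max(0.0, 0.8 - x[T // 2])


def tune(program, params, iters=300, sigma=0.1, seed=0):
    """Zeroth-order hill climbing: perturb parameters, keep improvements."""
    rng = random.Random(seed)
    best = list(params)
    best_cost = task_cost(solve_trajectory(program(best)))
    for _ in range(iters):
        cand = [p + rng.gauss(0.0, sigma) for p in best]
        cost = task_cost(solve_trajectory(program(cand)))
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost


def search():
    """Tune every candidate program and return the best one by task score."""
    results = {name: tune(prog, init) for name, (prog, init) in PROGRAMS.items()}
    best_name = min(results, key=lambda n: results[n][1])
    return best_name, results


if __name__ == "__main__":
    name, results = search()
    print("selected program:", name)
    for n, (params, cost) in results.items():
        print(n, [round(p, 3) for p in params], round(cost, 4))
```

In this toy setting the program without a via-point can only produce a straight line, so no parameter setting lets it both end near the goal and clear the midpoint height. This mirrors, at a very small scale, why the search covers program structure as well as numbers.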