🤖 AI Summary
This work addresses multi-object relocalization under movable obstacles, formulated as a geometric Task and Motion Planning (g-TAMP) problem. We propose an LLM-guided search framework: geometric scenes are encoded into predicate-based prompts compatible with large language models (LLMs), enabling the generation of an initial task plan to warm-start Monte Carlo Tree Search (MCTS)—thereby avoiding costly per-node LLM calls. To our knowledge, this is the first approach to employ LLMs for search guidance—rather than end-to-end action generation—in g-TAMP, establishing the “LLM-warm-started MCTS” paradigm. The method achieves superior robustness without sacrificing inference efficiency. Evaluated on six canonical g-TAMP benchmark tasks, it significantly outperforms both traditional search-based planners and state-of-the-art LLM-based planners in success rate and planning efficiency. The source code is publicly available.
📝 Abstract
The problem of relocating a set of objects to designated areas amidst movable obstacles can be framed as Geometric Task and Motion Planning (g-tamp), a subclass of task and motion planning (TAMP). Traditional approaches to g-tamp have relied either on domain-independent heuristics or on learning from planning experience to guide the search, both of which typically demand significant computational resources or data. In contrast, humans often use common sense to intuitively decide which objects to manipulate in g-tamp problems. Inspired by this, we propose leveraging Large Language Models (LLMs), which have common-sense knowledge acquired from internet-scale data, to guide task planning in g-tamp problems. To enable LLMs to perform geometric reasoning, we design a predicate-based prompt that encodes geometric information derived from a motion planning algorithm. We then query the LLM to generate a task plan, which is used to search for a feasible set of continuous parameters. Since LLMs are prone to mistakes, instead of committing to the LLM's outputs we extend Monte Carlo Tree Search (MCTS) to a hybrid action space and use the LLM to guide the search. Unlike previous approaches that call an LLM at every node and incur high computational costs, we warm-start the MCTS with the nodes explored while completing the LLM's task plan. On six different g-tamp problems, we show our method outperforms previous LLM planners and pure search algorithms. Code can be found at https://github.com/iMSquared/prime-the-search.
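The warm-start idea described above can be sketched in a few lines: instead of querying the LLM at every tree node, the search tree is pre-expanded once along the LLM's proposed task plan, and standard UCT selection then continues from that seeded tree. This is a minimal, hypothetical Python illustration; the `Node` class, `warm_start` helper, and the optimistic value backup are assumptions for exposition, not the paper's actual implementation.

```python
import math

class Node:
    """A search-tree node over (discrete action, resulting state) pairs."""
    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action
        self.children = []
        self.visits = 0
        self.value = 0.0

def ucb(node, c=1.4):
    """Standard UCT score used to select among children after warm-start."""
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def warm_start(root, llm_plan, step):
    """Seed the tree with the trajectory obtained by executing the
    LLM's task plan, so MCTS starts from these nodes instead of scratch.

    `step(state, action)` is a hypothetical transition function standing in
    for the motion-planning feasibility check over continuous parameters.
    """
    node = root
    for action in llm_plan:
        child = Node(step(node.state, action), parent=node, action=action)
        node.children.append(child)
        node = child
    # Back up one optimistic visit along the seeded path so UCT tries
    # the LLM's plan first but can still abandon it if it fails.
    # (Hypothetical backup scheme.)
    while node is not None:
        node.visits += 1
        node.value += 1.0
        node = node.parent
    return root
```

A toy usage, with integer states standing in for geometric configurations: `warm_start(Node(0), [1, 2, 3], lambda s, a: s + a)` yields a tree whose seeded path reaches state `6` after three actions, ready for further UCT rollouts.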