🤖 AI Summary
To address the low multi-task execution efficiency of assistive agents in domestic environments, this paper proposes an LLM-driven joint task planning framework. It leverages large language models (LLMs) with a small number of prompts (few-shot prompting) to perform high-level task anticipation without any task-specific training, then uniformly encodes the anticipated multi-task set as a PDDL goal for classical planning—specifically, the FF planner—to generate a jointly optimized, fine-grained action sequence. This work establishes a seamless integration of LLM-based task anticipation with symbolic classical planning, enabling cross-task action coordination without any training data. Evaluated in the VirtualHome simulation environment, the framework reduces task completion time by 31% compared with a serial single-task execution baseline, demonstrating its effectiveness in action reuse, temporal optimization, and resource coordination.
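To make the pipeline concrete, the sketch below shows how a set of anticipated tasks might be encoded as a single conjunctive PDDL goal, so that a classical planner such as FF can compute one action sequence that achieves all tasks jointly. This is an illustrative assumption, not the paper's code: the task names, predicates, and the `tasks_to_pddl_goal` helper are hypothetical.

```python
# Hypothetical sketch: merge the goal predicates of several anticipated
# household tasks into one PDDL (:goal ...) expression, so a classical
# planner optimizes a single joint plan instead of solving tasks serially.

def tasks_to_pddl_goal(task_predicates):
    """Combine per-task goal predicates into one conjunctive PDDL goal."""
    clauses = [pred for preds in task_predicates.values() for pred in preds]
    return "(:goal (and " + " ".join(clauses) + "))"

# Example anticipated tasks (predicates are made up for illustration).
anticipated = {
    "make_bed":       ["(made bed)"],
    "cook_breakfast": ["(cooked eggs)", "(on-table plate)"],
}

goal = tasks_to_pddl_goal(anticipated)
print(goal)  # (:goal (and (made bed) (cooked eggs) (on-table plate)))
```

Encoding all anticipated tasks in one goal is what lets the planner reuse actions (e.g., a single trip to the kitchen serves both tasks) rather than re-deriving overlapping action prefixes per task.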
📝 Abstract
Assistive agents performing household tasks such as making the bed or cooking breakfast often compute and execute actions that accomplish one task at a time. However, efficiency can be improved by anticipating upcoming tasks and computing an action sequence that jointly achieves these tasks. State-of-the-art methods for task anticipation use data-driven deep networks and Large Language Models (LLMs), but they do so at the level of high-level tasks and/or require many training examples. Our framework leverages the generic knowledge of LLMs through a small number of prompts to perform high-level task anticipation, using the anticipated tasks as goals in a classical planning system to compute a sequence of finer-granularity actions that jointly achieve these goals. We ground and evaluate our framework’s abilities in realistic scenarios in the VirtualHome environment and demonstrate a 31% reduction in execution time compared with a system that does not consider upcoming tasks.