🤖 AI Summary
To address the challenge of robust task planning for embodied instruction following in few-shot, partially observable, and dynamic environments, this paper proposes the first few-shot closed-loop task planning framework. Methodologically, the problem is formulated as a partially observable Markov decision process (POMDP); large language models (LLMs) are used for instruction grounding and action planning; and a novel hindsight reasoning mechanism, coupled with a dynamic adaptation module, enables online policy refinement from real-time state feedback. The key contributions are threefold: (1) the method is the first to surpass fully supervised baselines under strict few-shot settings (≤5 demonstrations); (2) it significantly improves out-of-distribution generalization and state-recovery capability; and (3) on the ALFRED benchmark, it achieves a 23.6% absolute gain in task success rate over prior few-shot approaches, matching or exceeding fully supervised planners.
📝 Abstract
This work focuses on building a task planner for Embodied Instruction Following (EIF) using Large Language Models (LLMs). Previous works typically train a planner to imitate expert trajectories, treating planning as a supervised learning task. While these methods achieve competitive performance, they often lack robustness: once a suboptimal action is taken, the planner may land in an out-of-distribution state, which can lead to task failure. In contrast, we frame the task as a Partially Observable Markov Decision Process (POMDP) and aim to develop a robust planner under a few-shot assumption. To this end, we propose a closed-loop planner with an adaptation module and a novel hindsight method, designed to exploit as much information as possible to assist the planner. Our experiments on the ALFRED dataset show that our planner achieves competitive performance under the few-shot assumption. For the first time, a few-shot agent's performance approaches, and even surpasses, that of the fully supervised agent.
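To make the closed-loop idea concrete, the sketch below shows the general plan–act–observe–replan pattern the abstract describes: the planner proposes an action sequence, executes it step by step, and on failure feeds the observed failure back into the next planning call (a hindsight-style signal). This is a minimal toy illustration, not the paper's implementation; every function, action string, and the stand-in `llm_propose_plan` are hypothetical, and a real system would replace that stub with an actual LLM call.

```python
# Toy sketch of a closed-loop planner with failure feedback.
# All names and the environment logic are illustrative assumptions,
# not taken from the paper's codebase.

def llm_propose_plan(goal, history):
    """Stand-in for an LLM call: maps the goal plus past (action, feedback)
    pairs to a new action sequence. A real system would prompt an LLM here."""
    if ("open fridge", "blocked") in history:
        # Hindsight: a previous attempt failed, so insert a recovery step.
        return ["move to fridge", "clear obstacle", "open fridge", "grab apple"]
    return ["move to fridge", "open fridge", "grab apple"]

def execute(action, env_state):
    """Toy environment: 'open fridge' fails until the obstacle is cleared."""
    if action == "open fridge" and not env_state["obstacle_cleared"]:
        return "blocked"
    if action == "clear obstacle":
        env_state["obstacle_cleared"] = True
    return "ok"

def closed_loop_plan(goal, max_replans=3):
    """Plan, act, observe feedback, and replan on failure."""
    env_state = {"obstacle_cleared": False}
    history = []  # (action, feedback) pairs fed back to the planner
    for _ in range(max_replans):
        plan = llm_propose_plan(goal, history)
        for action in plan:
            feedback = execute(action, env_state)
            history.append((action, feedback))
            if feedback != "ok":
                break  # abort this plan; replan with the failure in context
        else:
            return history  # every action succeeded
    return history

trace = closed_loop_plan("grab apple")
```

In this toy run the first plan fails at "open fridge", and the second plan (generated with the failure in its history) recovers by clearing the obstacle first, which is the state-recovery behavior the closed-loop formulation is meant to provide over open-loop imitation of expert trajectories.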