🤖 AI Summary
To address persistent logical errors and hallucinations in long-horizon embodied task planning, which are exacerbated by the scarcity of high-quality demonstration data, this paper proposes ReLEP, a demonstration-free framework for real-time long-horizon planning. Methodologically, ReLEP introduces (i) implicit logical reasoning learned through fine-tuning, jointly with hallucination suppression; (ii) a skill-function planning paradigm in which a fine-tuned vision-language model maps abstract instructions to executable skill sequences; and (iii) a logic-aware synthetic data generation pipeline, complemented by a recallable Memory module and a Robot Configuration module for adaptation across heterogeneous robot hardware. Evaluated on diverse long-horizon tasks, ReLEP significantly outperforms state-of-the-art methods, achieving high success rates and execution compliance on both seen and unseen tasks while mitigating logical inconsistencies and factual hallucinations.
📝 Abstract
Long-horizon embodied planning underpins embodied AI. One of the most practical ways to accomplish a long-horizon task is to decompose an abstract instruction into a sequence of actionable steps. However, foundation models still suffer from logical errors and hallucinations in long-horizon planning unless they are given examples highly relevant to the task at hand, and supplying such examples for arbitrary tasks is impractical. We therefore present ReLEP, a novel framework for Real-time Long-horizon Embodied Planning. ReLEP can complete a wide range of long-horizon tasks without in-context examples by learning implicit logical inference through fine-tuning. The fine-tuned large vision-language model formulates each plan as a sequence of skill functions selected from a carefully designed skill library. ReLEP is further equipped with a Memory module for plan and status recall and a Robot Configuration module for versatility across robot types. In addition, we propose a data generation pipeline to address dataset scarcity; by encoding implicit logical relationships during dataset construction, the pipeline enables the model to learn these relationships and dispel hallucinations. In comprehensive evaluations across diverse long-horizon tasks, ReLEP achieves high success rates and execution compliance even on unseen tasks, outperforming state-of-the-art baseline methods.
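The core planning idea in the abstract, a plan expressed as a sequence of skill functions chosen from a fixed skill library, can be sketched as follows. This is a minimal illustrative sketch, not ReLEP's actual API: the skill names (`navigate_to`, `pick_up`, `place_on`), the registry decorator, and the plan format are all hypothetical assumptions.

```python
# Hypothetical sketch of skill-function planning. All skill names and the
# dispatch mechanism below are illustrative assumptions, not ReLEP's real API.

SKILL_LIBRARY = {}

def skill(fn):
    """Register a function in the skill library under its own name."""
    SKILL_LIBRARY[fn.__name__] = fn
    return fn

@skill
def navigate_to(target):
    return f"navigated to {target}"

@skill
def pick_up(obj):
    return f"picked up {obj}"

@skill
def place_on(obj, surface):
    return f"placed {obj} on {surface}"

def execute_plan(plan):
    """Run a plan: a list of (skill_name, args) tuples, i.e. the kind of
    sequence a fine-tuned vision-language model might emit."""
    log = []
    for name, args in plan:
        log.append(SKILL_LIBRARY[name](*args))
    return log

# A possible decomposition of "put the apple on the table":
plan = [
    ("navigate_to", ("kitchen",)),
    ("pick_up", ("apple",)),
    ("navigate_to", ("table",)),
    ("place_on", ("apple", "table")),
]
```

Constraining the planner's output to names in `SKILL_LIBRARY` is one way such a design can keep generated plans executable: the model selects from a closed set of grounded skills rather than free-form text.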