🤖 AI Summary
To address unsafe and non-compliant behaviors in LLM-driven robotic planning caused by linguistic ambiguity and model hallucination, this paper proposes an introspective planning framework that aligns the model's uncertainty with the inherent ambiguity of the task. Methodologically, it combines large language model reasoning, retrieval-augmented generation (RAG), and post-hoc rationalization of human-selected plans. Key contributions include: (1) the first introspective reasoning knowledge base constructed from human-verified safe planning instances; and (2) the first integration of introspective planning with conformal prediction to jointly ensure statistical reliability and interactive efficiency. Evaluated on three tasks, including a newly introduced safe mobile manipulation benchmark, the approach improves planning compliance and safety over state-of-the-art LLM-based planners, yields tighter confidence bounds, reduces unnecessary user clarification requests, and maintains statistical success guarantees.
📝 Abstract
Large language models (LLMs) exhibit advanced reasoning skills, enabling robots to comprehend natural language instructions and strategically plan high-level actions through proper grounding. However, LLM hallucination may result in robots confidently executing plans that are misaligned with user goals or even unsafe in critical scenarios. Additionally, inherent ambiguity in natural language instructions can introduce uncertainty into the LLM's reasoning and planning processes. We propose introspective planning, a systematic approach that aligns the LLM's uncertainty with the inherent ambiguity of the task. Our approach constructs a knowledge base containing introspective reasoning examples as post-hoc rationalizations of human-selected safe and compliant plans, which are retrieved during deployment. Evaluations on three tasks, including a newly introduced safe mobile manipulation benchmark, demonstrate that introspection substantially improves both compliance and safety over state-of-the-art LLM-based planning methods. Furthermore, we empirically show that introspective planning, in combination with conformal prediction, achieves tighter confidence bounds, maintaining statistical success guarantees while minimizing unnecessary user clarification requests. The webpage and code are accessible at https://introplan.github.io.
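
To make the conformal prediction step concrete, the following is a minimal sketch of standard split conformal prediction applied to a set of candidate plans. It is not taken from the paper's code: the nonconformity score (one minus the confidence assigned to an option), the function names, and the rule of asking the user whenever the prediction set is not a single option are all illustrative assumptions.

```python
import math

def conformal_threshold(cal_scores, epsilon):
    """Calibrate a score threshold from held-out examples.

    cal_scores: nonconformity scores of the correct option on each
        calibration example (assumed here: 1 - confidence in that option).
    epsilon: tolerated error rate; target coverage is 1 - epsilon.
    """
    n = len(cal_scores)
    # Finite-sample corrected rank used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - epsilon))
    if k > n:
        # Too few calibration points for this coverage level:
        # fall back to admitting every option.
        return float("inf")
    return sorted(cal_scores)[k - 1]

def prediction_set(option_confidences, q_hat):
    """Keep every candidate plan whose nonconformity score is within q_hat."""
    return [opt for opt, conf in option_confidences.items() if 1 - conf <= q_hat]

def needs_clarification(pred_set):
    """Assumed policy: ask the user unless exactly one option remains."""
    return len(pred_set) != 1

# Hypothetical calibration scores and candidate-plan confidences.
q_hat = conformal_threshold([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7], epsilon=0.25)
ambiguous = prediction_set({"A": 0.7, "B": 0.5, "C": 0.1}, q_hat)
clear = prediction_set({"A": 0.9, "B": 0.05}, q_hat)
```

Under these assumptions, confidences that are concentrated on one plan produce a singleton set and the robot acts without asking, while confidences spread over several plans produce a larger set and trigger a clarification request. Tighter sets at the same coverage level therefore mean fewer unnecessary questions.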