Towards Reliable LLM-based Robot Planning via Combined Uncertainty Estimation

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) exhibit hallucination in robotic planning, yielding high-confidence yet unsafe decisions, primarily because existing methods fail to disentangle epistemic uncertainty (arising from task clarity and familiarity) from aleatoric uncertainty. Method: We propose a joint uncertainty estimation framework that orthogonally decomposes epistemic uncertainty into task clarity and task familiarity dimensions and models them jointly with aleatoric uncertainty. The approach employs random network distillation to capture uncertainty over environmental dynamics and uses LLM-feature-driven multi-layer perceptron (MLP) regression heads to separately quantify all three uncertainty types. Contribution/Results: Evaluated on kitchen manipulation and tabletop rearrangement tasks, the method significantly improves calibration between uncertainty estimates and actual execution success, reducing mean calibration error by 32.7% and thereby enabling safer, more robust LLM-based planning decisions.

📝 Abstract
Large language models (LLMs) demonstrate advanced reasoning abilities, enabling robots to understand natural language instructions and generate high-level plans with appropriate grounding. However, LLM hallucinations present a significant challenge, often leading to overconfident yet potentially misaligned or unsafe plans. While researchers have explored uncertainty estimation to improve the reliability of LLM-based planning, existing studies have not sufficiently differentiated between epistemic and intrinsic uncertainty, limiting the effectiveness of uncertainty estimation. In this paper, we present Combined Uncertainty estimation for Reliable Embodied planning (CURE), which decomposes the uncertainty into epistemic and intrinsic uncertainty, each estimated separately. Furthermore, epistemic uncertainty is subdivided into task clarity and task familiarity for more accurate evaluation. The overall uncertainty assessments are obtained using random network distillation and multi-layer perceptron regression heads driven by LLM features. We validated our approach in two distinct experimental settings: kitchen manipulation and tabletop rearrangement experiments. The results show that, compared to existing methods, our approach yields uncertainty estimates that are more closely aligned with the actual execution outcomes.
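The random network distillation component mentioned in the abstract can be sketched minimally: a frozen, randomly initialized target network and a predictor network trained to imitate it, where the predictor's error on a new plan's LLM features acts as a task-familiarity signal (low error on familiar inputs, high error on novel ones). All names, shapes, and the pure-NumPy setup below are illustrative assumptions, not the paper's implementation; the training loop is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, HID = 64, 128  # assumed sizes for the LLM feature vector / hidden layer

def make_net(rng):
    """Random two-layer MLP mapping a feature vector to an embedding."""
    w1 = rng.normal(0.0, 0.1, (FEAT_DIM, HID))
    w2 = rng.normal(0.0, 0.1, (HID, HID))
    return w1, w2

def forward(net, x):
    w1, w2 = net
    return np.tanh(x @ w1) @ w2

target = make_net(rng)     # frozen random target network (never trained)
predictor = make_net(rng)  # would be trained to match `target` on seen tasks

def familiarity_uncertainty(x):
    # Large predictor-target error means the features look unlike the
    # training distribution, i.e. low task familiarity (high epistemic
    # uncertainty about this task).
    err = forward(predictor, x) - forward(target, x)
    return float(np.mean(err ** 2))
```

In this sketch the score is an unnormalized squared error; in practice it would be calibrated (e.g. against errors on a held-out set of familiar tasks) before being combined with the other uncertainty terms.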
Problem

Research questions and friction points this paper is trying to address.

Addresses LLM hallucinations causing unsafe robot plans
Differentiates epistemic and intrinsic uncertainty in planning
Improves uncertainty estimation alignment with execution outcomes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposes uncertainty into epistemic and intrinsic types
Subdivides epistemic uncertainty into task clarity and familiarity
Uses random network distillation and MLP regression for estimation
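The MLP-regression-head idea in the bullets above can be sketched as a small head that maps LLM plan features to the three uncertainty scores (task clarity, task familiarity, intrinsic). The weights, shapes, and sigmoid output squashing below are illustrative assumptions; the paper's actual head architecture and training targets may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
FEAT_DIM, HID = 64, 32  # assumed LLM feature / hidden sizes

# One hidden layer, three outputs: clarity, familiarity, intrinsic.
W1 = rng.normal(0.0, 0.1, (FEAT_DIM, HID))
W2 = rng.normal(0.0, 0.1, (HID, 3))

def uncertainty_heads(feat):
    h = np.maximum(0.0, feat @ W1)      # ReLU hidden layer
    raw = h @ W2                        # one logit per uncertainty type
    return 1.0 / (1.0 + np.exp(-raw))   # squash each score into (0, 1)

scores = uncertainty_heads(rng.normal(size=FEAT_DIM))
clarity_u, familiarity_u, intrinsic_u = scores
```

A planner could then gate execution on these scores, for example asking a clarifying question when `clarity_u` is high or refusing to act when the combined uncertainty exceeds a threshold.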
🔎 Similar Papers
Shiyuan Yin
School of Artificial Intelligence, Henan University of Technology
Chenjia Bai
Institute of Artificial Intelligence, China Telecom (TeleAI)
Reinforcement Learning, Robotics, Embodied AI
Zihao Zhang
Tianjin University
Computer Vision
Junwei Jin
School of Artificial Intelligence, Henan University of Technology
Xinxin Zhang
Department of Electrical Engineering, Technical University of Denmark
Functional Modelling, Artificial Intelligence, Alarm Design
Chi Zhang
Institute of Artificial Intelligence (TeleAI), China Telecom
Xuelong Li
Institute of Artificial Intelligence (TeleAI), China Telecom