🤖 AI Summary
Large language models (LLMs) exhibit hallucination in robotic planning, yielding high-confidence yet unsafe decisions, largely because existing methods fail to disentangle epistemic uncertainty from aleatoric (intrinsic) uncertainty. Method: We propose the first joint uncertainty estimation framework that decomposes epistemic uncertainty into two orthogonal dimensions, task clarity and task familiarity, and models them jointly with aleatoric uncertainty. The approach uses random network distillation to capture uncertainty over environmental dynamics and LLM-feature-driven multi-layer perceptron regression heads to quantify each of the three uncertainty types separately. Contribution/Results: Evaluated on kitchen manipulation and tabletop rearrangement tasks, the method substantially improves the calibration between uncertainty estimates and actual execution success, reducing mean calibration error by 32.7% and thereby enabling safer, more reliable LLM-based planning decisions.
📝 Abstract
Large language models (LLMs) demonstrate advanced reasoning abilities, enabling robots to understand natural language instructions and generate appropriately grounded high-level plans. However, LLM hallucinations present a significant challenge, often leading to overconfident yet potentially misaligned or unsafe plans. While researchers have explored uncertainty estimation to improve the reliability of LLM-based planning, existing studies have not sufficiently differentiated between epistemic and intrinsic uncertainty, limiting the effectiveness of uncertainty estimation. In this paper, we present Combined Uncertainty estimation for Reliable Embodied planning (CURE), which decomposes uncertainty into epistemic and intrinsic components, each estimated separately. Epistemic uncertainty is further subdivided into task clarity and task familiarity for more accurate evaluation. The overall uncertainty estimates are obtained using random network distillation and multi-layer perceptron regression heads driven by LLM features. We validated our approach in two distinct experimental settings: kitchen manipulation and tabletop rearrangement. The results show that, compared with existing methods, our approach yields uncertainty estimates that align more closely with actual execution outcomes.
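The task-familiarity component rests on random network distillation (RND): a predictor network is trained to imitate a frozen, randomly initialized target network on features from previously seen tasks, so the predictor's error is low on familiar inputs and high on novel ones. The sketch below illustrates only that RND mechanism; the linear networks, feature dimensions, and function names are illustrative assumptions, not the paper's actual architecture (which uses LLM features and MLP regression heads).

```python
import numpy as np

rng = np.random.default_rng(0)

DIM = 32  # hypothetical task-feature dimension (e.g. an LLM embedding)
OUT = 16  # output dimension of the random target network

# Fixed, randomly initialized target network: never trained.
W_target = rng.normal(size=(DIM, OUT))

# Predictor network: trained to imitate the target on familiar tasks.
W_pred = np.zeros((DIM, OUT))


def familiarity_uncertainty(x):
    """RND novelty score: squared error between predictor and frozen target."""
    return float(np.mean((x @ W_target - x @ W_pred) ** 2))


def train_predictor(features, lr=0.05, epochs=200):
    """Fit the predictor on features from previously encountered tasks."""
    global W_pred
    for _ in range(epochs):
        err = features @ W_pred - features @ W_target
        W_pred -= lr * features.T @ err / len(features)  # gradient step on MSE


# Features from "familiar" tasks cluster in one region of feature space...
familiar = rng.normal(loc=0.0, size=(256, DIM))
train_predictor(familiar)

# ...so an in-distribution input scores low and a shifted (novel) one scores high.
seen = rng.normal(loc=0.0, size=DIM)
novel = rng.normal(loc=5.0, size=DIM)
print(familiarity_uncertainty(seen) < familiarity_uncertainty(novel))  # True
```

Because the target is random and frozen, the predictor can only drive its error down on regions of feature space it has actually been trained on; the residual error therefore acts as a proxy for how unfamiliar a new task is.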