REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?

📅 2025-05-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Ambiguous referring expressions (REs) in natural-language commands issued by non-expert users, such as elderly individuals and children, severely degrade the task-planning performance of LLM-driven robots. Method: We propose a systematic solution comprising three components: (1) REI-Bench, the first benchmark dedicated to evaluating robot task planning under ambiguous REs; (2) a task-oriented contextual cognition framework that uses context-aware instruction rewriting to transform ambiguous commands into unambiguous, executable forms; and (3) integration of prompt engineering with chain-of-thought contrastive reasoning within the LLM planning pipeline. Contribution/Results: Experiments show that ambiguous REs reduce task success rates by up to 77.9%. Our approach outperforms existing prompt-optimization and chain-of-thought methods, achieving state-of-the-art performance on REI-Bench.

📝 Abstract

Robot task planning decomposes human instructions into executable action sequences that enable robots to complete a series of complex tasks. Although recent large language model (LLM)-based task planners achieve impressive performance, they assume that human instructions are clear and straightforward. However, real-world users are not experts, and their instructions to robots often contain significant vagueness. Linguists suggest that such vagueness frequently arises from referring expressions (REs), whose meanings depend heavily on dialogue context and environment. This vagueness is even more prevalent among the elderly and children, whom robots should serve more. This paper studies how such vagueness in REs within human instructions affects LLM-based robot task planning and how to overcome this issue. To this end, we propose the first robot task planning benchmark with vague REs (REI-Bench), where we discover that the vagueness of REs can severely degrade robot planning performance, leading to success rate drops of up to 77.9%. We also observe that most failure cases stem from planners missing objects. To mitigate the REs issue, we propose a simple yet effective approach: task-oriented context cognition, which generates clear instructions for robots, achieving state-of-the-art performance compared with aware prompt and chain-of-thought baselines. This work contributes to the research community of human-robot interaction (HRI) by making robot task planning more practical, particularly for non-expert users, e.g., the elderly and children.

Problem

Research questions and friction points this paper is trying to address.

Evaluating how vague human instructions affect robot task planning performance

Addressing vagueness in referring expressions for non-expert users such as the elderly and children

Proposing a benchmark and solution to improve robot planning with unclear instructions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces REI-Bench for vague instruction evaluation

Proposes task-oriented context cognition method

Improves robot planning for non-expert users
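
The idea of rewriting a vague instruction into a clear one before planning can be illustrated with a toy sketch. This is not the paper's method: the paper uses an LLM for the rewriting step, whereas the sketch below swaps in a simple "most recently mentioned object" heuristic. The names VAGUE_RES, resolve_referent, and rewrite_instruction, as well as the example objects, are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical list of vague referring expressions (an assumption for this
# sketch; the paper's own taxonomy of REs is not reproduced here).
VAGUE_RES = ("it", "that", "that thing", "the thing")


def resolve_referent(context, known_objects):
    """Return the known object mentioned latest in the dialogue context.

    Returns None when no known object appears in the context.
    """
    text = " ".join(context).lower()
    best_obj, best_pos = None, -1
    for obj in known_objects:
        pos = text.rfind(obj.lower())  # -1 when the object is never mentioned
        if pos > best_pos:
            best_obj, best_pos = obj, pos
    return best_obj


def rewrite_instruction(instruction, context, known_objects):
    """Replace the first vague RE in the instruction with an explicit object."""
    referent = resolve_referent(context, known_objects)
    if referent is None:
        return instruction  # nothing to ground the RE on; leave it unchanged
    # Try longer expressions first so "that thing" wins over "that".
    alternatives = sorted(VAGUE_RES, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(v) for v in alternatives) + r")\b"
    return re.sub(pattern, "the " + referent, instruction,
                  count=1, flags=re.IGNORECASE)


if __name__ == "__main__":
    context = ["The mug on the counter is dirty."]
    print(rewrite_instruction("Please wash it.", context, ["mug", "apple"]))
```

A real system would need far more than recency to pick the referent (the abstract notes that RE meaning depends on both dialogue context and environment), so the heuristic here only shows where a rewriting step sits relative to the planner.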
