Plan-and-Act using Large Language Models for Interactive Agreement

📅 2025-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a fundamental tension in LLM-driven robotics for dynamic human–robot interaction (HRI): balancing respect for the human's ongoing activity against efficient execution of the robot's own task. We propose a temporally aware "plan-and-act" skill framework built on a two-stage LLM invocation mechanism: (1) an initial LLM call, conditioned on a text description of the robot's current action, generates a high-level action plan, enabling adaptive switching between passive responsiveness and proactive intervention; (2) a second-stage question asks the LLM when it should next be invoked, so that re-planning is scheduled at an appropriate moment in the interaction. The framework integrates prompt engineering, temporal reasoning, explicit HRI state modeling, and composition into an Engage skill. Evaluated across four heterogeneous real-world HRI scenarios, the approach achieves a 90% task success rate, demonstrating substantial improvements in behavioral appropriateness, timing accuracy, and cross-scenario generalizability of LLM-driven robotic agents.
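The two-stage invocation loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the `query_llm` function is a hypothetical stand-in for a real LLM API call, and the prompt wording is invented for the example.

```python
def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; a deployed system would
    query an actual LLM API here. Canned replies for illustration."""
    if "What should the robot do" in prompt:
        return "approach the visitor and greet them"
    return "re-plan after the current action finishes"


def plan_and_act(hri_state: str, current_action: str) -> dict:
    # Stage 1: generate a high-level action plan. The prompt includes
    # a text description of the robot's current action, which is what
    # lets the LLM switch between passive and active behavior.
    plan = query_llm(
        f"Situation: {hri_state}\n"
        f"Robot is currently: {current_action}\n"
        "What should the robot do next?"
    )
    # Stage 2: ask the LLM when it should be invoked again, so
    # re-planning happens at an appropriate moment rather than on a
    # fixed polling schedule.
    next_invocation = query_llm(
        f"Plan: {plan}\n"
        "When should the planner be called again?"
    )
    return {"plan": plan, "next_invocation": next_invocation}
```

The key design choice mirrored here is that the second LLM query outputs a trigger condition rather than an action, decoupling *what* the robot does from *when* planning recurs.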

📝 Abstract
Recent large language models (LLMs) are capable of planning robot actions. In this paper, we explore how LLMs can be used for planning actions with tasks involving situational human-robot interaction (HRI). A key problem of applying LLMs in situational HRI is balancing between "respecting the current human's activity" and "prioritizing the robot's task," as well as understanding the timing of when to use the LLM to generate an action plan. In this paper, we propose a necessary plan-and-act skill design to solve the above problems. We show that a critical factor for enabling a robot to switch between passive/active interaction behavior is to provide the LLM with an action text about the current robot's action. We also show that a second-stage question to the LLM (about the next timing to call the LLM) is necessary for planning actions at an appropriate timing. The skill design is applied to an Engage skill and is tested on four distinct interaction scenarios. We show that by using the skill design, LLMs can be leveraged to easily scale to different HRI scenarios with a reasonable success rate reaching 90% on the test scenarios.
Problem

Research questions and friction points this paper is trying to address.

Balancing human activity respect and robot task priority in HRI
Determining optimal timing for LLM-based action planning
Scaling LLM application across diverse human-robot interaction scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs plan robot actions dynamically
Action text enables behavior switching
Second-stage question optimizes timing