Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation

📅 2025-08-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
A human partner's behavior, intent, and understanding of the robot can change over the course of long-horizon human-robot collaboration. To address this, the paper proposes MICoBot, a framework that applies mixed-initiative dialog to collaborative manipulation, so that both the human and the robot can initiate requests and negotiate task allocation over multiple turns. MICoBot combines a large language model (LLM), a simulation-pretrained affordance model, an estimate of the human's availability to help, and a hierarchical planning mechanism to make decisions at three levels: collaboration strategy generation, task allocation, and action execution. A 27-hour user study (N=18) on a physical robot platform reports a 23.6% higher task success rate than an LLM-only baseline and other allocation methods, along with significantly improved user satisfaction. These results support the role of dynamic negotiation in making collaboration more flexible and robust.

📝 Abstract
Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot's capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We apply a Mixed-Initiative dialog paradigm to Collaborative human-roBot teaming and propose MICoBot, a system that handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities (measured by a simulation-pretrained affordance model) and the human's estimated availability to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. Our extensive evaluations in simulation and in the real world (on a physical robot with 18 unique human participants over 27 hours) demonstrate that our method collaborates effectively with diverse human users, yielding significantly better task success and user experience than a pure LLM baseline and other agent allocation models. See additional videos and materials at https://robin-lab.cs.utexas.edu/MicoBot/.
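
The allocation idea in level (2) can be illustrated with a minimal sketch. This is not MICoBot's planner: the cost formulas, the `failure_penalty` parameter, and the names `Step` and `allocate_steps` are hypothetical, chosen only to show how a robot success estimate (standing in for the affordance model's output) and an estimate of human availability could jointly decide who takes each step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    robot_success_prob: float  # hypothetical affordance-model output in [0, 1]
    human_effort: float        # hypothetical effort cost if the human does it


def allocate_steps(steps, human_availability, failure_penalty=10.0):
    """Assign each step to 'robot' or 'human' by comparing illustrative costs.

    human_availability is an assumed scalar in (0, 1]; lower values make
    asking the human more expensive.
    """
    allocation = {}
    for step in steps:
        # Expected cost of a robot attempt: chance of failure times a penalty.
        robot_cost = (1.0 - step.robot_success_prob) * failure_penalty
        # Cost of asking the human grows as availability shrinks.
        human_cost = step.human_effort / max(human_availability, 1e-6)
        allocation[step.name] = "robot" if robot_cost <= human_cost else "human"
    return allocation


if __name__ == "__main__":
    steps = [Step("pick up cup", 0.9, 2.0), Step("open jar", 0.2, 3.0)]
    print(allocate_steps(steps, human_availability=0.5))
    print(allocate_steps(steps, human_availability=0.25))
```

In this toy setup, a step the robot is likely to fail goes to the human while the human is fairly available, and shifts back to the robot once the estimated availability drops far enough. A real system would re-run such an allocation as the dialog updates its estimates.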

Problem

Research questions and friction points this paper is trying to address.

Adapting robots to diverse human partners in collaboration
Balancing task proposals and decisions via natural language dialog
Optimizing human-robot task allocation to minimize human effort

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixed-Initiative dialog for collaborative human-robot teaming
Three-level decision-making for task allocation
Simulation-pretrained affordance model for capability measurement