Learning "Partner-Aware" Collaborators in Multi-Party Collaboration

📅 2025-10-25
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses the limited responsiveness of large language models (LLMs) to human intervention in multi-agent collaborative tasks and their difficulty in achieving task consensus, i.e., alignment of shared grounding. To this end, the paper proposes Interruptible Collaborative Roleplayer (ICR), a novel algorithm that models collaborative dynamics as a two-player Modified-Action Markov decision process and integrates reinforcement learning from human feedback (RLHF) to enable partner-aware collaborative optimization. ICR equips LLMs to proactively detect, internalize, and adapt to real-time human interventions, accelerating consensus convergence and improving consensus quality across multi-turn interactions. Experimental results demonstrate that ICR outperforms baseline methods across diverse collaborative tasks, enhancing both solution-space diversity and task consistency. This work contributes toward developing trustworthy, human-aligned, and collaboratively capable LLM agents.

πŸ“ Abstract
Large Language Models (LLMs) are increasingly being deployed in agentic settings where they act as collaborators with humans. It is therefore increasingly important to evaluate their ability to collaborate effectively in multi-turn, multi-party tasks. In this paper, we build on the AI alignment and safe interruptibility literature to offer novel theoretical insights on collaborative behavior between LLM-driven collaborator agents and an intervention agent. Our goal is to learn an ideal partner-aware collaborator that increases the group's common ground (CG), i.e., alignment on task-relevant propositions, by intelligently collecting information provided in interventions by a partner agent. We show how LLM agents trained using standard RLHF and related approaches are naturally inclined to ignore possibly well-meaning interventions, which makes increasing group common ground non-trivial in this setting. We employ a two-player Modified-Action MDP to examine this suboptimal behavior of standard AI agents, and propose Interruptible Collaborative Roleplayer (ICR), a novel partner-aware learning algorithm to train CG-optimal collaborators. Experiments on multiple collaborative task environments show that ICR is, on average, more capable of promoting successful CG convergence and exploring more diverse solutions in such tasks.
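The two ideas at the core of the abstract, a Modified-Action MDP (where a partner may replace the agent's chosen action before it executes) and common ground as agreement on task-relevant propositions, can be illustrated with a minimal sketch. Everything here (the proposition set, the belief dictionaries, the `mamdp_step` function, the CG metric) is an illustrative assumption, not the paper's actual environment or implementation:

```python
# Illustrative sketch of one step of a two-player Modified-Action MDP
# (MAMDP): the collaborator proposes an action, and an intervention
# agent may replace it before the environment transition. All names
# and structures here are hypothetical.

PROPOSITIONS = ["p1", "p2", "p3", "p4"]  # task-relevant propositions

def common_ground(agent_beliefs, partner_beliefs):
    """Fraction of propositions on which both parties hold the same value."""
    agreed = sum(
        1 for p in PROPOSITIONS
        if p in agent_beliefs and p in partner_beliefs
        and agent_beliefs[p] == partner_beliefs[p]
    )
    return agreed / len(PROPOSITIONS)

def mamdp_step(agent_action, partner_intervention, agent_beliefs, partner_beliefs):
    """One MAMDP transition: the executed action is the partner's
    intervention when one is given, otherwise the agent's own proposal.
    A partner-aware collaborator internalizes the executed action
    rather than ignoring the intervention."""
    executed = partner_intervention if partner_intervention is not None else agent_action
    prop, value = executed
    agent_beliefs[prop] = value     # agent updates its belief from the executed action
    partner_beliefs[prop] = value   # the proposition is now publicly grounded
    return executed

# Example: the partner intervenes to correct the agent's claim about p2.
agent = {"p1": True, "p2": False}
partner = {"p1": True, "p2": True}
cg_before = common_ground(agent, partner)   # only p1 is shared: 0.25
mamdp_step(agent_action=("p2", False), partner_intervention=("p2", True),
           agent_beliefs=agent, partner_beliefs=partner)
cg_after = common_ground(agent, partner)    # p1 and p2 now agree: 0.5
```

An agent that ignored the intervention (keeping `p2 = False`) would leave CG unchanged, which is the suboptimal behavior the abstract attributes to standard RLHF-trained agents.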
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM collaboration abilities in multi-party multi-turn tasks
Addressing LLM agents ignoring interventions during collaborative tasks
Developing partner-aware algorithms for optimal common-ground alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed partner-aware learning algorithm ICR
Used Modified-Action MDP framework for analysis
Trained CG-optimal collaborators through interruptible roleplaying