AI Summary
This study addresses the limited responsiveness of large language models (LLMs) to human intervention in multi-agent collaborative tasks and their difficulty in achieving task consensus, i.e., alignment on shared grounding. To this end, we propose the Interruptible Collaborative Roleplayer (ICR), a novel algorithm that models collaborative dynamics as a two-player Modified-Action Markov decision process and, for the first time, integrates reinforcement learning from human feedback (RLHF) to enable partner-aware collaborative optimization. ICR empowers LLMs to proactively detect, internalize, and adapt to real-time human interventions, significantly accelerating consensus convergence and improving consensus quality across multi-turn interactions. Experimental results demonstrate that ICR outperforms baseline methods across diverse collaborative tasks, enhancing both solution-space diversity and task consistency. This work establishes a new paradigm for developing trustworthy, human-aligned, and collaboratively capable LLM agents.
Abstract
Large Language Models (LLMs) are increasingly being deployed in agentic settings where they act as collaborators with humans. It is therefore increasingly important to evaluate their ability to collaborate effectively in multi-turn, multi-party tasks. In this paper, we build on the AI alignment and safe interruptibility literature to offer novel theoretical insights into collaborative behavior between LLM-driven collaborator agents and an intervention agent. Our goal is to learn an ideal partner-aware collaborator that increases the group's common ground (CG), i.e., alignment on task-relevant propositions, by intelligently collecting information provided in interventions by a partner agent. We show that LLM agents trained using standard RLHF and related approaches are naturally inclined to ignore possibly well-meaning interventions, which makes increasing group common ground non-trivial in this setting. We employ a two-player Modified-Action MDP to examine this suboptimal behavior of standard AI agents, and propose the Interruptible Collaborative Roleplayer (ICR), a novel partner-aware learning algorithm for training CG-optimal collaborators. Experiments on multiple collaborative task environments show that ICR, on average, is more capable of promoting successful CG convergence and exploring more diverse solutions in such tasks.
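To make the two-player Modified-Action MDP concrete, the following is a minimal illustrative sketch (not the paper's actual formalization; all names, such as `ModifiedActionMDP` and `intervene`, are hypothetical). The key idea it shows: the collaborator proposes an action, a partner agent may modify it before it reaches the environment, and the gap between the proposed and executed actions is the intervention signal that a partner-aware collaborator should condition on rather than ignore.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical sketch of one step in a two-player Modified-Action MDP:
# the collaborator proposes an action; an intervention agent may
# replace it before the environment transition is applied.


@dataclass
class ModifiedActionMDP:
    # Environment dynamics: (state, executed_action) -> next_state.
    transition: Callable[[str, str], str]
    # Partner policy: (state, proposed_action) -> override action, or
    # None to let the proposed action through unmodified.
    intervene: Callable[[str, str], Optional[str]]
    # Record of (proposed, executed) pairs -- the intervention history
    # a partner-aware learner would treat as evidence about the
    # partner's task-relevant beliefs (common ground).
    history: List[Tuple[str, str]] = field(default_factory=list)

    def step(self, state: str, proposed_action: str) -> str:
        override = self.intervene(state, proposed_action)
        executed = override if override is not None else proposed_action
        self.history.append((proposed_action, executed))
        return self.transition(state, executed)


# Toy usage: the partner vetoes a "guess" action in an "unsure" state,
# substituting a clarifying "ask" action instead.
mdp = ModifiedActionMDP(
    transition=lambda s, a: f"{s}->{a}",
    intervene=lambda s, a: "ask" if (s == "unsure" and a == "guess") else None,
)
next_state = mdp.step("unsure", "guess")
print(next_state)       # "unsure->ask"
print(mdp.history[-1])  # ("guess", "ask")
```

An intervention-ignoring agent would optimize only over its proposed actions; the point the abstract makes is that a CG-optimal collaborator must instead treat the `history` of modifications as information to internalize.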