ConventionPlay: Capability-Limited Training for Robust Ad-Hoc Collaboration

📅 2026-04-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses coordination in ad-hoc collaboration when partners can follow more than one convention. In that setting, passively adapting to a partner is not enough: an agent must actively guide the team toward the best joint strategy. The authors propose ConventionPlay, a reinforcement learning approach that extends the cognitive hierarchy framework with a diverse population of adaptive followers whose capabilities are bounded in different ways. Training against these partners teaches the agent to recognize a partner's behavioral patterns and probe what the partner can do, leading when the partner is able to follow and following when it is not. In canonical coordination tasks, ConventionPlay achieves superior coordination efficiency, particularly when the payoffs of different conventions vary substantially.

📝 Abstract

Ad-hoc collaboration often relies on identifying and adhering to shared conventions. However, when partners can follow multiple conventions, agents must do more than simply adapt; they must actively steer the team toward the most effective joint strategy. We present ConventionPlay, a reinforcement learning-based approach that extends cognitive hierarchies to include a diverse population of adaptive followers. By training against partners with varied capability limits, our agent learns to probe its partner's repertoire, leading the team when possible and following when necessary. Our results in canonical coordination tasks show that ConventionPlay achieves superior coordination efficiency, particularly in settings where conventions have differentiated payoffs.
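The probe-then-lead-or-follow behavior described in the abstract can be illustrated with a toy repeated coordination game. The sketch below is not the paper's method: it replaces the learned policy with a hand-coded rule, and every name and number in it (`PAYOFFS`, `LimitedFollower`, `ProbingLeader`, `play`, the copy-last-action follower) is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed payoffs: both players earn PAYOFFS[k] when they pick the same
# convention k, and 0 on a mismatch. Index 0 is the most valuable convention.
PAYOFFS = [3.0, 2.0, 1.0]


@dataclass
class LimitedFollower:
    """Hypothetical adaptive partner restricted to a repertoire of conventions."""
    repertoire: frozenset
    default: int                      # habitual convention, must be in repertoire
    last_seen: Optional[int] = None   # leader's action in the previous round

    def act(self) -> int:
        # Copy the leader's last action if capable, otherwise keep the habit.
        if self.last_seen in self.repertoire:
            return self.last_seen
        return self.default

    def observe(self, leader_action: int) -> None:
        self.last_seen = leader_action


class ProbingLeader:
    """Hand-coded stand-in for a trained agent: propose conventions from best
    to worst, and settle on the first one the partner actually adopts."""

    def __init__(self) -> None:
        self.target = 0       # current proposal, starting with the best
        self.waited = False   # has the partner had one round to respond?

    def act(self) -> int:
        return self.target

    def observe(self, partner_action: int) -> None:
        if partner_action == self.target:
            return                    # coordinated: keep the convention
        if not self.waited:
            self.waited = True        # give the partner a round to copy
            return
        # Probe rejected: try the next-best convention, but never propose
        # something worse than what the partner is already playing.
        self.target = min(self.target + 1, partner_action)
        self.waited = False


def play(follower: LimitedFollower, rounds: int = 6):
    """Run the repeated game; return total payoff and the action history."""
    leader = ProbingLeader()
    total, history = 0.0, []
    for _ in range(rounds):
        a, b = leader.act(), follower.act()
        total += PAYOFFS[a] if a == b else 0.0
        history.append((a, b))
        leader.observe(b)
        follower.observe(a)
    return total, history


if __name__ == "__main__":
    for repertoire in ({0, 2}, {1, 2}, {2}):
        follower = LimitedFollower(frozenset(repertoire), default=2)
        print(sorted(repertoire), play(follower))
```

Under these assumed payoffs, the leader pulls a partner that can play convention 0 away from its habitual convention 2, settles on convention 1 with a partner whose repertoire is {1, 2}, and falls back to following when the partner can only play convention 2. Each failed probe costs rounds of zero payoff, which is why probing is most worthwhile when the conventions differ substantially in payoff.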

Problem

Research questions and friction points this paper is trying to address.

ad-hoc collaboration

conventions

coordination

multi-agent systems

reinforcement learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

ConventionPlay

ad-hoc collaboration

capability-limited training