Collaborating in Multi-Armed Bandits with Strategic Agents

📅 2026-05-13
📈 Citations: 0 (influential: 0)

🤖 AI Summary
This work addresses free-riding by strategic agents in multi-agent Bayesian multi-armed bandit settings, where agents may prefer to let others explore and thereby undermine collaborative exploration. Focusing on persistent agents that participate over multiple time periods, with no monetary transfers, the paper introduces CAOS, a mechanism based purely on information sharing that sustains collaboration as a Nash equilibrium. The authors report that CAOS achieves regret close to that of fully cooperative systems, indicating that informational incentives alone can support effective multi-agent collaborative learning.
📝 Abstract
We study collaborative learning in multi-agent Bayesian bandit problems, where strategic agents collectively solve the same bandit instance. While multiple agents can accelerate learning by sharing information, strategic agents might prefer to free-ride and avoid exploration. We consider a setting with persistent agents that participate in multiple time periods. This is in contrast to most previous works on incentives in multi-agent MAB, which assume short-lived agents, namely each agent has a single decision to make and optimizes their expected reward in that single decision. As in the multi-agent MAB model with incentives, our model does not have monetary transfers, and the only incentives are through information sharing. We propose CAOS, a mechanism that sustains collaboration as a Nash equilibrium while achieving strong regret guarantees. Our results demonstrate that collaborative exploration can be sustained purely through information sharing, achieving performance close to that of fully cooperative systems despite strategic behavior.
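To make the setting concrete, here is a minimal sketch of several agents solving one Bernoulli bandit while pooling every observation into a shared Beta posterior. It is not CAOS, whose details this page does not give. The function name `run_shared_bandit`, the use of Thompson sampling for exploring agents, the greedy rule for free-riders, and the simultaneous-round structure are all assumptions made for illustration.

```python
import random


def run_shared_bandit(true_means, n_agents, horizon, explorers, seed=0):
    """Simulate agents on one Bernoulli bandit who pool all observations.

    Hypothetical illustration, not the paper's mechanism. Agents whose index
    is in `explorers` use Thompson sampling on the shared Beta posterior; all
    other agents free-ride by pulling the arm with the highest posterior mean.
    Returns the cumulative expected regret of each agent.
    """
    rng = random.Random(seed)
    k = len(true_means)
    wins = [1] * k    # Beta(1, 1) prior: alpha parameter per arm
    losses = [1] * k  # Beta(1, 1) prior: beta parameter per arm
    best = max(true_means)
    regret = [0.0] * n_agents

    for _ in range(horizon):
        observations = []
        for agent in range(n_agents):
            if agent in explorers:
                # Thompson sampling: draw one sample per arm from the posterior.
                scores = [rng.betavariate(wins[a], losses[a]) for a in range(k)]
            else:
                # Free-rider: exploit the current posterior mean, never explore.
                scores = [wins[a] / (wins[a] + losses[a]) for a in range(k)]
            arm = scores.index(max(scores))
            reward = 1 if rng.random() < true_means[arm] else 0
            observations.append((arm, reward))
            regret[agent] += best - true_means[arm]
        # Information sharing: every agent's observation updates the posterior.
        for arm, reward in observations:
            wins[arm] += reward
            losses[arm] += 1 - reward
    return regret


if __name__ == "__main__":
    # Agents 0 and 1 explore; agent 2 free-rides on their observations.
    print(run_shared_bandit([0.3, 0.7], n_agents=3, horizon=200, explorers={0, 1}))
```

In this toy model the free-rider benefits from the others' exploration without bearing its cost, which is the incentive problem the abstract describes. If every agent free-rides, nobody explores and the shared posterior can lock onto a suboptimal arm.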
Problem

Research questions and friction points this paper is trying to address.

Multi-Armed Bandits
Strategic Agents
Collaborative Learning
Information Sharing
Incentives
Innovation

Methods, ideas, or system contributions that make the work stand out.

Strategic Agents
Collaborative Learning
Multi-Armed Bandits
Information Sharing
Nash Equilibrium