🤖 AI Summary
This work addresses best-arm identification in a multi-armed bandit where a central learner sends arm commands to a distributed agent over a noisy discrete memoryless channel (DMC), so the arm the agent pulls may differ from the one the learner intended. The study connects this setting to the zero-error capacity of the communication channel. The authors propose communication schemes tailored to the agent's capabilities and analyze them with information-theoretic tools, aiming at reliable best-arm identification despite channel noise.
📝 Abstract
In this paper, we consider a multi-armed bandit (MAB) instance and study how to identify the best arm when arm commands are conveyed from a central learner to a distributed agent over a discrete memoryless channel (DMC). Depending on the agent capabilities, we provide communication schemes along with their analysis, which interestingly relate to the zero-error capacity of the underlying DMC.
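To make the zero-error notion concrete, here is a minimal sketch. It is not taken from the paper. It assumes a DMC given as a row-stochastic matrix `W` (a hypothetical name), where `W[x][y]` is the probability of output `y` given input `x`. Two inputs are confusable if some output can result from both. A set of pairwise non-confusable inputs can be decoded without error in a single channel use, and the base-2 logarithm of the largest such set is a lower bound on the zero-error capacity. The pentagon channel below is a standard textbook example, chosen only for illustration.

```python
from itertools import combinations


def confusable(W, x1, x2):
    """Return True if some output has positive probability under both inputs."""
    return any(p > 0 and q > 0 for p, q in zip(W[x1], W[x2]))


def max_zero_error_code(W):
    """Brute-force the largest set of pairwise non-confusable inputs.

    This is the best zero-error code for ONE channel use. It is exponential
    in the number of inputs, so it is only suitable for small alphabets.
    """
    n = len(W)
    for size in range(n, 0, -1):
        for code in combinations(range(n), size):
            if all(not confusable(W, a, b) for a, b in combinations(code, 2)):
                return list(code)
    return []


# Illustrative pentagon channel: input x is received as x or (x + 1) mod 5,
# each with probability 1/2.
W = [[0.0] * 5 for _ in range(5)]
for x in range(5):
    W[x][x] = 0.5
    W[x][(x + 1) % 5] = 0.5

code = max_zero_error_code(W)
print(code, len(code))  # prints "[0, 2] 2"
```

In this toy channel, a single channel use can carry one of two arm indices without error. Longer block codes can do better per channel use, which is why the zero-error capacity is defined as a limit over block lengths rather than from one use.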