🤖 AI Summary
This paper investigates whether unintended collusion arises when two players independently employ Thompson sampling in a repeated blind game with unknown payoff matrices. Under a mild assumption on the payoff matrices, the authors prove that the players' strategies converge almost surely (with probability one) to a Nash equilibrium, thereby precluding algorithmic collusion. To establish this result, they develop the first sample-path convergence analysis applicable to settings with infrequent parameter updates and non-Lipschitz dynamics, which lie beyond the scope of classical stochastic approximation methods. Integrating multi-armed bandit theory, Bayesian adaptive decision-making, and game-theoretic equilibrium analysis, the paper rigorously derives asymptotic rationality guarantees for Thompson sampling in non-cooperative stochastic games. These theoretical findings provide foundational support for algorithmic fairness and interpretability in decentralized learning systems.
📝 Abstract
When two players are engaged in a repeated game with unknown payoff matrices, they may be completely unaware of each other's existence and use multi-armed bandit algorithms to choose their actions; we refer to this setting as the ``blindfolded game'' in this paper. We show that when both players use Thompson sampling, the game dynamics converge to the Nash equilibrium under a mild assumption on the payoff matrices. Therefore, algorithmic collusion does not arise in this case, even though the players never intentionally deploy competitive strategies. To prove the convergence result, we find that the framework developed in stochastic approximation does not apply, because the inferior actions are updated only sporadically and infrequently and the dynamics lack Lipschitz continuity. We instead develop a novel sample-path-wise approach to establish the convergence.
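To make the setting concrete, here is a minimal simulation sketch of the blindfolded game: two independent Thompson samplers, each with Beta(1, 1) priors over the mean payoffs of its own actions and no knowledge of the opponent, playing a repeated 2x2 game with Bernoulli payoffs. The payoff matrices `A` and `B` below are hypothetical (chosen to be dominance-solvable, so the game has a unique pure Nash equilibrium at (0, 0)) and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mean payoff matrices, unknown to the players.
# A[i, j] / B[i, j]: row / column player's mean payoff when the
# row player plays i and the column player plays j.
A = np.array([[0.8, 0.6],
              [0.3, 0.2]])   # row action 0 strictly dominates
B = np.array([[0.7, 0.4],
              [0.9, 0.5]])   # column action 0 strictly dominates

n_actions, horizon = 2, 50_000

# Beta(1, 1) priors for each player's own arms (the "blindfolded"
# setting: each player sees only its own actions and rewards).
succ = [np.ones(n_actions), np.ones(n_actions)]   # Beta alpha params
fail = [np.ones(n_actions), np.ones(n_actions)]   # Beta beta params
counts = np.zeros((n_actions, n_actions))

for t in range(horizon):
    # Thompson sampling: each player samples a mean for every action
    # from its posterior and plays the argmax.
    i = int(np.argmax(rng.beta(succ[0], fail[0])))
    j = int(np.argmax(rng.beta(succ[1], fail[1])))
    counts[i, j] += 1

    # Bernoulli payoffs drawn from the unknown payoff matrices.
    r0 = float(rng.random() < A[i, j])
    r1 = float(rng.random() < B[i, j])

    # Standard Beta-Bernoulli posterior updates, each player using
    # only its own action and reward.
    succ[0][i] += r0
    fail[0][i] += 1.0 - r0
    succ[1][j] += r1
    fail[1][j] += 1.0 - r1

# Empirical frequencies of the joint action profiles.
print(counts / horizon)
```

In this dominance-solvable example, the printed empirical frequencies should concentrate on the equilibrium cell (0, 0), consistent with the paper's convergence result; it is an illustration of the dynamics, not of the proof technique.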