🤖 AI Summary
This paper addresses the distributed learning problem for $N$ players competing across $K$ parallel Tug-of-War (ToW) games: at each step, each player selects exactly one game, and that player's reward depends on the joint actions of all players in the same game—together forming a "Meta-ToW" system applicable to power control, task allocation, and sensor activation. To tackle the strong strategic coupling and communication constraints, the authors propose the Meta Tug-of-Peace algorithm, which enables decentralized game-switching decisions via low-frequency, 1-bit broadcast signals and updates action policies using stochastic approximation. They prove that the algorithm converges almost surely to an approximate Nash equilibrium satisfying prescribed quality-of-service (QoS) requirements. Extensive simulations demonstrate its efficacy in achieving system-wide equilibrium and QoS guarantees across diverse scenarios, while significantly reducing communication overhead compared to conventional approaches.
📝 Abstract
Consider $N$ players and $K$ games taking place simultaneously. Each of these games is modeled as a Tug-of-War (ToW) game, in which increasing the action of one player decreases the rewards of all other players. Each player participates in only one game at any given time. At each time step, a player decides which game to participate in and which action to take in that game. Their reward depends on the actions of all players in the same game. This system of $K$ games is termed a "Meta Tug-of-War" (Meta-ToW) game. Such games can model scenarios such as power control, distributed task allocation, and activation in sensor networks. We propose the Meta Tug-of-Peace algorithm, a distributed algorithm in which action updates are performed via a simple stochastic approximation scheme, and the decision to switch games is made using infrequent 1-bit communication between the players. We prove that in Meta-ToW games, our algorithm converges to an equilibrium that satisfies a target Quality of Service reward vector for the players. We then demonstrate the efficacy of our algorithm through simulations for the scenarios mentioned above.
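To make the setup concrete, below is a minimal toy sketch of one Meta-ToW-style dynamic. The paper's exact reward model, update rule, and switching protocol are not given in the text above, so every functional form and constant here (the reward function, step size, QoS target, tolerance, and the random-hop switching rule) is an illustrative assumption, not the authors' algorithm: each player nudges its action via stochastic approximation toward a common QoS reward target, and, at infrequent intervals, an unsatisfied player emits a single "unsatisfied" bit and may hop to another game.

```python
import random

random.seed(0)

# Illustrative toy model only -- all forms and constants are assumptions.
N_PLAYERS, N_GAMES = 6, 2
QOS = 0.2      # assumed per-player QoS reward target
STEP = 0.05    # stochastic-approximation step size
TOL = 0.02     # satisfaction tolerance before signaling

actions = [random.uniform(0.1, 1.0) for _ in range(N_PLAYERS)]
game_of = [i % N_GAMES for i in range(N_PLAYERS)]  # current game choice

def reward(i):
    """ToW-style reward: increases in player i's own action and decreases
    as the other players in the same game raise theirs (assumed form)."""
    others = sum(actions[j] for j in range(N_PLAYERS)
                 if j != i and game_of[j] == game_of[i])
    return actions[i] / (1.0 + actions[i] + others)

for t in range(3000):
    # Frequent step: stochastic-approximation update toward the QoS target.
    for i in range(N_PLAYERS):
        actions[i] += STEP * (QOS - reward(i))
        actions[i] = min(1.0, max(0.01, actions[i]))
    # Infrequent step: each unsatisfied player broadcasts one bit and may
    # hop to a uniformly random game (assumed switching rule).
    if t % 50 == 0:
        for i in range(N_PLAYERS):
            bit = reward(i) < QOS - TOL  # the 1-bit "unsatisfied" signal
            if bit and random.random() < 0.3:
                game_of[i] = random.randrange(N_GAMES)

print([round(reward(i), 3) for i in range(N_PLAYERS)])
```

In this toy version, a game with too many players cannot deliver the target reward to everyone, so its unsatisfied players keep signaling and hopping until the load spreads out; the actual paper replaces these heuristics with updates proven to converge to a QoS-satisfying equilibrium.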