Learning to Control Unknown Strongly Monotone Games

📅 2024-06-30
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses strongly monotone games in which a central coordinator has no knowledge of the players' utility functions or strategy sets, a realistic setting for distributed systems with privacy or information constraints. Method: We propose an online learning algorithm that requires no access to players' rewards or actions; it uses only the observed violation of linear constraints as feedback to adjust the controllable coefficients online, steering the Nash equilibrium toward the target constraints. Contribution/Results: To the best of our knowledge, this is the first method that drives the equilibrium to a performance target when the game structure is *completely unknown*. We prove almost-sure convergence to the set of Nash equilibria satisfying the target constraints and establish an $L_2$ convergence rate of $\tilde{O}(t^{-1/4})$. The approach combines stochastic approximation, online optimization, monotone operator theory, and gradient-play stability analysis, ensuring both privacy preservation and engineering feasibility.
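A minimal formalization of the setup described above (the notation is ours, inferred from the abstract, and not taken verbatim from the paper): each player $i$ chooses an action $x_i$ to maximize a utility of the form

$$u_i(x_i, x_{-i}) = r_i(x_i, x_{-i}) + c_i^{\top} x_i,$$

where $r_i$ is the player's private reward and the coefficients $c = (c_1, \dots, c_N)$ are set by the manager. Strong monotonicity makes the Nash equilibrium $x^*(c)$ reached by gradient play unique, and the manager seeks coefficients for which the equilibrium satisfies target linear constraints, for example $A\,x^*(c) = b$, while observing only the violation $A\,x^t - b$ at each round.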

📝 Abstract
Consider a game where the players' utility functions include a reward function and a linear term for each dimension, with coefficients that are controlled by the manager. We assume that the game is strongly monotone, so gradient play converges to a unique Nash equilibrium (NE). The NE is typically globally inefficient. The global performance at NE can be improved by imposing linear constraints on the NE. We therefore want the manager to pick the controlled coefficients that impose the desired constraint on the NE. However, this requires knowing the players' reward functions and action sets. Obtaining this game information is infeasible in a large-scale network and violates user privacy. To overcome this, we propose a simple algorithm that learns to shift the NE to meet the linear constraints by adjusting the controlled coefficients online. Our algorithm only requires the linear constraints violation as feedback and does not need to know the reward functions or the action sets. We prove that our algorithm converges with probability 1 to the set of NE that satisfy target linear constraints. We then prove an $L_2$ convergence rate of near-$O(t^{-1/4})$.
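As a purely illustrative sketch of this feedback structure (the two-player quadratic game, the step sizes, and the dual-style coefficient update below are our own assumptions, not the paper's algorithm), the following toy simulation lets the players run gradient play while the manager adjusts the controlled coefficients using only the observed constraint violation:

```python
import numpy as np

# Toy illustration only. The quadratic rewards, scalar actions, and the
# dual-style coefficient update are assumptions made for this sketch;
# they are NOT the paper's actual algorithm or analysis.
#
# Two players with utilities u_i(x) = r_i(x) + c_i * x_i, where the manager
# controls c = (c_1, c_2). The rewards are strongly concave quadratics, so
# the game is strongly monotone and gradient play has a unique NE x*(c).
# The manager wants the NE to satisfy the linear constraint a @ x = b and
# observes only the (noisy) violation a @ x - b.

rng = np.random.default_rng(0)

q = np.array([2.0, 3.0])   # curvature of each player's quadratic reward
s = np.array([0.5, -0.5])  # coupling between the two players
a = np.array([1.0, 1.0])   # constraint direction
b = 1.0                    # constraint target: a @ x = b

x = np.zeros(2)            # players' actions
c = np.zeros(2)            # manager-controlled linear coefficients
eta_player = 0.1           # constant gradient-play step for the players

for t in range(1, 20001):
    # Players: one step of gradient play on their own utilities,
    # grad_i u_i = -q_i * x_i + s_i * x_{-i} + c_i.
    grad = -q * x + s * x[::-1] + c
    x = x + eta_player * grad

    # Manager: sees only a noisy constraint violation and nudges c against it
    # with a decaying step (stochastic-approximation style).
    violation = a @ x - b + 0.01 * rng.standard_normal()
    c = c - (1.0 / np.sqrt(t)) * violation * a

print("final actions:", x)
print("constraint violation a @ x - b:", a @ x - b)
```

In this sketch the manager never observes the players' rewards or individual actions, only the scalar violation $a^\top x - b$, which mirrors the information structure assumed in the paper.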
Problem

Research questions and friction points this paper is trying to address.

Game Theory
Nash Equilibrium
Privacy Preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nash Equilibrium Adjustment
Privacy-Preserving Optimization
Rule-Adaptive Gaming