🤖 AI Summary
This paper investigates the convergence properties of regularized learning in multi-agent continuous games under uncertainty. Addressing settings with incomplete information and stochastic feedback, the authors formulate stochastic regularized learning dynamics in both continuous and discrete time, and analyze their long-run behavior via stochastic differential equations, discrete-time stochastic processes, and concentration inequalities. They prove that, in strongly monotone games, although the learning trajectories do not converge almost surely to the Nash equilibrium, they exhibit finite-time recurrence to a neighborhood of equilibrium, and their long-run empirical distribution concentrates tightly around it, with the degree of concentration quantified by explicit tail bounds. In contrast, these properties may all fail when the game is not strongly monotone. The results delineate the fundamental limits of regularized learning under persistent stochastic perturbations and provide sharp criteria for stability analysis in multi-agent learning systems.
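To ground the summary, here is a minimal sketch of what such stochastic regularized learning dynamics typically look like in this literature; the specific notation ($v_i$, $Q_i$, $h_i$, $\eta$, $\gamma$, $W$, $U$) is illustrative and not taken from the paper itself. In continuous time, each player $i$ aggregates noisy payoff gradients and plays through a regularized choice map:

$$
dY_{i,t} = v_i(X_t)\,dt + \sigma_i(X_t)\,dW_{i,t},
\qquad
X_{i,t} = Q_i(\eta Y_{i,t}),
$$

while in discrete time, with a constant step size $\gamma > 0$ and noisy gradient estimates $\hat v_{i,n} = v_i(X_n) + U_{i,n+1}$, the update reads

$$
Y_{i,n+1} = Y_{i,n} + \gamma\, \hat v_{i,n},
\qquad
X_{i,n+1} = Q_i(\eta Y_{i,n+1}).
$$

Here $v_i$ denotes player $i$'s individual payoff gradient, $W_{i,t}$ is a Wiener process, and $Q_i(y) = \operatorname{arg\,max}_{x_i \in \mathcal{X}_i} \{ \langle y, x_i \rangle - h_i(x_i) \}$ is the choice map induced by a strongly convex regularizer $h_i$ (entropic regularization, for instance, yields the logit/softmax map). The constant step size is what sustains the persistent randomness that the paper contrasts with vanishing-learning-rate models.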
📝 Abstract
In this paper, we examine the convergence landscape of multi-agent learning under uncertainty. Specifically, we analyze two stochastic models of regularized learning in continuous games -- one in continuous and one in discrete time -- with the aim of characterizing the long-run behavior of the induced sequence of play. In stark contrast to deterministic, full-information models of learning (or models with a vanishing learning rate), we show that the resulting dynamics do not converge in general. In lieu of convergence, we ask which actions are played more often in the long run, and by how much. We show that, in strongly monotone games, the dynamics of regularized learning may wander away from equilibrium infinitely often, but they always return to its vicinity in finite time (which we estimate), and their long-run distribution is sharply concentrated around a neighborhood thereof. We quantify the degree of this concentration, and we show that these favorable properties may all break down if the underlying game is not strongly monotone -- underscoring in this way the limits of regularized learning in the presence of persistent randomness and uncertainty.
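For reference, the strong monotonicity condition that separates the two regimes is the standard one (Rosen's diagonal strict concavity in its strong form; the paper's exact normalization and choice of norm may differ). With $v(x) = (v_1(x), \dots, v_N(x))$ denoting the game's payoff gradient field, the game is $\beta$-strongly monotone if

$$
\langle v(x') - v(x),\, x' - x \rangle \le -\beta\, \lVert x' - x \rVert^2
\quad \text{for all } x, x' \in \mathcal{X},
$$

for some $\beta > 0$. Strongly monotone games admit a unique Nash equilibrium, which is the state around whose neighborhood the long-run distribution of play concentrates in the results above.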