🤖 AI Summary
This study addresses multi-agent coordination and cooperation in social dilemmas through evolvable, decentralized social norms. Working in a Markov game framework that connects evolutionary game theory with reinforcement learning, it models social norms as prescriptions for how individual agents respond to stochastic environmental signals, and classifies norms by whether they are consistent with optimal single-player strategies with respect to expected reward. The key contribution is a reformulation of the norm-evolution theory of Morsky & Akçay (2019), originally developed for matrix games, in the more general Markov game setting, together with a general solution and analysis of the replicator dynamics proposed as the mechanism by which norms emerge. Joint strategies that adopt such norms satisfy a correlated equilibrium condition rather than a Nash one, and the theory is illustrated by mapping norms over the signal and reward space.
📝 Abstract
By specifying behaviour across multiple agents, social norms are a coordination approach to resolving social dilemmas. Decentralized and wide adoption can be achieved by norms whose prescription involves interpreting stochastic signals in the environment. Such signals must be correlated enough to orchestrate mutually beneficial coordination, and must leave enough uncertainty about the benefits of exploiting that coordination to disincentivize doing so. Evolutionary game theory of matrix games has been used to describe how a norm can evolve to become dominant in a population as rational agents compare and adopt norms. Morsky & Akçay (2019) classify norms according to a set of rationality criteria. Joint player strategies that adopt norms consistent with optimal single-player strategies with respect to expected reward naturally satisfy a correlated, rather than Nash, game-theoretic equilibrium condition. Here, we present a version of this theory that clarifies the basic ingredients. We formulate it in the more general Markov game setting commonly used in reinforcement learning theory. We illustrate the theory by mapping norms over the signal and reward space, while also giving a detailed exposition of the underlying mechanics of the approach. Finally, we give a general solution and analysis of replicator dynamics, which Morsky & Akçay (2019) propose as a means by which these norms could emerge.
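To make the replicator-dynamics idea concrete, the following is a minimal sketch and not the paper's model. It assumes the textbook Hawk-Dove game with a third, signal-following strategy ("Bourgeois": play Hawk when a fair coin flip assigns you the owner role, Dove otherwise) standing in for a norm. The payoff values (V = 2, C = 4), the function names `replicator_step` and `simulate`, the step size, and the forward-Euler integration are all illustrative assumptions.

```python
# Replicator dynamics dx_i/dt = x_i * (f_i(x) - f_bar(x)) for a symmetric matrix game.
# Strategies (illustrative): 0 = Hawk, 1 = Dove, 2 = signal-following "norm".
# Assumed payoffs: resource value V = 2, fight cost C = 4, signal is a fair coin.
PAYOFF = [
    [-1.0, 2.0, 0.5],   # Hawk vs (Hawk, Dove, Norm)
    [0.0, 1.0, 0.5],    # Dove vs (Hawk, Dove, Norm)
    [-0.5, 1.5, 1.0],   # Norm vs (Hawk, Dove, Norm)
]


def replicator_step(x, payoff, dt=0.01):
    """Advance the population shares x by one forward-Euler step."""
    n = len(x)
    # Expected payoff of each strategy against the current population mix.
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Population-average payoff.
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    new_x = [x[i] + dt * x[i] * (fitness[i] - mean_fitness) for i in range(n)]
    # Renormalize to guard against floating-point drift off the simplex.
    total = sum(new_x)
    return [v / total for v in new_x]


def simulate(x0, payoff, steps=5000, dt=0.01):
    """Iterate the replicator update from initial shares x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


if __name__ == "__main__":
    shares = simulate([1 / 3, 1 / 3, 1 / 3], PAYOFF)
    print("final shares (Hawk, Dove, Norm):", shares)
```

In this toy game, two norm-followers never fight because the shared signal tells exactly one of them to escalate, so the norm earns more against itself than either pure strategy earns against it, and its population share grows from any interior starting point. The Markov game formulation in the paper is more general than this single-shot matrix example.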