Population-aware Online Mirror Descent for Mean-Field Games with Common Noise by Deep Reinforcement Learning

📅 2025-09-03

📈 Citations: 0

✨ Influential: 0

career value

223K/year

🤖 AI Summary

This work addresses the Nash equilibrium learning problem in large-scale mean-field games (MFGs) under unknown initial state distributions and common noise. To overcome limitations of existing methods—such as poor generalization and unstable convergence—we propose a deep reinforcement learning framework that integrates Munchausen RL with online mirror descent (OMD), augmented with a population-aware mechanism. Crucially, our approach adapts to arbitrary initial distributions and common noise without requiring historical sampling or explicit averaging operations. It directly optimizes population-dependent policies. Evaluated across seven canonical MFG benchmarks, our method consistently outperforms state-of-the-art approaches: it accelerates convergence by up to 42%, maintains over 98% policy stability under strong common noise, and exhibits superior robustness and cross-distribution generalization.

Technology Category

Application Category

📝 Abstract

Mean Field Games (MFGs) offer a powerful framework for studying large-scale multi-agent systems. Yet, learning Nash equilibria in MFGs remains a challenging problem, particularly when the initial distribution is unknown or when the population is subject to common noise. In this paper, we introduce an efficient deep reinforcement learning (DRL) algorithm designed to achieve population-dependent Nash equilibria without relying on averaging or historical sampling, inspired by Munchausen RL and Online Mirror Descent. The resulting policy is adaptable to various initial distributions and sources of common noise. Through numerical experiments on seven canonical examples, we demonstrate that our algorithm exhibits superior convergence properties compared to state-of-the-art algorithms, particularly a DRL version of Fictitious Play for population-dependent policies. The performance in the presence of common noise underscores the robustness and adaptability of our approach.

Problem

Research questions and friction points this paper is trying to address.

Learning Nash equilibria in unknown initial distributions

Addressing common noise in large-scale multi-agent systems

Developing adaptable policies without historical sampling methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep reinforcement learning for Nash equilibria

Online Mirror Descent without historical sampling

Adaptable to various initial distributions and noise

🔎 Similar Papers

MF-OML: Online Mean-Field Reinforcement Learning with Occupation Measures for Large Population Games