Learning Mean-Field Games through Mean-Field Actor-Critic Flow

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenge of computing equilibria in mean-field games (MFGs). We propose a novel continuous-time learning dynamics framework—the Mean-Field Actor-Critic flow—that tightly integrates reinforcement learning with optimal transport theory. Specifically, the state distribution evolves along Wasserstein-2 geodesics via the Optimal Transport Geodesic Picard (OTGP) flow, while the policy, value function, and distribution are updated synchronously. The gradient dynamics are modeled by a coupled PDE system, and convergence is rigorously established using a Lyapunov functional: global exponential convergence is proved under an appropriate time-scale separation. Numerical experiments demonstrate the method’s efficiency and robustness across diverse MFG benchmarks. The key contribution is the first incorporation of geodesic optimization into the actor-critic paradigm, explicitly characterizing and jointly regulating the dynamic coupling among policy, value function, and state distribution.

📝 Abstract
We propose the Mean-Field Actor-Critic (MFAC) flow, a continuous-time learning dynamics for solving mean-field games (MFGs), combining techniques from reinforcement learning and optimal transport. The MFAC framework jointly evolves the control (actor), value function (critic), and distribution components through coupled gradient-based updates governed by partial differential equations (PDEs). A central innovation is the Optimal Transport Geodesic Picard (OTGP) flow, which drives the distribution toward equilibrium along Wasserstein-2 geodesics. We conduct a rigorous convergence analysis using Lyapunov functionals and establish global exponential convergence of the MFAC flow under a suitable timescale. Our results highlight the algorithmic interplay among actor, critic, and distribution components. Numerical experiments illustrate the theoretical findings and demonstrate the effectiveness of the MFAC framework in computing MFG equilibria.
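The abstract's central idea—driving the distribution toward equilibrium along Wasserstein-2 geodesics via a damped Picard iteration—can be sketched in one dimension, where the optimal transport map between two equal-size empirical measures is simply the sorted-sample matching. The `best_response` map, the contraction coefficient, and all sample sizes below are hypothetical toy choices for illustration, not the paper's MFG model:

```python
import numpy as np

def w2_geodesic_step(x, y, eta):
    """One damped Picard step along the Wasserstein-2 geodesic.

    In 1D, the optimal transport map between equal-size empirical
    measures matches sorted samples, so the geodesic point at time
    eta is the convex combination of the sorted sample arrays.
    """
    return (1.0 - eta) * np.sort(x) + eta * np.sort(y)

def best_response(x, z, c=0.5):
    # Hypothetical contraction-type best response: shift fixed
    # idiosyncratic samples z by a fraction of the current mean.
    return c * x.mean() + z

rng = np.random.default_rng(0)
z = rng.normal(loc=1.0, size=500)     # fixed idiosyncratic noise samples
mu = rng.normal(loc=-3.0, size=500)   # initial population distribution

for _ in range(200):
    mu = w2_geodesic_step(mu, best_response(mu, z), eta=0.3)

# Fixed point of the mean: m* = c*m* + mean(z), i.e. m* = mean(z)/(1-c)
print(abs(mu.mean() - z.mean() / 0.5))  # gap shrinks toward 0
```

Here the damping parameter `eta` plays the role of a step size along the geodesic: `eta = 1` is a full Picard (best-response) step, while smaller values interpolate toward the current distribution, which is what stabilizes fixed-point iterations of this kind.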
Problem

Research questions and friction points this paper is trying to address.

Solving mean-field games using reinforcement learning and optimal transport
Developing continuous-time learning dynamics for MFG equilibrium computation
Analyzing convergence of coupled actor-critic-distribution updates in MFGs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continuous-time learning dynamics for mean-field games
Coupled gradient updates via partial differential equations
Distribution evolution along Wasserstein-2 geodesics
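To make the last point concrete: in one dimension the Wasserstein-2 distance between equal-size empirical measures is the L2 distance of their sorted samples (quantile functions), and the displacement interpolation between them is the convex combination of those sorted arrays. A minimal sketch (the sample distributions are arbitrary choices) checking that such interpolants are metrically linear, i.e. W2(μ₀, μ_t) = t·W2(μ₀, μ₁):

```python
import numpy as np

def w2_1d(x, y):
    """Wasserstein-2 distance between 1D empirical measures with
    equal sample counts: L2 distance of sorted samples."""
    return np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2))

def geodesic(x, y, t):
    """Displacement interpolation at time t (1D: sorted-sample matching)."""
    return (1.0 - t) * np.sort(x) + t * np.sort(y)

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 1000)
y = rng.normal(4.0, 2.0, 1000)

d = w2_1d(x, y)
for t in (0.25, 0.5, 0.75):
    print(w2_1d(x, geodesic(x, y, t)) / d)  # ratio ≈ t: geodesic is metrically linear
```

This linearity is exactly what distinguishes moving along a Wasserstein-2 geodesic from, say, mixing the two densities pointwise, which would not trace a geodesic in the Wasserstein metric.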
Mo Zhou
Department of Mathematics, University of California, Los Angeles, CA 90095
Haosheng Zhou
Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106-3110
Ruimeng Hu
Associate Professor, University of California, Santa Barbara
Financial Mathematics · Deep Learning