🤖 AI Summary
This work addresses the lack of efficient, low-variance algorithms for history-dependent policies in partially observable mean-field games. The authors propose Recurrent Structural Policy Gradient (RSPG), a recurrent policy gradient method that combines Monte Carlo sampling of the common noise with exact estimation of the expected return conditioned on those samples. RSPG is the first hybrid structural method with history-aware policies for settings involving public information, uniting known environment dynamics, recurrent neural networks, and structured policy gradients. Implemented in the authors' JAX-based MFAX framework, the method achieves state-of-the-art performance, converges an order of magnitude faster than prior approaches, and provides the first solution to a macroeconomic mean-field game with heterogeneous agents, common noise, and history-dependent strategies.
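In notation of our own choosing (not taken from the paper), the hybrid estimator described above can be sketched as follows: the outer expectation over common-noise paths $z$ is approximated by Monte Carlo, while the inner conditional expectation is evaluated exactly from the deterministic mean-field flow,

$$
J(\theta) \;=\; \mathbb{E}_{z \sim \nu}\Big[\, \mathbb{E}\Big[\textstyle\sum_{t=0}^{T} r(s_t, a_t, \mu_t) \,\Big|\, z\Big] \Big] \;\approx\; \frac{1}{K}\sum_{k=1}^{K} R\big(\theta;\, z^{(k)}\big),
$$

where each conditional return $R(\theta; z^{(k)})$ is computed in closed form by propagating the population distribution $\mu_t$ through the known dynamics, so the only sampling variance comes from the $K$ noise paths.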
📝 Abstract
Mean Field Games (MFGs) provide a principled framework for modeling interactions in large populations: at scale, population dynamics become deterministic, with uncertainty entering only through aggregate shocks, or common noise. However, algorithmic progress has been limited, since model-free methods suffer from high variance and exact methods scale poorly. Recent Hybrid Structural Methods (HSMs) use Monte Carlo rollouts of the common noise in combination with exact estimation of the expected return conditioned on those samples. However, HSMs have not been scaled to partially observable settings. We propose Recurrent Structural Policy Gradient (RSPG), the first history-aware HSM for settings involving public information. We also introduce MFAX, our JAX-based framework for MFGs. By leveraging known transition dynamics, RSPG achieves state-of-the-art performance with order-of-magnitude faster convergence, and solves, for the first time, a macroeconomic MFG with heterogeneous agents, common noise, and history-aware policies. MFAX is publicly available at: https://github.com/CWibault/mfax.
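To make the hybrid structure concrete, here is a minimal, self-contained JAX sketch of the idea as described above. This is not code from MFAX: the sizes, parameter names, reward, and transition model are all illustrative assumptions (random toy values), and the recurrent policy is a bare-bones RNN rather than the paper's architecture.

```python
import jax
import jax.numpy as jnp

S, A, H, T, K = 5, 3, 16, 10, 8  # states, actions, hidden size, horizon, noise samples (toy)

def init_params(key):
    k1, k2, k3 = jax.random.split(key, 3)
    return {
        "w_in":  0.1 * jax.random.normal(k1, (1, H)),      # embeds the scalar common shock
        "w_rec": 0.1 * jax.random.normal(k2, (H, H)),      # recurrence over the public history
        "w_out": 0.1 * jax.random.normal(k3, (H, S * A)),  # per-state action logits
    }

def make_toy_model(key):
    # Stand-in for the KNOWN environment model: transition kernel P[s, a, s']
    # and reward r[s, a]. The method exploits such known dynamics; these
    # values are random placeholders, not the macroeconomic benchmark's.
    kp, kr = jax.random.split(key)
    P = jax.nn.softmax(jax.random.normal(kp, (S, A, S)), axis=-1)
    r = jax.random.normal(kr, (S, A))
    return P, r

def conditional_return(params, shocks, P, r):
    # Exact expected return conditioned on ONE sampled common-noise path:
    # given the path, the population distribution mu evolves deterministically,
    # so no agent-level sampling (and no agent-level variance) is introduced.
    def step(carry, z):
        mu, h = carry
        h = jnp.tanh(z[None] @ params["w_in"] + h @ params["w_rec"])       # history-aware hidden state
        pi = jax.nn.softmax((h @ params["w_out"]).reshape(S, A), axis=-1)  # policy per state
        reward = jnp.sum(mu[:, None] * pi * (r + z))                       # shock shifts rewards (toy choice)
        mu_next = jnp.einsum("s,sa,sau->u", mu, pi, P)                     # exact mean-field flow
        return (mu_next, h), reward

    mu0, h0 = jnp.ones(S) / S, jnp.zeros(H)
    _, rewards = jax.lax.scan(step, (mu0, h0), shocks)
    return rewards.sum()

def objective(params, key, P, r):
    # Monte Carlo only over the common noise; everything downstream is exact.
    shocks = jax.random.normal(key, (K, T))
    returns = jax.vmap(conditional_return, in_axes=(None, 0, None, None))(params, shocks, P, r)
    return returns.mean()

params = init_params(jax.random.PRNGKey(0))
P, r = make_toy_model(jax.random.PRNGKey(1))
grads = jax.grad(objective)(params, jax.random.PRNGKey(2), P, r)  # structured policy gradient
```

In this sketch the gradient is taken end-to-end through the known dynamics, so the only stochasticity in the estimator comes from the K sampled noise paths, which is the structural property the abstract contrasts with high-variance model-free methods.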