MEMOA: Massive Mixtures of Online Agents via Mean-Field Decentralized Nash Equilibria

📅 2026-05-06
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the computational and communication costs that limit large-scale federated learning with AI agents by proposing a mean-field-based decentralized learning strategy. Each agent makes decisions using only its own state and the population mean, and robustness is defined through a minimax criterion: minimizing the loss of the worst-performing client. The key contribution is a closed-form derivation of the unique optimal decentralized policy, which is proven to converge to the Nash-optimal centralized policy in the large-population limit. An online weighting mechanism additionally optimizes the server-computed mixture of client predictions, improving the mean prediction alongside the weakest client's. Numerical experiments support the theoretical guarantees and show that the decentralized policy typically outperforms natural greedy decentralized baselines.
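To make the information structure concrete, here is a minimal sketch of a decentralized step in which each agent sees only its own state and the population mean. It is not the paper's closed-form policy: the quadratic cost, the names `local_step`, `coupling`, and `lr`, and the scalar states are all assumptions chosen for illustration. The worst-client metric mirrors the minimax criterion described in the summary.

```python
from statistics import fmean


def mean_field(states):
    """Population summary: the average of all agents' scalar states."""
    return fmean(states)


def local_step(own_state, mean, target, lr=0.1, coupling=0.5):
    """Hypothetical local rule (an assumption, not the paper's policy).

    Takes a gradient step on 0.5*(x - target)^2 + 0.5*coupling*(x - mean)^2,
    so the agent needs only its own state and the population mean.
    """
    grad = (own_state - target) + coupling * (own_state - mean)
    return own_state - lr * grad


def worst_client_loss(states, targets):
    """Minimax-style metric: the largest squared error across agents."""
    return max((s - t) ** 2 for s, t in zip(states, targets))


def run(states, targets, steps=50):
    """Iterate the local rule; the mean is the only quantity shared each round."""
    for _ in range(steps):
        m = mean_field(states)
        states = [local_step(s, m, t) for s, t in zip(states, targets)]
    return states
```

Per round, only one scalar (the mean) has to be shared, regardless of how many agents participate. That is the scalability argument the summary makes, shown here in its simplest form.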
📝 Abstract
In the modern age of large-scale AI, federated learning has become an increasingly important tool for training large populations of AI agents; however, its computational and communication costs can quickly become prohibitive as the number of agents grows. This is precisely where decentralized agentic strategies shine: each agent acts autonomously, using only its own state together with a minimal summary of the ensemble, namely the mean-field. We derive the unique optimal decentralized policy in closed form. Optimality is characterized through a worst-client/minimax criterion: minimizing the under-performer regret, namely the maximal online cost incurred by the weakest agent in the ensemble. We further prove that the resulting decentralized policy asymptotically converges, in the large-population limit, to the Nash-optimal centralized policy, whose direct computation is not scalable. We use an online weighting mechanism to optimize the server-computed mixture of client predictions, thereby improving the mean prediction in addition to the previously optimized weakest-client prediction. Numerical experiments verify our theoretical guarantees and demonstrate that our decentralized policy typically outperforms natural greedy decentralized baselines.
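The abstract does not specify how the online weighting mechanism works. As one plausible illustration, the sketch below uses a standard multiplicative-weights (Hedge-style) update on squared loss to reweight a server-side mixture of client predictions. This is a swapped-in textbook technique, not the authors' mechanism, and the names `hedge_update`, `run_online`, and `eta` are hypothetical.

```python
import math


def mixture(weights, preds):
    """Server-side prediction: weighted average of the client predictions."""
    return sum(w * p for w, p in zip(weights, preds))


def hedge_update(weights, preds, y, eta=0.5):
    """Multiplicative-weights update (assumed rule, not from the paper).

    Each client's weight is shrunk by exp(-eta * squared_error), then the
    weights are renormalized so they sum to one.
    """
    scaled = [w * math.exp(-eta * (p - y) ** 2) for w, p in zip(weights, preds)]
    total = sum(scaled)
    return [s / total for s in scaled]


def run_online(stream, n_clients, eta=0.5):
    """Process (client_predictions, true_value) pairs one round at a time."""
    weights = [1.0 / n_clients] * n_clients
    mix_losses = []
    for preds, y in stream:
        # The server predicts before the true value is revealed.
        mix_losses.append((mixture(weights, preds) - y) ** 2)
        weights = hedge_update(weights, preds, y, eta)
    return weights, mix_losses
```

With a stream in which one client is consistently more accurate, the weights concentrate on that client and the mixture's loss falls over the rounds.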
Problem

Research questions and friction points this paper is trying to address.

federated learning
decentralized agents
mean-field
Nash equilibrium
scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

decentralized learning
mean-field approximation
minimax optimization
Nash equilibrium
federated learning