🤖 AI Summary
Modeling dynamic strategic interactions between governments and large populations of heterogeneous agents remains a core challenge in macroeconomics: conventional approaches fail to simultaneously capture dynamic feedback, leader–follower asymmetry, and scalability. To address this, we propose the Dynamic Stackelberg Mean-Field Game (DSMFG) framework, the first to formalize macroeconomic policy design as a time-evolving, closed-loop Stackelberg game. We further develop the Stackelberg Mean-Field Reinforcement Learning (SMFRL) algorithm, which jointly optimizes the government's policy and agent-specific behavioral responses in a data-driven manner. Empirical evaluation on a synthetic economy of 1,000 agents demonstrates that our approach increases GDP by up to 4× over classical economic methods and 19× over the 2022 U.S. federal income tax policy, while scaling to 10× more agents than prior work.
📝 Abstract
Macroeconomic outcomes emerge from individuals' decisions, making it essential to model how agents interact with macro policy via consumption, investment, and labor choices. We formulate this as a dynamic Stackelberg game: the government (leader) sets policies, and agents (followers) respond by optimizing their behavior over time. Unlike static models, this dynamic formulation captures the temporal dependencies and strategic feedback critical to policy design. However, as the number of agents increases, explicitly simulating all agent–agent and agent–government interactions becomes computationally infeasible. To address this, we propose the Dynamic Stackelberg Mean-Field Game (DSMFG) framework, which approximates these complex interactions via agent–population and government–population couplings. This approximation preserves individual-level feedback while ensuring scalability, enabling DSMFG to jointly model three core features of real-world policymaking: dynamic feedback, leader–follower asymmetry, and large scale. We further introduce Stackelberg Mean-Field Reinforcement Learning (SMFRL), a data-driven algorithm that learns the leader's optimal policies while maintaining personalized responses for individual agents. Empirically, we validate our approach in a large-scale simulated economy, where it scales to 1,000 agents (vs. 100 in prior work) and achieves a fourfold increase in GDP over classical economic methods and a nineteenfold improvement over the static 2022 U.S. federal income tax policy.
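To make the bilevel structure concrete, the following is a minimal toy sketch of a Stackelberg mean-field setup: followers best-respond to the leader's policy through a population-average (mean-field) coupling rather than pairwise interactions, and the leader optimizes over the induced equilibrium. All dynamics, utilities, and parameters here are illustrative assumptions, not the paper's SMFRL algorithm.

```python
# Toy Stackelberg mean-field game (illustrative assumptions throughout):
# the leader sets a tax rate; identical followers choose labor supply,
# coupled to the population only through its mean labor.

def follower_best_response(tax_rate, mean_labor):
    # A follower trades off after-tax income against quadratic effort cost;
    # the mean-labor congestion term stands in for all agent-agent
    # interactions (the mean-field coupling). Grid search over [0, 1].
    best_l, best_u = 0.0, float("-inf")
    for i in range(101):
        l = i / 100
        utility = (1 - tax_rate) * l - 0.5 * l**2 - 0.1 * l * mean_labor
        if utility > best_u:
            best_l, best_u = l, utility
    return best_l

def mean_field_equilibrium(tax_rate, iters=50):
    # Fixed-point iteration: followers respond to the current mean field,
    # which is then updated to the average (here, common) response.
    m = 0.5
    for _ in range(iters):
        m = follower_best_response(tax_rate, m)
    return m

def leader_objective(tax_rate):
    # The leader values revenue raised at the followers' equilibrium.
    return tax_rate * mean_field_equilibrium(tax_rate)

def solve_stackelberg():
    # Leader moves first: choose the tax rate whose induced follower
    # equilibrium maximizes the leader's objective (bilevel structure).
    best_t, best_v = 0.0, float("-inf")
    for i in range(101):
        t = i / 100
        v = leader_objective(t)
        if v > best_v:
            best_t, best_v = t, v
    return best_t, best_v
```

In SMFRL these grid searches and the fixed-point computation are replaced by learned policies trained jointly from data, which is what allows agent-specific responses and scaling to 1,000 heterogeneous agents.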