🤖 AI Summary
Problem: Prior research typically examines investor learning or heterogeneous preferences in isolation, leaving their joint effect on collective market dynamics poorly understood. Method: We propose a multi-agent reinforcement learning framework in which agents with heterogeneous risk aversion, time discounting, and information access learn trading strategies in a unified shared-policy setting. Contribution/Results: The key contribution is a “preference–learning” co-modeling paradigm that exhibits two-stage emergence. First, learning under heterogeneous preferences leads agents to develop strategies aligned with their individual traits, producing behavioral differentiation and niche specialization. Second, interactions among these differentiated agents give rise to market-level stylized facts, including fat-tailed returns and volatility clustering, that are consistent with empirical financial statistics. The framework thus offers an interpretable, generative basis for understanding collective behavior in financial markets.
📝 Abstract
Agent-based models help explain stock price dynamics as emergent phenomena driven by interacting investors. In this modeling tradition, investor behavior has typically been captured by two distinct mechanisms, learning and heterogeneous preferences, which prior studies have explored as separate paradigms. The impact of modeling them jointly on the resulting collective dynamics remains largely unexplored. We develop a multi-agent reinforcement learning framework in which agents endowed with heterogeneous risk aversion, time discounting, and information access collectively learn trading strategies in a unified shared-policy setting. Our experiments reveal that (i) learning with heterogeneous preferences drives agents to develop strategies aligned with their individual traits, fostering behavioral differentiation and niche specialization within the market, and (ii) interactions among the differentiated agents are essential for the emergence of realistic market dynamics such as fat-tailed price fluctuations and volatility clustering. This study presents a constructive paradigm for financial market modeling in which the joint design of heterogeneous preferences and learning mechanisms enables two-stage emergence: first of differentiated individual behavior, then of collective market dynamics.
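The two stylized facts named above can be quantified with standard diagnostics. The sketch below is illustrative only and is not the authors' evaluation code: fat tails are measured by excess kurtosis of returns, and volatility clustering by the lag-1 autocorrelation of absolute returns. Because the paper's simulator is not available here, a textbook GARCH(1,1) process with arbitrarily chosen parameters stands in for simulated market returns; the function names `excess_kurtosis`, `autocorrelation`, and `simulate_garch_returns` are hypothetical.

```python
import numpy as np


def excess_kurtosis(returns):
    """Sample excess kurtosis; about 0 for Gaussian data, positive for fat tails."""
    r = np.asarray(returns, dtype=float)
    z = r - r.mean()
    var = np.mean(z ** 2)
    return np.mean(z ** 4) / var ** 2 - 3.0


def autocorrelation(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return np.dot(z[:-lag], z[lag:]) / np.dot(z, z)


def simulate_garch_returns(n, omega=0.05, alpha=0.1, beta=0.85, seed=0):
    """Stand-in return series from a GARCH(1,1) process.

    Assumption: this only substitutes for the output of a market simulator;
    the parameter values are arbitrary illustrative choices.
    """
    rng = np.random.default_rng(seed)
    r = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        r[t] = np.sqrt(var) * rng.standard_normal()
        var = omega + alpha * r[t] ** 2 + beta * var
    return r


if __name__ == "__main__":
    clustered = simulate_garch_returns(20_000)
    gaussian = np.random.default_rng(1).standard_normal(20_000)

    for name, series in [("GARCH stand-in", clustered), ("i.i.d. Gaussian", gaussian)]:
        print(name)
        print("  excess kurtosis:        ", excess_kurtosis(series))
        print("  lag-1 autocorr of r:    ", autocorrelation(series, 1))
        print("  lag-1 autocorr of |r|:  ", autocorrelation(np.abs(series), 1))
```

A series with realistic dynamics would be expected to show clearly positive excess kurtosis and a positive autocorrelation of absolute returns while raw returns stay close to uncorrelated; the i.i.d. Gaussian series serves as the baseline in which both effects are absent.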