🤖 AI Summary
This study investigates the efficacy of large language model (LLM)-driven generative agent-based modeling (GABM) in simulating social norm enforcement, particularly third-party punishment (TPP). We propose a two-stage validation framework: first, replicating empirically established human behavior in social dilemmas (e.g., public goods games); second, generating novel, testable predictions. Our methodology integrates persona-based agent design, theory-of-mind mechanisms, comparative analysis of cognitive architectures, and gossip-based diffusion of reputational information. Key findings indicate that individual heterogeneity and theory-of-mind capacity are both necessary for agents to reproduce TPP; that TPP rates drop significantly under anonymity yet substantial punishment persists, implicating both reputation-based incentives and intrinsic moral motivation; and that open deliberation before game rounds further increases public goods contributions and fosters the emergence of cooperative norms. These results establish a verifiable paradigm for LLM-augmented computational social modeling and yield theoretical insight into norm dynamics and collective behavior.
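To make the simulation pipeline concrete, the sketch below shows how one round of a persona-based public goods game with theory-of-mind-driven third-party punishment could be wired together. It is a minimal illustration under assumed names: `PersonaAgent`, `theory_of_mind`, and the heuristic contribution/punishment rules are hypothetical stand-ins for the paper's actual LLM-prompted decisions.

```python
import random
from dataclasses import dataclass, field

# Hypothetical GABM round: public goods game + third-party punishment.
# LLM calls are stubbed with simple heuristics; a real run would prompt
# a language model with the agent's persona, memory, and a theory-of-mind
# summary of the other players.

ENDOWMENT = 10       # tokens each agent receives per round
MULTIPLIER = 1.6     # public-good multiplier (1 < m < n, so free-riding pays)
PUNISH_COST = 1      # cost paid by the punisher per punishment act
PUNISH_IMPACT = 3    # deduction imposed on the punished agent

@dataclass
class PersonaAgent:
    name: str
    persona: str                          # individual-differences prompt
    memory: list = field(default_factory=list)

    def theory_of_mind(self, observations):
        # Infer others' dispositions from their observed contributions.
        return {o["name"]: "free-rider" if o["contribution"] < ENDOWMENT / 2
                else "cooperator" for o in observations}

    def decide_contribution(self):
        # Stand-in for an LLM call conditioned on persona + memory.
        base = 7 if "prosocial" in self.persona else 3
        return max(0, min(ENDOWMENT, base + random.randint(-2, 2)))

    def decide_punishment(self, observations):
        # Third-party punishment: prosocial personas pay a cost to
        # sanction agents they believe to be free-riders.
        beliefs = self.theory_of_mind(observations)
        return [n for n, b in beliefs.items()
                if b == "free-rider" and "prosocial" in self.persona]

def play_round(agents):
    contributions = {a.name: a.decide_contribution() for a in agents}
    share = sum(contributions.values()) * MULTIPLIER / len(agents)
    payoffs = {a.name: ENDOWMENT - contributions[a.name] + share for a in agents}
    observations = [{"name": n, "contribution": c} for n, c in contributions.items()]
    for a in agents:
        others = [o for o in observations if o["name"] != a.name]
        for target in a.decide_punishment(others):
            payoffs[a.name] -= PUNISH_COST
            payoffs[target] -= PUNISH_IMPACT
        a.memory.append(observations)
    return contributions, payoffs

if __name__ == "__main__":
    group = [PersonaAgent("A", "prosocial, values fairness"),
             PersonaAgent("B", "self-interested, strategic"),
             PersonaAgent("C", "prosocial, reputation-conscious")]
    print(play_round(group))
```

Swapping these heuristics for prompted LLM calls, and toggling which components (personas, theory of mind, strategic reasoning) are enabled, is the kind of cognitive-architecture comparison the study describes.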
📝 Abstract
As large language models (LLMs) advance, there is growing interest in using them to simulate human social behavior through generative agent-based modeling (GABM). However, validating these models remains a key challenge. We present a systematic two-stage validation approach using social dilemma paradigms from the psychological literature: first identifying the cognitive components LLM agents need to reproduce known human behaviors in mixed-motive settings from two landmark papers, then using the validated architecture to simulate novel conditions. Our comparison of cognitive architectures shows that both persona-based individual differences and theory-of-mind capabilities are essential for replicating third-party punishment (TPP) as a costly signal of trustworthiness. For the second study, on public goods games, this architecture replicates the increase in cooperation driven by the spread of reputational information through gossip; however, an additional strategic component is needed to reproduce the further boost in cooperation rates in the condition that allows both ostracism and gossip. We then test novel predictions for each paper with our validated generative agents. We find that TPP rates drop significantly when punishment is anonymous, yet a substantial amount of TPP persists, suggesting that both reputational and intrinsic moral motivations drive this behavior. For the second paper, we introduce a novel intervention and find that open discussion periods before rounds of the public goods game further increase contributions, allowing groups to develop social norms for cooperation. This work provides a framework for validating generative agent models while demonstrating their potential to generate novel, testable insights into human social behavior.
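For readers unfamiliar with the paradigm, the mixed-motive tension in a public goods game can be stated in one standard formula (a conventional textbook formulation, assumed here rather than taken from the paper). Each of $n$ players holds an endowment $e$ and contributes $c_i$; contributions are multiplied by $m$ and shared equally:

$$
\pi_i = e - c_i + \frac{m}{n}\sum_{j=1}^{n} c_j, \qquad 1 < m < n .
$$

Because $m/n < 1$, each token contributed returns less than a token to the contributor, so free-riding is individually optimal even though full contribution maximizes group payoff. Costly punishment is typically modeled by extending the payoff to $\pi_i - k\,p_i^{\mathrm{out}} - s\,p_i^{\mathrm{in}}$, where agent $i$ assigns $p_i^{\mathrm{out}}$ punishment points at cost $k$ each and loses $s$ per point received.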