🤖 AI Summary
This study investigates how Theory of Mind (ToM) and prosocial beliefs enhance large language models’ (LLMs’) alignment with human norms in the Ultimatum Game. To address this, we propose an interpretable and controllable negotiation behavior generation framework that couples hierarchical ToM modeling (first- and second-order) with explicit moral disposition prompts (greedy, fair, or altruistic). We conduct over 2,700 automated experiments across models including o3-mini and DeepSeek-R1 Distilled Qwen 32B, integrating chain-of-thought reasoning and game-theoretic analysis. Our results demonstrate, for the first time, that explicit ToM augmentation significantly improves fairness proposal rates (+38%), decision stability (variance reduced by 52%), and overall utility. Crucially, second-order ToM exerts a causal effect on responder behavior. Moreover, we move beyond pure behavioral imitation by establishing a novel methodology that jointly models ToM and prosocial beliefs—thereby advancing normative alignment in LLM-based strategic interaction.
📝 Abstract
Large Language Models (LLMs) have shown potential in simulating human behaviors and performing theory-of-mind (ToM) reasoning, a crucial skill for complex social interactions. In this study, we investigate the role of ToM reasoning in aligning agentic behaviors with human norms in negotiation tasks, using the ultimatum game as a controlled environment. We initialized LLM agents with different prosocial beliefs (Greedy, Fair, or Selfless) and different reasoning methods, such as chain-of-thought (CoT) and varying ToM levels, and examined their decision-making processes across diverse LLMs, including reasoning models like o3-mini and DeepSeek-R1 Distilled Qwen 32B. Results from 2,700 simulations indicate that ToM reasoning enhances behavior alignment, decision-making consistency, and negotiation outcomes. Consistent with previous findings, reasoning models exhibit limited capability compared to models equipped with explicit ToM reasoning, and the two roles in the game benefit from different orders of ToM reasoning. Our findings contribute to the understanding of ToM's role in enhancing human-AI interaction and cooperative decision-making. The code used for our experiments can be found at https://github.com/Stealth-py/UltimatumToM.
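To make the experimental setup concrete, the interaction the abstract describes can be sketched as a single Ultimatum Game round between two belief-conditioned agents. This is a minimal illustrative toy, not the authors' implementation (which is in the linked repository): the names `Agent`, `propose`, `respond`, and `run_round`, and the numeric share/acceptance thresholds standing in for LLM outputs, are all assumptions for illustration.

```python
# Minimal sketch of one Ultimatum Game round between two agents.
# In the actual study, propose()/respond() would be LLM calls conditioned
# on a prosocial belief and a ToM-level prompt; here they are toy rules.

from dataclasses import dataclass

@dataclass
class Agent:
    belief: str     # "Greedy", "Fair", or "Selfless" (dispositions from the paper)
    tom_order: int  # 0 = no ToM, 1 = first-order, 2 = second-order prompting

def propose(agent: Agent, pot: int) -> int:
    """Toy stand-in for an LLM proposer: offer a share of the pot."""
    share = {"Greedy": 0.1, "Fair": 0.5, "Selfless": 0.9}[agent.belief]
    return round(pot * share)

def respond(agent: Agent, offer: int, pot: int) -> bool:
    """Toy stand-in for an LLM responder: accept iff the offer clears a threshold."""
    threshold = {"Greedy": 0.4, "Fair": 0.3, "Selfless": 0.0}[agent.belief]
    return offer >= pot * threshold

def run_round(proposer: Agent, responder: Agent, pot: int = 100) -> tuple:
    offer = propose(proposer, pot)
    accepted = respond(responder, offer, pot)
    # Ultimatum Game payoff rule: rejection leaves both players with nothing.
    return (pot - offer, offer) if accepted else (0, 0)
```

For example, a Fair proposer facing a Fair responder splits the pot and the offer is accepted, while a Greedy proposer facing a Greedy responder is rejected and both earn zero, illustrating why belief initialization alone shapes negotiation outcomes before any ToM prompting is applied.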