🤖 AI Summary
Indirect reciprocity and reputation-based cooperation critically depend on third-party moral judgment, yet the impact of large language models (LLMs) acting as societal “judges” on long-term human cooperative evolution remains unexplored.
Method: The authors develop an integrated framework combining 21 state-of-the-art LLMs with evolutionary game theory, standardized scenario-based evaluation, and goal-directed prompt engineering to assess LLM judgments across diverse cooperative contexts.
Contribution/Results: While LLMs exhibit high inter-model agreement in evaluating general cooperation, they show substantial disagreement—particularly regarding cooperation with low-reputation agents—significantly altering simulated population-level cooperation trajectories. Crucially, the study introduces controllable prompting to modulate LLMs’ implicit social norm preferences, empirically demonstrating targeted amplification or suppression of cooperative behavior. These findings establish LLMs not merely as analytical tools but as emergent institutional actors capable of shaping the evolution of human cooperation.
📝 Abstract
Humans increasingly rely on large language models (LLMs) to support decisions in social settings. Previous work suggests that such tools shape people's moral and political judgements. However, the long-term implications of LLM-based social decision-making remain unknown. How will human cooperation be affected when the assessment of social interactions relies on language models? This is a pressing question, as human cooperation is often driven by indirect reciprocity, reputations, and the capacity to judge interactions of others. Here, we assess how state-of-the-art LLMs judge cooperative actions. We provide 21 different LLMs with an extensive set of examples where individuals cooperate -- or refuse to cooperate -- in a range of social contexts, and ask how these interactions should be judged. Furthermore, through an evolutionary game-theoretical model, we evaluate cooperation dynamics in populations where the extracted LLM-driven judgements prevail, assessing the long-term impact of LLMs on human prosociality. We observe remarkable agreement in evaluating cooperation with good opponents. On the other hand, we notice within- and between-model variance when judging cooperation with ill-reputed individuals. We show that the differences revealed between models can significantly impact the prevalence of cooperation. Finally, we test prompts to steer LLM norms, showing that such interventions can shape LLM judgements, particularly through goal-oriented prompts. Our research connects LLM-based advice to long-term social dynamics, and highlights the need to carefully align LLM norms in order to preserve human cooperation.
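To make the indirect-reciprocity setting concrete: in such models, a social norm is a rule that assigns a donor a new reputation based on their action and the recipient's standing, and the norm in force determines how much cooperation a population can sustain. The sketch below is a minimal, hypothetical illustration of that mechanism (not the paper's actual model or its LLM-derived norms): a donation game among "discriminator" players who help only well-reputed partners, compared under two classic norms, Image Scoring and Stern Judging, with noisy public assessment. All parameter values and function names are illustrative assumptions.

```python
import random

# Minimal sketch (assumed parameters, not the paper's model): a donation game
# with reputation-based "discriminator" players under two illustrative norms.
# A norm maps (donor action, recipient reputation) -> donor's new reputation.
NORMS = {
    # Image Scoring: cooperating is always judged good, defecting always bad.
    "image_scoring": lambda action, rec_rep: action == "C",
    # Stern Judging: cooperating with the good or defecting on the bad is good;
    # anything else is judged bad.
    "stern_judging": lambda action, rec_rep: (action == "C") == rec_rep,
}

def simulate(norm_name, n=100, rounds=20000, assess_error=0.01, seed=0):
    """Return the fraction of cooperative acts in a population of discriminators."""
    rng = random.Random(seed)
    norm = NORMS[norm_name]
    rep = [True] * n                              # True = good reputation
    coop = 0
    for _ in range(rounds):
        donor, recipient = rng.sample(range(n), 2)
        action = "C" if rep[recipient] else "D"   # discriminators help the good
        coop += action == "C"
        new_rep = norm(action, rep[recipient])
        if rng.random() < assess_error:           # noisy public assessment
            new_rep = not new_rep
        rep[donor] = new_rep
    return coop / rounds

for name in NORMS:
    print(name, round(simulate(name), 3))
```

Under Stern Judging, discriminators are (up to assessment noise) always judged good, so cooperation stays near-universal; under Image Scoring, assessment errors propagate and cooperation can erode. This is the kind of norm-dependent divergence in long-run cooperation that the study probes when different LLM judgement rules are plugged into the evolutionary dynamics.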