Truth or Tribe: How In-group Favoritism Prioritize Facts in Persona Agents

📅 2026-05-02
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This study investigates whether persona agents exhibit in-group bias when given conflicting information, and how that bias affects factual judgment. The authors introduce the "Truth or Tribe" simulation framework, which combines a three-agent interaction paradigm with identity-based agent modeling and extends to a defeasible reasoning task. Their findings show that agents consistently favor erroneous judgments from identity-similar peers over objective facts, and that the bias intensifies as cognitive complexity increases. To mitigate it, the study proposes three debiasing interventions: identity-blind instructions, structured counterfactual reasoning, and heterogeneous perspective integration. Experimental results demonstrate that these strategies attenuate in-group preference and improve adherence to factual accuracy.
📝 Abstract
In-group favoritism refers to the phenomenon of favoring members of one's in-group over out-group members and is widely observed in social cooperative behaviors. Recently, in-group favoritism biases have also been identified in generative language models. However, whether in-group favoritism arises when persona agents face contradicting information (e.g., misinformation), and how to mitigate its adverse effects in persona agents, remain understudied. To address these problems, we propose the Truth or Tribe simulation framework, which studies agent cooperation amid the spread of contradicting information through a triadic interaction paradigm, and we conduct controlled trials to evaluate the primary moderating factors. Extensive results show that persona agents display strong in-group favoritism, accepting incorrect answers from identity-similar peers at much higher rates than from dissimilar peers. In-group favoritism also emerges in defeasible reasoning contexts where no absolute truth exists, and it intensifies as cognitive complexity increases. Furthermore, three intervention strategies--Identity-Blind Instruction, Structured Counterfactual Reasoning, and Heterogeneous Perspective Ensemble--are proposed to mitigate in-group favoritism.
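The comparison the abstract describes, accepting incorrect answers from identity-similar peers at higher rates than from dissimilar peers, can be illustrated with a small sketch. This is not the authors' code or metric: the `Trial` record, the `favoritism_gap` function, and the toy trial data are hypothetical names and values chosen only for illustration. It assumes each trial records whether the peer shared the persona's identity and whether the persona adopted the peer's incorrect answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One hypothetical trial outcome (illustrative, not the paper's schema)."""
    peer_is_ingroup: bool  # True if the peer shares the persona's identity
    accepted_wrong: bool   # True if the persona adopted the peer's incorrect answer


def acceptance_rate(trials, ingroup):
    """Fraction of trials in one condition where the wrong answer was accepted."""
    subset = [t for t in trials if t.peer_is_ingroup == ingroup]
    if not subset:
        raise ValueError("no trials for this condition")
    return sum(t.accepted_wrong for t in subset) / len(subset)


def favoritism_gap(trials):
    """In-group acceptance rate minus out-group acceptance rate.

    A positive value means incorrect answers were accepted more often
    from identity-similar peers (an assumed, simplified measure).
    """
    return acceptance_rate(trials, True) - acceptance_rate(trials, False)


# Made-up toy data: 4 in-group trials (3 accepted), 4 out-group trials (1 accepted).
trials = [
    Trial(True, True), Trial(True, True), Trial(True, False), Trial(True, True),
    Trial(False, True), Trial(False, False), Trial(False, False), Trial(False, False),
]
gap = favoritism_gap(trials)
print(f"in-group favoritism gap: {gap:.2f}")
```

A gap near zero would indicate that peer identity does not change how often incorrect answers are accepted; the same measure could be recomputed under each mitigation strategy to compare them, assuming trials are logged in this form.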
Problem

Research questions and friction points this paper is trying to address.

in-group favoritism
persona agents
misinformation
bias mitigation
contradicting information
Innovation

Methods, ideas, or system contributions that make the work stand out.

in-group favoritism
persona agents
Truth or Tribe framework
counterfactual reasoning
bias mitigation