AI Summary
This study investigates whether anthropomorphic agents exhibit ingroup bias under conflicting information and how such bias affects factual judgment. To address this, the authors introduce the "Truth or Tribe" simulation framework, employing a three-agent interaction paradigm and a defeasible reasoning task integrated with identity-based agent modeling. Their findings reveal, for the first time, that even in high cognitive complexity scenarios, agents consistently favor erroneous judgments from identity-similar peers over objective facts. To mitigate this bias, the study proposes three debiasing interventions: identity-blind instructions, structured counterfactual reasoning, and heterogeneous perspective integration. Experimental results demonstrate that these strategies effectively attenuate ingroup preference and enhance adherence to factual accuracy.
Abstract
In-group favoritism is the phenomenon of favoring members of one's in-group over out-group members, and it is widely observed in social cooperative behaviors. Recently, in-group favoritism biases have also been identified in generative language models. However, whether in-group favoritism arises when persona agents face contradicting information (e.g., misinformation), and how to mitigate its adverse effects in persona agents, remain understudied. To address these problems, we propose a Truth or Tribe simulation framework that studies agent cooperation amid the spread of contradicting information through a triadic interaction paradigm, and we conduct controlled trials to evaluate the primary moderating factors. Extensive results show that persona agents display strong in-group favoritism, accepting incorrect answers from identity-similar peers at much higher rates than from dissimilar peers. In-group favoritism also emerges in defeasible reasoning contexts where no absolute truth exists, and it intensifies as cognitive complexity increases. Furthermore, we propose three intervention strategies to mitigate in-group favoritism: Identity-Blind Instruction, Structured Counterfactual Reasoning, and Heterogeneous Perspective Ensemble.