Your AI Bosses Are Still Prejudiced: The Emergence of Stereotypes in LLM-Based Multi-Agent Systems

📅 2025-08-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether social stereotypes can spontaneously emerge in large language model (LLM)-driven multi-agent systems—without initial bias or reliance on biased training data. We construct a hierarchical workplace simulation environment and conduct multi-round interactive experiments, complemented by quantitative analysis. Results show that AI agents develop canonical social biases—including the halo effect, confirmation bias, and role-congruency bias—even under neutral initialization. Our key contributions are twofold: first, we provide the first empirical evidence that stereotypes arise as an *emergent property* of multi-agent interaction, consistently across diverse LLM architectures; second, we demonstrate that centralized decision-making authority and organizational hierarchy significantly amplify bias emergence. These findings indicate that sociocognitive biases may be intrinsically embedded in the coordination mechanisms of multi-agent systems, revealing a novel, architecture-level source of AI bias distinct from data- or model-level origins.

📝 Abstract
While stereotypes are well documented in human social interactions, AI systems are often presumed to be less susceptible to such biases. Previous studies have focused on biases inherited from training data, but whether stereotypes can emerge spontaneously in AI agent interactions merits further exploration. Through a novel experimental framework simulating workplace interactions with neutral initial conditions, we investigate the emergence and evolution of stereotypes in LLM-based multi-agent systems. Our findings reveal that (1) LLM-based AI agents develop stereotype-driven biases in their interactions despite beginning without predefined biases; (2) stereotype effects intensify with increased interaction rounds and decision-making power, particularly after hierarchical structures are introduced; (3) these systems exhibit group effects analogous to human social behavior, including halo effects, confirmation bias, and role-congruity bias; and (4) these stereotype patterns manifest consistently across different LLM architectures. Comprehensive quantitative analysis suggests that stereotype formation in AI systems may arise as an emergent property of multi-agent interaction rather than merely from training-data biases. Our work underscores the need for future research to explore the underlying mechanisms of this phenomenon and to develop strategies for mitigating its ethical impacts.
Problem

Research questions and friction points this paper is trying to address.

Investigating stereotype emergence in LLM multi-agent systems
Exploring bias evolution without predefined training data biases
Analyzing stereotype amplification through hierarchical interaction structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulated workplace interactions with neutral conditions
Observed stereotype emergence in multi-agent systems
Quantitative analysis across different LLM architectures
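To make the setup concrete, here is a minimal illustrative sketch of the kind of experiment the paper describes: neutrally initialized agents, multi-round interaction, and a hierarchical "manager" agent that accumulates evaluations of "worker" agents. This is not the authors' code; all names (`Agent`, `llm_respond`, `simulate`) are hypothetical, and a random stub stands in for the actual LLM calls.

```python
import random

class Agent:
    def __init__(self, name, role):
        self.name = name
        self.role = role          # e.g. "manager" or "worker"
        self.evaluations = {}     # running impressions of other agents

def llm_respond(agent, task):
    """Stub for an LLM call; returns a task-quality score in [0, 1].
    In the real study this would be a model-generated response/rating."""
    return random.random()

def run_round(manager, workers, task):
    for w in workers:
        quality = llm_respond(w, task)
        # The manager's impression of each worker accumulates across rounds;
        # in the paper, feeding this history back into later prompts is what
        # lets biases (e.g. confirmation bias) compound over interactions.
        manager.evaluations.setdefault(w.name, []).append(quality)

def simulate(n_rounds=5, n_workers=3, seed=0):
    random.seed(seed)
    manager = Agent("M", "manager")
    workers = [Agent(f"W{i}", "worker") for i in range(n_workers)]
    for r in range(n_rounds):
        run_round(manager, workers, task=f"task-{r}")
    return manager

if __name__ == "__main__":
    manager = simulate()
    for name, scores in manager.evaluations.items():
        print(name, len(scores))
```

The key design point the paper studies is the feedback loop: because the manager's prior evaluations are carried into subsequent rounds, initially neutral impressions can drift and harden, which is where the emergent stereotype effects are measured.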