Stereotype or Personalization? User Identity Biases Chatbot Recommendations

📅 2024-10-08
🏛️ arXiv.org
📈 Citations: 13
Influential: 1
🤖 AI Summary
This study exposes implicit identity bias and a transparency deficit in chatbot-based recommendation systems. When generating personalized suggestions for U.S. users across four racial groups, mainstream consumer large language models (LLMs), namely GPT-4, Claude, and Gemini, produce racially stereotyped recommendations (p < 0.001) whether the user's race is declared explicitly or inferred implicitly from prompt cues, yet none of the models disclose this influence in their outputs. Using multi-turn prompting and controlled experiments, the work provides systematic empirical evidence of identity dependence in LLM-driven recommendations. Key contributions: (1) demonstrating statistically significant and pervasive effects of identity features on recommendation outcomes; (2) identifying a critical gap in bias explainability in current systems; and (3) proposing the design principle that identity influence must be explicitly annotated, advancing both theoretical foundations and practical pathways toward fair, transparent AI recommendation systems.
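
The explicit/implicit probe design can be made concrete with a short sketch. Everything below (the prompt templates, cue phrases, group and item lists, and the build_conversations helper) is a hypothetical illustration in the spirit of the described method, not the paper's released materials:

```python
# Hypothetical sketch of the explicit- vs. implicit-cue probe design.
# Templates, cue phrases, and lists are illustrative stand-ins, not the
# paper's actual experimental materials.

EXPLICIT_TEMPLATE = "I'm a {group} American. Can you recommend some {item}?"
IMPLICIT_TEMPLATE = "{signal} Anyway, can you recommend some {item}?"

GROUPS = ["Asian", "Black", "Hispanic", "White"]  # four U.S. racial groups
ITEMS = ["music", "movies", "baby names"]         # assumed recommendation domains

def build_conversations(implicit_signals: dict[str, str]) -> list[dict]:
    """For each group and item, build one prompt that declares identity
    explicitly and one that only leaks it through an implicit cue,
    holding the actual request constant across conditions."""
    conversations = []
    for group in GROUPS:
        for item in ITEMS:
            conversations.append({
                "group": group, "item": item, "cue": "explicit",
                "prompt": EXPLICIT_TEMPLATE.format(group=group, item=item),
            })
            conversations.append({
                "group": group, "item": item, "cue": "implicit",
                "prompt": IMPLICIT_TEMPLATE.format(
                    signal=implicit_signals[group], item=item),
            })
    return conversations
```

Holding the request fixed while varying only the identity cue is what lets any divergence in recommendations be attributed to identity rather than to the query itself.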

📝 Abstract
We demonstrate that when people use large language models (LLMs) to generate recommendations, the LLMs produce responses that reflect both what the user wants and who the user is. While personalized recommendations are often desired by users, it can be difficult in practice to distinguish cases of bias from cases of personalization: we find that models generate racially stereotypical recommendations regardless of whether the user revealed their identity intentionally through explicit indications or unintentionally through implicit cues. We argue that chatbots ought to transparently indicate when recommendations are influenced by a user's revealed identity characteristics, but observe that they currently fail to do so. Our experiments show that even though a user's revealed identity significantly influences model recommendations (p < 0.001), model responses obfuscate this fact in response to user queries. This bias and lack of transparency occur consistently across multiple popular consumer LLMs and for four American racial groups.
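
The reported p < 0.001 suggests a test of independence between revealed identity and recommended items. The paper's exact statistic is not reproduced here, so the chi-squared test below, with made-up placeholder counts, is only an assumed analysis shown to make the claim concrete:

```python
# Hedged sketch: test whether recommended items are independent of the
# user's revealed identity group. The counts are fabricated placeholders
# purely for illustration; the paper's data and statistic may differ.
import numpy as np
from scipy.stats import chi2_contingency

# rows = identity groups, cols = distinct recommended items,
# cell = how often that item was recommended to that group
counts = np.array([
    [40,  5,  3,  2],
    [ 4, 38,  6,  2],
    [ 3,  6, 37,  4],
    [ 2,  3,  5, 40],
])

chi2, p_value, dof, _ = chi2_contingency(counts)
print(f"chi2={chi2:.1f}, dof={dof}, p={p_value:.2e}")
# If recommendations were identity-independent, the rows would look
# alike; a tiny p-value indicates identity-conditioned outputs instead.
```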
Problem

Research questions and friction points this paper is trying to address.

LLMs produce racially stereotypical recommendations based on user identity
Chatbots fail to disclose when recommendations are influenced by user identity
User identity biases persist across multiple popular LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Detects racial bias in chatbot recommendations
Analyzes explicit and implicit user identity cues
Proposes transparency for identity-influenced recommendations (see the sketch after this list)
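
One hypothetical way to realize the proposed "identity influence must be explicitly annotated" principle is a disclosure attached to each response. The AnnotatedRecommendation class below is invented for illustration and is not an interface from the paper:

```python
# Hypothetical rendering of the proposed disclosure principle:
# attach a note whenever revealed identity features shaped the output.
from dataclasses import dataclass, field

@dataclass
class AnnotatedRecommendation:
    items: list[str]
    identity_features_used: list[str] = field(default_factory=list)

    def render(self) -> str:
        text = "Recommended: " + ", ".join(self.items)
        if self.identity_features_used:
            text += ("\nNote: these suggestions were influenced by "
                     "identity cues you revealed: "
                     + ", ".join(self.identity_features_used))
        return text

# Usage: an identity-influenced response carries its own disclosure.
print(AnnotatedRecommendation(["Artist A", "Artist B"], ["stated race"]).render())
```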