🤖 AI Summary
This work identifies a cultural positioning bias in large language models (LLMs): their default generative perspective treats mainstream U.S. culture as the "in-group" while framing non-dominant cultures as "out-groups," undermining generative fairness. To address this, the authors introduce a culturally situated interview script generation task and propose CultureLens, a benchmark for evaluating cultural positioning bias. They further design two inference-time debiasing methods: Fairness Intervention Pillars (FIP), a prompt-based baseline, and the Mitigation via Fairness Agents (MFA) framework, an agent-based architecture with either a single-agent self-reflection-and-rewriting pipeline (MFA-SA) or a multi-agent collaborative refinement pipeline (MFA-MA). Evaluated on 4,000 prompts using three quantitative metrics, experiments on five state-of-the-art LLMs reveal that models adopt an insider perspective in over 88% of U.S.-context prompts but predominantly take an outsider perspective toward less dominant cultures. MFA markedly reduces this bias, demonstrating the promise of agent-based architectures for improving cultural fairness.
📝 Abstract
Large language models (LLMs) have unlocked a wide range of downstream generative applications. However, we find that they also risk perpetuating subtle fairness issues tied to culture: they position their generations from the perspective of mainstream US culture while exhibiting salient externality toward non-mainstream ones. In this work, we identify and systematically investigate this novel cultural positioning bias, in which an LLM's default generative stance aligns with a mainstream view and treats other cultures as outsiders. We propose the CultureLens benchmark with 4,000 generation prompts and 3 evaluation metrics for quantifying this bias through the lens of a culturally situated interview script generation task, in which an LLM is positioned as an onsite reporter interviewing local people across 10 diverse cultures. Empirical evaluation on 5 state-of-the-art LLMs reveals a stark pattern: while models adopt insider tones in over 88 percent of US-context scripts on average, they predominantly adopt outsider stances for less dominant cultures. To mitigate these biases, we propose 2 inference-time methods: a baseline prompt-based Fairness Intervention Pillars (FIP) method, and a structured Mitigation via Fairness Agents (MFA) framework consisting of 2 pipelines: (1) MFA-SA (Single-Agent) introduces a self-reflection and rewriting loop based on fairness guidelines; (2) MFA-MA (Multi-Agent) structures the process into a hierarchy of specialized agents: a Planner Agent (initial script generation), a Critique Agent (evaluation of the draft against fairness pillars), and a Refinement Agent (incorporation of feedback to produce a polished, unbiased script). Empirical results showcase the effectiveness of agent-based methods as a promising direction for mitigating biases in generative LLMs.
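The MFA-MA hierarchy described above can be sketched as a simple three-stage pipeline. This is a minimal illustration, not the paper's implementation: the `generate` callable stands in for an arbitrary LLM backend, and all prompt wording (including the `toy_model` stub used to run it end to end) is a hypothetical placeholder.

```python
from typing import Callable

def mfa_ma(generate: Callable[[str], str], culture: str) -> str:
    """Run the Planner -> Critique -> Refinement agent pipeline once."""
    # Planner Agent: produce the initial interview script.
    draft = generate(
        f"Write an interview script as an onsite reporter in {culture}, "
        "interviewing local people."
    )
    # Critique Agent: evaluate the draft against fairness pillars,
    # e.g. flagging outsider framing or 'us vs. them' language.
    feedback = generate(
        "Critique this script against cultural-fairness guidelines, "
        f"flagging any outsider framing of {culture}:\n{draft}"
    )
    # Refinement Agent: rewrite the draft incorporating the critique.
    return generate(
        f"Revise the script below using this feedback:\n{feedback}\n---\n{draft}"
    )

# Toy stand-in model so the pipeline runs without an API.
def toy_model(prompt: str) -> str:
    return f"[response to: {prompt[:40]}...]"

script = mfa_ma(toy_model, "Japan")
```

Separating the three roles lets each agent receive a narrow, single-purpose prompt, which is the structural difference from the single-agent MFA-SA loop, where one model critiques and rewrites its own output.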