Mitigating Cultural Bias in LLMs via Multi-Agent Cultural Debate

📅 2026-01-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the pervasive Western-centric cultural bias in large language models and the inadequacy of existing approaches for evaluating and mitigating cross-cultural bias. The authors propose MACD (Multi-Agent Cultural Debate), a training-free framework that assigns agents explicit cultural identities and orchestrates their deliberation via a "seeking common ground while reserving differences" negotiation strategy. They also introduce CEBiasBench, a Chinese–English bilingual evaluation benchmark, and Multi-Agent Vote (MAV), a voting mechanism that permits explicit "no bias" judgments. On CEBiasBench with GPT-4o as the backbone, MACD raises the no-bias rate from 47.6% to 57.6% under LLM-as-judge evaluation and from 69.0% to 86.0% under MAV evaluation. MACD also generalizes to the Arabic CAMeL benchmark, overcoming the limitation of conventional role-playing agent frameworks that lack explicit cultural representation.
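The summary describes MAV as a voting mechanism whose key feature is an explicit "no bias" option rather than forcing a cultural label. Below is a minimal sketch of how such a vote might look; the paper does not publish this code, so `call_llm`, the label set, and the prompt wording are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

# Illustrative label set; the benchmark's actual categories may differ.
CULTURE_LABELS = ["Western", "East Asian", "Middle Eastern", "No Bias"]

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion API; replace with a real client."""
    raise NotImplementedError

def judge_once(question: str, answer: str) -> str:
    """One judge agent labels the answer, with 'No Bias' as an explicit option."""
    prompt = (
        "You are a cultural-fairness judge.\n"
        f"Question: {question}\nAnswer: {answer}\n"
        "Which cultural perspective, if any, does the answer favor? "
        f"Reply with exactly one of: {', '.join(CULTURE_LABELS)}."
    )
    label = call_llm(prompt).strip()
    # Fall back to 'No Bias' on malformed output; a real system might re-query instead.
    return label if label in CULTURE_LABELS else "No Bias"

def multi_agent_vote(question: str, answer: str, n_judges: int = 5) -> str:
    """Aggregate independent judge labels by simple majority vote."""
    votes = Counter(judge_once(question, answer) for _ in range(n_judges))
    return votes.most_common(1)[0][0]
```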

Technology Category

Application Category

📝 Abstract
Large language models (LLMs) exhibit systematic Western-centric bias, yet whether prompting in non-Western languages (e.g., Chinese) can mitigate this remains understudied. Answering this question requires rigorous evaluation and effective mitigation, but existing approaches fall short on both fronts: evaluation methods force outputs into predefined cultural categories without a neutral option, while mitigation relies on expensive multi-cultural corpora or agent frameworks that use functional roles (e.g., Planner–Critique) lacking explicit cultural representation. To address these gaps, we introduce CEBiasBench, a Chinese–English bilingual benchmark, and Multi-Agent Vote (MAV), which enables explicit "no bias" judgments. Using this framework, we find that Chinese prompting merely shifts bias toward East Asian perspectives rather than eliminating it. To mitigate such persistent bias, we propose Multi-Agent Cultural Debate (MACD), a training-free framework that assigns agents distinct cultural personas and orchestrates deliberation via a "Seeking Common Ground while Reserving Differences" strategy. Experiments demonstrate that MACD achieves 57.6% average No Bias Rate evaluated by LLM-as-judge and 86.0% evaluated by MAV (vs. 47.6% and 69.0% baseline using GPT-4o as backbone) on CEBiasBench and generalizes to the Arabic CAMeL benchmark, confirming that explicit cultural representation in agent frameworks is essential for cross-cultural fairness.
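The abstract describes MACD as assigning distinct cultural personas to agents and orchestrating a debate that preserves shared points while flagging culture-specific ones. A minimal sketch of such a loop follows; it assumes the same hypothetical `call_llm` stub as above, and the persona prompts, round structure, and moderator instruction are illustrative rather than the paper's actual protocol.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical chat-completion call, as in the MAV sketch; replace with a real client."""
    raise NotImplementedError

def cultural_debate(question: str, personas: list[str], rounds: int = 2) -> str:
    """MACD-style loop: persona agents draft answers, read each other's views,
    then a moderator merges common ground while reserving differences."""
    # Each agent drafts an initial answer from its assigned cultural perspective.
    drafts = {
        p: call_llm(f"You are a respondent from a {p} cultural background. Answer: {question}")
        for p in personas
    }
    # Debate rounds: each agent revises after seeing the other perspectives.
    for _ in range(rounds):
        for p in personas:
            others = "\n".join(f"- {q}: {a}" for q, a in drafts.items() if q != p)
            drafts[p] = call_llm(
                f"You are a respondent from a {p} cultural background.\n"
                f"Question: {question}\nYour current answer: {drafts[p]}\n"
                f"Other perspectives:\n{others}\n"
                "Revise your answer: keep points shared across cultures, and mark "
                "culture-specific points explicitly instead of presenting them as universal."
            )
    # Moderator synthesis: state the common ground, neutrally note the differences.
    summary = "\n".join(f"[{p}] {a}" for p, a in drafts.items())
    return call_llm(
        "Merge the following culturally distinct answers into one response that states "
        "the common ground and neutrally notes the reserved differences:\n" + summary
    )
```

A usage call might look like `cultural_debate("What foods are served at a wedding?", ["Western", "East Asian", "Middle Eastern"])`; the training-free nature of the approach comes from everything happening at inference time through prompting.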
Problem

Research questions and friction points this paper is trying to address.

Cultural Bias
Large Language Models
Cross-Cultural Fairness
Bias Mitigation
Multi-Agent Systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent Cultural Debate
Cultural Bias Mitigation
CEBiasBench
LLM-as-judge
Cultural Personas
Qian Tan
University of Science and Technology of China
Lei Jiang
USTC
MLLM Safety, MLLM Efficiency
Yuting Zeng
University of Science and Technology of China
Shuoyang Ding
NVIDIA
natural language processing, machine translation
Xiaohua Xu
University of Science and Technology of China