Cross-Language Bias Examination in Large Language Models

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically investigates cross-lingual disparities in bias exhibited by large language models (LLMs) across English, Chinese, Arabic, French, and Spanish. To address the lack of standardized multilingual bias evaluation, we propose the first integrated framework combining explicit bias measurement (via the BBQ benchmark) with implicit bias assessment (via prompt-driven Implicit Association Tests). Cross-lingual comparability is ensured through vocabulary-level translation alignment and quantitative contrastive analysis. Our findings reveal strong language specificity in LLM bias: Arabic and Spanish exhibit the highest bias magnitudes, whereas Chinese and English show the lowest. Notably, age-related bias displays a “low-explicit, high-implicit” pattern, with implicit bias substantially exceeding its explicit manifestations. These results demonstrate that relying on single-language evaluations or exclusively explicit metrics severely underestimates bias risk. The work underscores the necessity of multilingual, dual-path (explicit + implicit) bias assessment in LLMs and introduces a methodology for carrying it out.
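To make the implicit path concrete, here is a minimal Python sketch of a prompt-based IAT trial of the kind the summary describes. The word lists, prompt wording, scoring rule, and `mock_model` stub are all illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch of a prompt-based Implicit Association Test (IAT) for an LLM.
# Word lists and prompt wording are invented placeholders; replace mock_model
# with a real LLM API call to run an actual test.
import random

# Toy target/attribute word lists for the "age" dimension (placeholders).
TARGETS = {"young": ["young person", "teenager"], "old": ["old person", "senior"]}
PLEASANT = ["joy", "love", "peace"]
UNPLEASANT = ["agony", "failure", "hatred"]

def build_prompt(target: str, pleasant: str, unpleasant: str) -> str:
    """Force a binary association choice, mirroring an IAT pairing trial."""
    return (f"Consider the phrase '{target}'. Which word do you associate "
            f"with it more strongly?\nA) {pleasant}\nB) {unpleasant}\n"
            "Answer with A or B only.")

def mock_model(prompt: str) -> str:
    """Placeholder for an LLM call; returns a random choice for illustration."""
    return random.choice(["A", "B"])

def pleasant_rate(group: str, model=mock_model) -> float:
    """Fraction of trials in which the model pairs the group with a pleasant word."""
    hits = total = 0
    for target in TARGETS[group]:
        for p in PLEASANT:
            for u in UNPLEASANT:
                answer = model(build_prompt(target, p, u)).strip().upper()
                hits += answer.startswith("A")
                total += 1
    return hits / total

# Differential association score: 0 means no preference; a positive value
# means "young" is paired with pleasant words more often than "old".
score = pleasant_rate("young") - pleasant_rate("old")
print(f"implicit age-bias score: {score:+.2f}")
```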

📝 Abstract
This study introduces an innovative multilingual bias evaluation framework for assessing bias in Large Language Models, combining explicit bias assessment through the BBQ benchmark with implicit bias measurement using a prompt-based Implicit Association Test. By translating the prompts and word list into five target languages (English, Chinese, Arabic, French, and Spanish), we directly compare different types of bias across languages. The results reveal substantial cross-language gaps in the bias LLMs exhibit. For example, Arabic and Spanish consistently show higher levels of stereotype bias, while Chinese and English exhibit lower levels. We also identify contrasting patterns across bias types: age shows the lowest explicit bias but the highest implicit bias, emphasizing the importance of detecting implicit biases that standard benchmarks cannot capture. These findings indicate that LLM bias varies significantly across languages and bias dimensions. This study fills a key research gap by providing a comprehensive methodology for cross-lingual bias analysis. Ultimately, our work establishes a foundation for the development of equitable multilingual LLMs, ensuring fairness and effectiveness across diverse languages and cultures.
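For the explicit path, a BBQ-style bias score can be computed per language over the same translated items. The sketch below uses the scoring rule from the original BBQ paper for disambiguated contexts (2 * biased / non-unknown - 1); the `Answer` records are invented examples, not the study's data.

```python
# Hedged sketch of a BBQ-style explicit bias score computed per language.
# Scoring rule follows the BBQ paper's disambiguated-context form; the
# per-language answer records below are invented placeholders.
from dataclasses import dataclass

@dataclass
class Answer:
    picked_biased: bool    # model chose the stereotype-consistent option
    picked_unknown: bool   # model chose the "unknown / cannot tell" option

def bbq_bias_score(answers: list[Answer]) -> float:
    """Ranges from -1 (anti-stereotype) to +1 (stereotype-consistent);
    0 means the model's non-unknown answers split evenly."""
    non_unknown = [a for a in answers if not a.picked_unknown]
    if not non_unknown:
        return 0.0
    biased = sum(a.picked_biased for a in non_unknown)
    return 2 * biased / len(non_unknown) - 1

# Invented example: compare scores across languages on the same translated items.
per_language = {
    "en": [Answer(False, True), Answer(True, False), Answer(False, False)],
    "ar": [Answer(True, False), Answer(True, False), Answer(False, False)],
}
for lang, answers in per_language.items():
    print(lang, f"{bbq_bias_score(answers):+.2f}")
```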
Problem

Research questions and friction points this paper is trying to address.

Evaluates multilingual bias in LLMs using explicit and implicit tests
Compares bias across five languages, revealing significant disparities
Identifies contrasting patterns between explicit and implicit bias types (see the sketch after this list)
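A minimal sketch of the dual-path contrast described above. The scores are invented placeholders chosen only to mirror the reported qualitative pattern (age: low explicit, high implicit); the 0.2 gap threshold is likewise illustrative.

```python
# Contrast explicit vs. implicit scores per bias dimension.
# All numbers below are invented placeholders, not the study's results.
explicit = {"age": 0.05, "gender": 0.22, "religion": 0.18}
implicit = {"age": 0.41, "gender": 0.25, "religion": 0.20}

for dim in explicit:
    gap = implicit[dim] - explicit[dim]
    flag = "  <- low-explicit, high-implicit" if gap > 0.2 else ""
    print(f"{dim:>8}: explicit={explicit[dim]:.2f} "
          f"implicit={implicit[dim]:.2f} gap={gap:+.2f}{flag}")
```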
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual bias evaluation framework combining explicit and implicit assessment
Translation of prompts and word list into five languages for cross-lingual bias comparison (see the alignment sketch after this list)
Detection of implicit biases that standard explicit benchmarks miss
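A minimal sketch of the kind of vocabulary-level alignment check that cross-lingual comparability relies on, assuming a per-concept, per-language word-list layout; the entries below are placeholders, not the paper's actual stimuli.

```python
# Hedged sketch of a vocabulary-alignment check for translated word lists.
# Every concept must have a same-sized list in each target language, so that
# per-language bias scores are computed over comparable stimuli.
LANGS = ["en", "zh", "ar", "fr", "es"]

word_lists = {  # placeholder entries
    "pleasant": {
        "en": ["joy", "peace"],
        "zh": ["喜悦", "和平"],
        "ar": ["فرح", "سلام"],
        "fr": ["joie", "paix"],
        "es": ["alegría", "paz"],
    },
}

def check_alignment(lists: dict) -> None:
    for concept, per_lang in lists.items():
        missing = [lang for lang in LANGS if lang not in per_lang]
        if missing:
            raise ValueError(f"{concept}: missing languages {missing}")
        sizes = {lang: len(per_lang[lang]) for lang in LANGS}
        if len(set(sizes.values())) != 1:
            raise ValueError(f"{concept}: unequal list sizes {sizes}")

check_alignment(word_lists)
print("word lists aligned across", ", ".join(LANGS))
```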