Multiple-Debias: A Full-process Debiasing Method for Multilingual Pre-trained Language Models

📅 2026-04-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the pervasive biases related to gender, race, and religion in multilingual pre-trained language models by proposing the first end-to-end debiasing framework that integrates both preprocessing and postprocessing strategies. The approach combines multilingual counterfactual data augmentation, multilingual Self-Debias, and parameter-efficient fine-tuning. Furthermore, it extends the CrowS-Pairs benchmark for the first time to German, Spanish, Chinese, and Japanese. Experimental results demonstrate that the proposed method significantly reduces multiple forms of bias across all four languages, validating the effectiveness of cross-lingual debiasing information fusion. The framework consistently outperforms monolingual debiasing approaches, thereby achieving a systematic improvement in fairness for multilingual language models.
📝 Abstract
Multilingual Pre-trained Language Models (MPLMs) have become essential tools for natural language processing. However, they often exhibit biases related to sensitive attributes such as gender, race, and religion. In this paper, we introduce a comprehensive multilingual debiasing method named Multiple-Debias to address these issues across multiple languages. By incorporating multilingual counterfactual data augmentation and multilingual Self-Debias across both pre-processing and post-processing stages, alongside parameter-efficient fine-tuning, we significantly reduce biases in MPLMs across three sensitive attributes in four languages. We also extend CrowS-Pairs to German, Spanish, Chinese, and Japanese, validating our full-process multilingual debiasing method for gender, racial, and religious bias. Our experiments show that (i) multilingual debiasing methods surpass monolingual approaches in effectively mitigating biases, and (ii) integrating debiasing information from different languages notably improves the fairness of MPLMs.
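The abstract evaluates fairness with the CrowS-Pairs benchmark, which compares how a model scores a stereotypical sentence against its anti-stereotypical counterpart; a model preferring the stereotype in 50% of pairs is the unbiased ideal. A minimal sketch of that metric, assuming an injected `score` function as a stand-in for the masked-LM pseudo-log-likelihood the benchmark actually uses (the toy scorer below is illustrative only, not the paper's code):

```python
from typing import Callable, List, Tuple

def bias_score(pairs: List[Tuple[str, str]],
               score: Callable[[str], float]) -> float:
    """Percentage of pairs where the model scores the stereotype higher.

    50.0 is the unbiased ideal; values far above or below indicate bias.
    """
    prefer_stereo = sum(1 for stereo, anti in pairs if score(stereo) > score(anti))
    return 100.0 * prefer_stereo / len(pairs)

# Toy usage with a fake scorer (sentence length) purely for demonstration:
pairs = [("Women can't drive.", "Men can't drive."),
         ("He is a great engineer.", "She is a great engineer.")]
print(bias_score(pairs, score=len))  # → 50.0
```

In practice `score` would be a pseudo-log-likelihood computed by masking each token of the sentence in turn under the MPLM being evaluated.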
Problem

Research questions and friction points this paper is trying to address.

multilingual pre-trained language models
bias
gender
race
religion
Innovation

Methods, ideas, or system contributions that make the work stand out.

multilingual debiasing
counterfactual data augmentation
Self-Debias
parameter-efficient fine-tuning
full-process debiasing
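Counterfactual data augmentation, named in the Innovation list above, balances training data by pairing each sentence with a copy whose attribute words are swapped. A minimal gender-only sketch, assuming a hand-written word-pair list (the pairs, the case handling, and the function name are illustrative assumptions, not the authors' implementation; real CDA also handles ambiguous forms like "her" more carefully and covers many more pairs per language):

```python
# Illustrative gendered word pairs; "her" -> "his" is an over-simplification
# since objective "her" should map to "him".
GENDER_PAIRS = {
    "he": "she", "she": "he",
    "him": "her", "her": "his",
    "his": "her",
    "man": "woman", "woman": "man",
}

def counterfactual(sentence: str) -> str:
    """Swap each gendered word for its counterpart, preserving case."""
    out = []
    for token in sentence.split():
        # Strip simple trailing punctuation so "him." still matches "him".
        core = token.strip(".,!?")
        suffix = token[len(core):]
        swapped = GENDER_PAIRS.get(core.lower())
        if swapped is None:
            out.append(token)
        else:
            if core[0].isupper():
                swapped = swapped.capitalize()
            out.append(swapped + suffix)
    return " ".join(out)

# Training on both the original and the counterfactual copy balances
# the gendered contexts the model sees.
print(counterfactual("He said his sister admired him."))
# → She said her sister admired her.
```

Multilingual CDA, as used here, would repeat this with attribute-word lists for each target language.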
Haoyu Liang
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
Peijian Zeng
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
Wentao Huang
California Institute of Technology
Coding Theory · Cryptography · Security · Information Theory
Aimin Yang
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
Dong Zhou
Guangdong University of Foreign Studies
Artificial Intelligence · Natural Language Processing · Information Retrieval