Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation

📅 2025-01-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Language models and machine translation systems often exhibit gender-stereotyping bias, compromising both fairness and factual accuracy. Method: the paper proposes a dual-path debiasing approach that jointly optimizes fairness and factual fidelity. Its core innovation is a dual-path disentanglement mechanism: one path eliminates spurious gender-stereotypical associations, while the other explicitly preserves genuine gender-relevant semantic cues. We further introduce 2DAMA, a model-adaptive two-stage fine-tuning framework that integrates adversarial learning with controllable attribute masking to enable interpretable gender-bias mitigation in translation. Contribution/Results: evaluated on multiple English bias benchmarks, our method reduces stereotypical bias by an average of 62%, with negligible BLEU degradation (<0.3), and maintains stable downstream task performance. To our knowledge, this is the first work to achieve interpretable, disentangled, and factually grounded gender debiasing in neural machine translation.

📝 Abstract
Mitigation of biases, such as language models' reliance on gender stereotypes, is a crucial endeavor required for the creation of reliable and useful language technology. A key aspect of debiasing is ensuring that models preserve their versatile capabilities, including their ability to solve language tasks and to represent various genders equitably. To address this issue, we introduce a streamlined Dual Debiasing Algorithm through Model Adaptation (2DAMA). Novel Dual Debiasing enables robust reduction of stereotypical bias while preserving desired factual gender information encoded by language models. We show that 2DAMA effectively reduces gender bias in English and is one of the first approaches facilitating the mitigation of stereotypical tendencies in translation. The proposed method's key advantage is the preservation of factual gender cues, which are useful in a wide range of natural language processing tasks.
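The core idea, removing stereotypical gender associations while keeping factual gender information, can be illustrated with a toy linear-projection sketch. This is not the paper's 2DAMA procedure; it is a minimal conceptual analogue in NumPy, assuming we are given two hypothetical directions in embedding space: `v_stereo` (a stereotype direction, e.g. occupation–gender association) and `v_fact` (a factual gender direction, e.g. "he" vs. "she").

```python
import numpy as np

def dual_debias(E, v_stereo, v_fact):
    """Toy dual-debiasing sketch (NOT the paper's 2DAMA algorithm).

    Removes the stereotype direction from embeddings E while leaving
    the factual-gender direction untouched.
    """
    # Normalize the factual direction.
    v_fact = v_fact / np.linalg.norm(v_fact)
    # Make the stereotype direction orthogonal to the factual one, so
    # projecting it out cannot erase factual gender information.
    v = v_stereo - (v_stereo @ v_fact) * v_fact
    v = v / np.linalg.norm(v)
    # Project out the (factual-orthogonal) stereotype component.
    return E - np.outer(E @ v, v)
```

In this sketch, the debiased embeddings have zero component along the stereotype direction (after orthogonalization) but an unchanged component along the factual direction, mirroring the "remove stereotypes, keep factual gender" objective at a conceptual level.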
Problem

Research questions and friction points this paper is trying to address.

Bias Mitigation
Gender Stereotypes
Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

2DAMA
Bias Reduction
Gender Information Preservation
Tomasz Limisiewicz
Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
David Mareček
Institute of Formal and Applied Linguistics, Charles University in Prague
Tomáš Musil
Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic