Mitigating Gender Bias via Fostering Exploratory Thinking in LLMs

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
To mitigate pervasive gender bias in large language models (LLMs), this paper proposes a debiasing framework centered on eliciting exploratory thinking. Methodologically, it constructs structurally symmetric, morally ambiguous story pairs featuring male and female protagonists to surface implicit biases through contrastive reasoning; introduces a self-correction mechanism that guides the model toward consistent, value-neutral moral judgments; and applies direct preference optimization (DPO) together with instruction tuning for alignment training. Its key contributions are (i) the first integration of exploratory-thinking modeling into a bias-mitigation pipeline, and (ii) fully automated generation of story pairs and preference data, eliminating reliance on human annotation. Experiments show substantial reductions in bias scores across multiple gender-bias benchmarks, while maintaining or improving performance on general reasoning, factual consistency, and language understanding tasks.

📝 Abstract
Large Language Models (LLMs) often exhibit gender bias, resulting in unequal treatment of male and female subjects across different contexts. To address this issue, we propose a novel data generation framework that fosters exploratory thinking in LLMs. Our approach prompts models to generate story pairs featuring male and female protagonists in structurally identical, morally ambiguous scenarios, then elicits and compares their moral judgments. When inconsistencies arise, the model is guided to produce balanced, gender-neutral judgments. These story-judgment pairs are used to fine-tune or optimize the models via Direct Preference Optimization (DPO). Experimental results show that our method significantly reduces gender bias while preserving or even enhancing general model capabilities. We will release the code and generated data.
Problem

Research questions and friction points this paper is trying to address.

Mitigating gender bias in LLMs
Fostering exploratory thinking for balanced judgments
Reducing bias while preserving model capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates story pairs with gender-balanced protagonists
Uses Direct Preference Optimization for fine-tuning
Reduces bias while maintaining model performance
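The pipeline described above — generate a gender-swapped story pair, compare the model's moral judgments, and turn inconsistencies into DPO preference triples — can be sketched as follows. This is a minimal illustration, not the authors' released code: the function and variable names are hypothetical, and the real framework prompts an LLM at each step where this sketch uses plain strings.

```python
def make_story_pair(scenario: str) -> tuple[str, str]:
    # Build two structurally identical stories that differ only
    # in the protagonist's gender (names here are placeholders).
    male = scenario.format(name="David", pron="he")
    female = scenario.format(name="Sarah", pron="she")
    return male, female

def judgments_consistent(judge_m: str, judge_f: str) -> bool:
    # In the real pipeline an LLM produces and compares the two moral
    # judgments; here we just compare the judgment strings directly.
    return judge_m == judge_f

def build_dpo_record(prompt: str, biased: str, neutral: str) -> dict:
    # DPO training expects (prompt, chosen, rejected) triples; the
    # balanced, gender-neutral judgment is the "chosen" response.
    return {"prompt": prompt, "chosen": neutral, "rejected": biased}

scenario = "{name} kept the extra change {pron} was given by mistake."
story_m, story_f = make_story_pair(scenario)

# Suppose the model judged the male protagonist leniently but the
# female one harshly -- an inconsistency that surfaces implicit bias.
judge_m, judge_f = "an understandable lapse", "a dishonest act"

record = None
if not judgments_consistent(judge_m, judge_f):
    # Elicit a balanced judgment and pair it against the biased one.
    record = build_dpo_record(
        prompt=story_f,
        biased=judge_f,
        neutral="a minor ethical lapse, judged the same for anyone",
    )
```

The resulting records can then be fed to a standard DPO trainer; the self-correction step is what supplies the "chosen" side of each preference pair without human annotation.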