🤖 AI Summary
Large language models frequently generate harmful stereotypes and unfair outputs because of social biases embedded in their training. This work proposes Multi-Persona Thinking (MPT), an inference-time framework that guides the model to reason dialectically from multiple opposing social identities alongside a neutral perspective, iteratively reflecting to surface and correct biases explicitly. The approach turns a known vulnerability of persona assignment into a debiasing strength, establishing an inference-time debiasing strategy grounded in multi-perspective dialectical reasoning. Experiments show that MPT significantly outperforms existing prompting methods across open- and closed-source large language models, reducing bias while preserving core reasoning capabilities.
📝 Abstract
Large Language Models (LLMs) exhibit significant social biases that can perpetuate harmful stereotypes and unfair outcomes. In this paper, we propose Multi-Persona Thinking (MPT), a novel inference-time framework that leverages dialectical reasoning from multiple perspectives to reduce bias. MPT guides models to adopt contrasting social identities (e.g., male and female) along with a neutral viewpoint, and then engages these personas iteratively to expose and correct biases. Through a dialectical reasoning process, the framework transforms the potential weakness of persona assignment into a strength for bias mitigation. We evaluate MPT on two widely used bias benchmarks across both open-source and closed-source models of varying scales. Our results demonstrate substantial improvements over existing prompting-based strategies: MPT achieves the lowest bias while maintaining core reasoning ability.
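The loop the abstract describes (persona-based critique followed by a neutral revision pass, repeated) can be sketched as a small prompting pipeline. This is a minimal illustration, not the authors' implementation: `call_model` is a hypothetical stand-in for any LLM API (stubbed here so the control flow runs end to end), and the persona list and prompt wording are assumptions for demonstration.

```python
# Hypothetical sketch of an MPT-style inference loop.
# Assumption: `call_model` stands in for a real LLM call; here it is
# a deterministic stub so the control flow can be exercised directly.

PERSONAS = ["a male perspective", "a female perspective", "a neutral perspective"]

def call_model(prompt: str) -> str:
    # Stub: a real implementation would query an LLM here.
    return f"[response to: {prompt[:40]}...]"

def mpt_answer(question: str, rounds: int = 2) -> str:
    """Multi-persona dialectical loop: each persona critiques the current
    draft for bias, then a neutral pass revises the draft accordingly."""
    draft = call_model(f"Answer the question: {question}")
    for _ in range(rounds):
        critiques = [
            call_model(
                f"Adopt {p}. Critique this answer for social bias:\n"
                f"Question: {question}\nAnswer: {draft}"
            )
            for p in PERSONAS
        ]
        draft = call_model(
            "From a neutral perspective, revise the answer to resolve "
            "the biases raised below.\n"
            + "\n".join(critiques)
            + f"\nOriginal answer: {draft}"
        )
    return draft
```

In a real system, each `call_model` invocation would be a separate chat request, and the final neutral revision would be returned as the debiased answer; the iteration count trades added inference cost for additional rounds of reflection.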