🤖 AI Summary
Large language models frequently generate harmful stereotypes and unfair outputs because of social biases embedded in their training. This work proposes Multi-Persona Thinking (MPT), an inference-time framework that guides the model to reason dialectically from multiple opposing social identities alongside a neutral perspective, iteratively reflecting to surface and correct biases explicitly. The approach turns a known vulnerability of persona assignment into a debiasing strength, establishing an inference-time debiasing strategy grounded in multi-perspective dialectical reasoning. Experiments show that MPT significantly outperforms existing prompting methods across open- and closed-source large language models, reducing bias while preserving core reasoning capabilities.
📝 Abstract
Large Language Models (LLMs) exhibit significant social biases that can perpetuate harmful stereotypes and unfair outcomes. In this paper, we propose Multi-Persona Thinking (MPT), a novel inference-time framework that leverages dialectical reasoning from multiple perspectives to reduce bias. MPT guides models to adopt contrasting social identities (e.g., male and female) along with a neutral viewpoint, and then engages these personas iteratively to expose and correct biases. Through a dialectical reasoning process, the framework transforms the potential weakness of persona assignment into a strength for bias mitigation. We evaluate MPT on two widely used bias benchmarks across both open-source and closed-source models of varying scales. Our results demonstrate substantial improvements over existing prompting-based strategies: MPT achieves the lowest bias while maintaining core reasoning ability.
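The loop the abstract describes (persona-based critique followed by a neutral revision pass, repeated) can be sketched as a small prompting pipeline. This is a minimal illustration, not the authors' implementation: `call_model` is a hypothetical stand-in for any LLM API (stubbed here so the control flow runs end to end), and the persona list and prompt wording are assumptions for demonstration.

```python
# Hypothetical sketch of an MPT-style inference loop.
# Assumption: `call_model` stands in for a real LLM call; here it is
# a deterministic stub so the control flow can be exercised directly.

PERSONAS = ["a male perspective", "a female perspective", "a neutral perspective"]

def call_model(prompt: str) -> str:
    # Stub: a real implementation would query an LLM here.
    return f"[response to: {prompt[:40]}...]"

def mpt_answer(question: str, rounds: int = 2) -> str:
    """Multi-persona dialectical loop: each persona critiques the current
    draft for bias, then a neutral pass revises the draft accordingly."""
    draft = call_model(f"Answer the question: {question}")
    for _ in range(rounds):
        critiques = [
            call_model(
                f"Adopt {p}. Critique this answer for social bias:\n"
                f"Question: {question}\nAnswer: {draft}"
            )
            for p in PERSONAS
        ]
        draft = call_model(
            "From a neutral perspective, revise the answer to resolve "
            "the biases raised below.\n"
            + "\n".join(critiques)
            + f"\nOriginal answer: {draft}"
        )
    return draft
```

In a real system, each `call_model` invocation would be a separate chat request, and the final neutral revision would be returned as the debiased answer; the iteration count trades added inference cost for additional rounds of reflection.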