Prompting Techniques for Reducing Social Bias in LLMs through System 1 and System 2 Cognitive Processes

📅 2024-04-26

🏛️ arXiv.org

📈 Citations: 22

✨ Influential: 1

🤖 AI Summary

This study applies dual-process cognitive theory, which distinguishes System 1 (intuitive, automatic) from System 2 (deliberative, reflective), to prompt design for reducing social bias in large language models (LLMs). It compares zero-shot CoT, debiasing, and dual-process prompting strategies on two bias datasets spanning nine social bias categories, and adds human and machine personas to test whether the dual-process effects hold independent of an explicit persona or depend on modeling human cognition. A human persona, a debiasing prompt, System 2 prompting, and chain-of-thought (CoT) prompting all tend to reduce social bias, but the best combination depends on the model and the bias category. The strongest configuration yields up to a 19 percent drop in stereotypical judgments.

📝 Abstract

Dual process theory posits that human cognition arises via two systems: System 1, a quick, emotional, and intuitive process that is subject to cognitive biases, and System 2, a slow, onerous, and deliberate process. NLP researchers often compare zero-shot prompting in LLMs to System 1 reasoning and chain-of-thought (CoT) prompting to System 2. In line with this interpretation, prior research has found that using CoT prompting in LLMs leads to reduced gender bias. We investigate the relationship between bias, CoT prompting, a debiasing prompt, and dual process theory in LLMs directly. We compare zero-shot CoT, debiasing, and a variety of dual process theory-based prompting strategies on two bias datasets spanning nine different social bias categories. We incorporate human and machine personas to determine whether the effects of dual process theory in LLMs exist independent of explicit persona models or are based on modeling human cognition. We find that a human persona, debiasing, System 2, and CoT prompting all tend to reduce social biases in LLMs, though the best combination of features depends on the exact model and bias category, resulting in up to a 19 percent drop in stereotypical judgments by an LLM.
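The prompting strategies named in the abstract can be pictured as combinations of three independent choices: a persona, a reasoning style, and an optional debiasing instruction. The sketch below is a minimal illustration under that assumption. The prompt wording, the dictionary names, and the `build_prompt` and `all_variants` helpers are all hypothetical; this page does not give the paper's actual prompt text.

```python
from itertools import product

# Hypothetical wording for illustration only; not the paper's actual prompts.
PERSONA = {
    "none": "",
    "human": "You are a person answering the question below.",
    "machine": "You are an AI system answering the question below.",
}
REASONING = {
    "zero_shot": "",
    "cot": "Let's think step by step.",
    "system1": "Answer quickly, using your first intuition.",
    "system2": "Answer slowly and deliberately, checking your reasoning.",
}
DEBIAS = {
    False: "",
    True: "Make sure your answer does not rely on stereotypes.",
}


def build_prompt(question, persona="none", reasoning="zero_shot", debias=False):
    """Assemble one prompt variant, skipping any empty component."""
    parts = [PERSONA[persona], DEBIAS[debias], question, REASONING[reasoning]]
    return "\n".join(p for p in parts if p)


def all_variants(question):
    """Return every persona x reasoning x debias combination for a question."""
    return {
        (p, r, d): build_prompt(question, p, r, d)
        for p, r, d in product(PERSONA, REASONING, DEBIAS)
    }
```

Treating the three choices as a grid makes it straightforward to evaluate every combination per model and per bias category, which matters here because the abstract reports that the best combination varies with both.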

Problem

Research questions and friction points this paper is trying to address.

Reducing social bias in LLMs using cognitive processes

Comparing prompting strategies across nine bias categories

Investigating persona effects on bias reduction techniques
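Comparing strategies across bias categories comes down to measuring, for each strategy and category, how often the model picks the stereotypical option. The sketch below shows one way to compute that rate and its drop relative to a baseline. The record format, the label names, and the toy data are assumptions made for illustration and are not results from the paper. The drop is computed in absolute terms (percentage points); this page does not say whether the reported 19 percent figure is absolute or relative.

```python
from collections import defaultdict


def stereotype_rate(records):
    """Fraction of stereotypical answers per (strategy, category) pair.

    Each record is a (strategy, category, label) tuple, where label is
    "stereotype" or anything else (a hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [stereotypical, total]
    for strategy, category, label in records:
        counts[(strategy, category)][1] += 1
        if label == "stereotype":
            counts[(strategy, category)][0] += 1
    return {key: s / n for key, (s, n) in counts.items()}


def drop_vs_baseline(rates, baseline="zero_shot"):
    """Absolute drop in stereotype rate relative to the baseline strategy."""
    drops = {}
    for (strategy, category), rate in rates.items():
        base = rates.get((baseline, category))
        if strategy == baseline or base is None:
            continue
        drops[(strategy, category)] = base - rate
    return drops


# Toy records, invented purely to exercise the functions.
toy = [
    ("zero_shot", "gender", "stereotype"),
    ("zero_shot", "gender", "stereotype"),
    ("zero_shot", "gender", "stereotype"),
    ("zero_shot", "gender", "anti-stereotype"),
    ("cot", "gender", "stereotype"),
    ("cot", "gender", "stereotype"),
    ("cot", "gender", "anti-stereotype"),
    ("cot", "gender", "anti-stereotype"),
]

rates = stereotype_rate(toy)
drops = drop_vs_baseline(rates)
```

Keying the results by both strategy and category keeps per-category differences visible instead of averaging them away.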

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual process theory prompting for bias reduction

Combining System 2 reasoning with chain-of-thought

Human persona integration to decrease stereotypes

🔎 Similar Papers

Prompting Fairness: Integrating Causality to Debias Large Language Models