Evaluating Social Biases in LLM Reasoning

📅 2025-02-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Prior bias evaluations of large language models (LLMs) focus predominantly on input-output disparities, overlooking how internal reasoning processes may actively amplify social biases. Method: This work systematically investigates bias dynamics across multi-step chain-of-thought (CoT) reasoning using the BBQ benchmark, analyzing DeepSeek-R1 (8B/32B) and its instruction-tuned variants. We introduce bias path tracing to quantify bias propagation and amplification at each reasoning step. Contribution/Results: We demonstrate that CoT reasoning significantly exacerbates gender and racial biases—irrespective of model scale or instruction tuning—with bias intensity increasing up to 2.3× in certain cases. Crucially, this is the first empirical evidence that LLM reasoning mechanisms themselves function as bias amplifiers. These findings shift the paradigm for bias mitigation from end-to-end alignment toward process-aware, stepwise controllability, providing foundational insights for developing transparent and accountable reasoning systems.
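The summary's "bias path tracing" quantifies bias at each reasoning step. The paper does not spell out the scoring rule here, so the following is a minimal illustrative sketch under an assumed rule: each chain-of-thought step gets +1 if it references only the stereotyped group, -1 if only the counter-stereotypical group, and 0 otherwise, with the cumulative sum traced along the chain. The function names and the scoring heuristic are assumptions, not the authors' method.

```python
# Hypothetical sketch of per-step bias tracing; the scoring rule below is an
# illustrative assumption, not the paper's actual "bias path tracing" method.

def step_bias_score(step: str, stereotyped: str, counter: str) -> int:
    """Score one reasoning step: +1 if it mentions only the stereotyped
    group, -1 if it mentions only the counter-stereotypical group, else 0."""
    s = stereotyped.lower() in step.lower()
    c = counter.lower() in step.lower()
    if s and not c:
        return 1
    if c and not s:
        return -1
    return 0

def trace_bias(steps: list[str], stereotyped: str, counter: str) -> list[int]:
    """Return the cumulative bias score after each reasoning step."""
    total, path = 0, []
    for step in steps:
        total += step_bias_score(step, stereotyped, counter)
        path.append(total)
    return path

# BBQ-style ambiguous example (hypothetical wording):
steps = [
    "The grandmother and the grandson were both present.",
    "Grandmothers often struggle with technology.",
    "So the grandmother likely caused the issue.",
]
print(trace_bias(steps, "grandmother", "grandson"))  # → [0, 1, 2]
```

A rising cumulative score like `[0, 1, 2]` would indicate bias amplification across steps, which is the kind of trajectory the summary's up-to-2.3× amplification claim describes.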

📝 Abstract
In the recent development of AI reasoning, large language models (LLMs) are trained to automatically generate chain-of-thought reasoning steps, which have demonstrated compelling performance on math and coding tasks. However, when bias is mixed into the reasoning process to form strong logical arguments, it can cause even more harmful results and further induce hallucinations. In this paper, we evaluate the 8B and 32B variants of DeepSeek-R1 against their instruction-tuned counterparts on the BBQ dataset, and investigate the bias that is elicited and amplified through the reasoning steps. To the best of our knowledge, this empirical study is the first to assess bias issues in LLM reasoning.
Problem

Research questions and friction points this paper is trying to address.

Evaluating social biases in LLM reasoning
Assessing bias amplification in reasoning steps
Investigating bias elicitation in instruction-tuned models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Assessed bias in LLM reasoning
Used DeepSeek-R1 variants
Evaluated on BBQ dataset