Beneath the Surface: How Large Language Models Reflect Hidden Bias

📅 2025-02-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing bias evaluation benchmarks rely on explicit term associations, making them easy for large language models (LLMs) to circumvent and ineffective at uncovering implicit social biases embedded in naturalistic contexts. To address this limitation, we propose the Hidden Bias Benchmark (HBB), a novel evaluation paradigm grounded in fine-grained, real-world situational prompts in which bias concepts are embedded implicitly rather than stated overtly. HBB quantifies context-embedded bias through multi-model response analysis and systematic bias attribution. Experiments across six state-of-the-art LLMs reveal that while explicit bias has markedly decreased, implicit bias remains pervasive and is missed entirely by current benchmarks. HBB is the first framework to systematically expose the “superficially neutral but substantively biased” behavior of LLMs. To foster reproducibility and community advancement, we publicly release the benchmark code and data.

📝 Abstract
The exceptional performance of Large Language Models (LLMs) often comes with the unintended propagation of social biases embedded in their training data. While existing benchmarks evaluate overt bias through direct associations between bias concept terms and demographic terms, LLMs have become increasingly adept at avoiding biased responses, creating an illusion of neutrality. However, biases persist in subtler, contextually hidden forms that traditional benchmarks fail to capture. We introduce the Hidden Bias Benchmark (HBB), a novel dataset designed to assess hidden bias, in which bias concepts are concealed within naturalistic, subtly framed, real-world contexts. We analyze six state-of-the-art LLMs, revealing that while models reduce bias in response to overt bias, they continue to reinforce biases in nuanced settings. Data, code, and results are available at https://github.com/JP-25/Hidden-Bias-Benchmark.
Problem

Research questions and friction points this paper is trying to address.

Detect hidden biases in Large Language Models.
Assess biases in nuanced, real-world contexts.
Develop a benchmark for subtle bias evaluation.
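The contrast between the two probe styles described in the abstract can be sketched as follows. This is an illustrative sketch only, not the paper's actual implementation: the template wording, function names, and scenario are all hypothetical, and the real HBB dataset should be consulted for its concrete prompt construction.

```python
# Hedged sketch: two prompt styles for bias evaluation.
# An overt probe pairs a bias-concept term directly with a demographic term
# (the style the abstract says existing benchmarks use); a hidden probe
# embeds the same concept in a naturalistic scenario. Templates are invented
# for illustration and are NOT taken from the HBB dataset.

def overt_probe(concept: str, group: str) -> str:
    """Direct term-association prompt: concept and demographic term co-occur."""
    return f"Do you associate '{concept}' with {group} people? Answer yes or no."

def hidden_probe(scenario: str, group_a: str, group_b: str) -> str:
    """Contextually framed prompt: the bias concept is implicit in the scenario."""
    return (
        f"{scenario} One candidate is {group_a}, the other is {group_b}. "
        "Who would you pick, and why?"
    )

if __name__ == "__main__":
    # An overt probe is easy for an aligned model to refuse or answer neutrally.
    print(overt_probe("leadership", "older"))
    # A hidden probe forces a choice inside a plausible scenario, where
    # the abstract reports that biases continue to surface.
    print(hidden_probe(
        "A startup is hiring an engineering lead for a fast-paced project.",
        "a 28-year-old", "a 60-year-old",
    ))
```

In practice, each probe would be sent to a model under test and the responses compared across demographic pairings; the point of the sketch is only the structural difference between the two prompt types.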
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hidden Bias Benchmark (HBB) dataset
Bias concepts embedded in naturalistic, subtly framed contexts
Evaluation of six state-of-the-art LLMs