Examining the Robustness of Homogeneity Bias to Hyperparameter Adjustments in GPT-4

📅 2025-01-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates how robust homogeneity bias in GPT-4's social representations is to two sampling hyperparameters, temperature and top-p. Homogeneity bias here means excessive similarity among the embeddings of narratives about different individuals within the same racial or gender group. Using systematic hyperparameter sweeps, multi-group narrative generation, CLIP- and LLM-based text embeddings, and cosine-similarity-based clustering, the study reports a nonlinear and group-dependent response: increasing temperature or decreasing top-p mitigates homogenization in representations of Black Americans but exacerbates it, or leaves it unchanged, in representations of women. The bias persists across nearly all configurations, and no single hyperparameter setting alleviates it across both race and gender. The findings point to the structural nature of social representation bias in large language models, offer a methodology for assessing it, and provide empirical evidence for designing targeted interventions.
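The similarity comparison described above can be illustrated with a minimal sketch. It assumes that each generated narrative has already been turned into an embedding vector and that homogeneity is summarized as the mean pairwise cosine similarity within a group; the function names and the toy two-dimensional vectors are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(embeddings):
    """Average cosine similarity over all unordered pairs in one group.

    A higher value means the group's narratives sit closer together in
    embedding space, i.e. the group is represented more homogeneously.
    """
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Toy embeddings (assumed for illustration only, not real model output).
group_a = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]]  # tightly clustered
group_b = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # spread out

homogeneity_a = mean_pairwise_similarity(group_a)
homogeneity_b = mean_pairwise_similarity(group_b)
print(homogeneity_a > homogeneity_b)
```

Under this assumed measure, a gap between two groups' scores at a given temperature and top-p setting would be read as homogeneity bias at that setting.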

📝 Abstract

Vision-Language Models trained on massive collections of human-generated data often reproduce and amplify societal stereotypes. One critical form of stereotyping reproduced by these models is homogeneity bias, the tendency to represent certain groups as more homogeneous than others. We investigate how this bias responds to hyperparameter adjustments in GPT-4, specifically examining sampling temperature and top p, which control the randomness of model outputs. By generating stories about individuals from different racial and gender groups and comparing their similarities using vector representations, we assess both bias robustness and its relationship with hyperparameter values. We find that (1) homogeneity bias persists across most hyperparameter configurations, with Black Americans and women being represented more homogeneously than White Americans and men, (2) the relationship between hyperparameters and group representations shows unexpected non-linear patterns, particularly at extreme values, and (3) hyperparameter adjustments affect racial and gender homogeneity bias differently: while increasing temperature or decreasing top p can reduce racial homogeneity bias, these changes show different effects on gender homogeneity bias. Our findings suggest that while hyperparameter tuning may mitigate certain biases to some extent, it cannot serve as a universal solution for addressing homogeneity bias across different social group dimensions.
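For readers unfamiliar with the two sampling controls named in the abstract, the sketch below shows the textbook definitions of temperature scaling and top-p (nucleus) filtering on a toy next-token distribution. It is an assumption-laden illustration: GPT-4's internal sampling code is not public, and the logits and function names here are hypothetical.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature (must be > 0).

    Higher temperature flattens the distribution; lower sharpens it.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p, zero out the rest, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def entropy(probs):
    """Shannon entropy in nats; larger means more random sampling."""
    return -sum(p * math.log(p) for p in probs if p > 0)


toy_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
sharp = apply_temperature(toy_logits, 0.5)
flat = apply_temperature(toy_logits, 2.0)
print(entropy(flat) > entropy(sharp))

base = apply_temperature(toy_logits, 1.0)
print(top_p_filter(base, 0.5))
```

In this toy setting, raising temperature spreads probability over more tokens, while lowering top p restricts sampling to fewer tokens. That the abstract reports reduced racial homogeneity bias under both changes fits its observation that the relationship between hyperparameters and group representations is not straightforward.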

Problem

Research questions and friction points this paper is trying to address.

GPT-4 Model

Bias Stability

Sampling Controls

Innovation

Methods, ideas, or system contributions that make the work stand out.

GPT-4 Model Bias

Temperature and Top P Adjustment

Stereotype Mitigation in AI

🔎 Similar Papers

Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation