🤖 AI Summary
This study audits racial and gender representation biases in four mainstream large language models (LLMs) when generating portrayals of workers across 41 U.S. occupations. Method: Leveraging over 1.5 million prompt-generated occupational personas and benchmarking them against U.S. Bureau of Labor Statistics demographic data, the study conducts a quantitative, large-scale fairness audit. Contribution/Results: It reveals systematic underrepresentation of White (−31 percentage points) and Black (−9pp) workers and overrepresentation of Hispanic (+17pp) and Asian (+12pp) workers; housekeepers are portrayed as almost exclusively Hispanic, while Black workers are nearly absent from several occupations. The work identifies two recurring bias patterns, "systematic shift" and "stereotype exaggeration", and shows that a model's country of origin and level of safety alignment significantly influence representational fairness. These findings motivate application-specific, model-tailored auditing frameworks and provide empirical grounding and methodological guidance for responsible AI development.
📝 Abstract
Generative AI tools are increasingly used to create portrayals of people in occupations, raising concerns about how race and gender are represented. We conducted a large-scale audit of over 1.5 million occupational personas across 41 U.S. occupations, generated by four large language models with different AI safety commitments and countries of origin (U.S., China, France). Compared with Bureau of Labor Statistics data, we find two recurring patterns: systematic shifts, where some groups are consistently under- or overrepresented, and stereotype exaggeration, where existing demographic skews are amplified. On average, White (−31pp) and Black (−9pp) workers are underrepresented, while Hispanic (+17pp) and Asian (+12pp) workers are overrepresented. These distortions can be extreme: for example, across all four models, Housekeepers are portrayed as nearly 100% Hispanic, while Black workers are erased from many occupations. For HCI, these findings show that provider choice materially changes who is visible, motivating model-specific audits and accountable design practices.
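The audit metric implied above, each group's share among generated personas minus its share in the BLS benchmark, expressed in percentage points, can be sketched as follows. This is a minimal illustration, not the paper's code; the group labels and all numbers below are invented for the example.

```python
from collections import Counter

def representation_gap(generated_labels, benchmark_shares):
    """Percentage-point gap between the demographic shares observed in
    generated personas and a benchmark distribution.
    Positive values mean the group is overrepresented by the model."""
    n = len(generated_labels)
    counts = Counter(generated_labels)
    return {
        group: round(100 * counts.get(group, 0) / n - 100 * share, 1)
        for group, share in benchmark_shares.items()
    }

# Hypothetical BLS shares for one occupation (illustrative only).
bls = {"White": 0.60, "Black": 0.12, "Hispanic": 0.18, "Asian": 0.10}

# Hypothetical race labels extracted from ten generated personas.
sample = ["Hispanic"] * 5 + ["Asian"] * 2 + ["White"] * 3

gaps = representation_gap(sample, bls)
# e.g. gaps["Hispanic"] is +32.0pp and gaps["Black"] is -12.0pp here,
# the kind of systematic shift the audit reports at scale.
```

In the actual study this comparison would be aggregated over many prompts, occupations, and models; the sketch only shows the per-occupation gap computation.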