How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias

📅 2025-01-15
🤖 AI Summary
This study presents the first systematic, large-scale evaluation of gender and ethnic bias in Stable Diffusion (SD 2, SD XL, and SD 3) when generating imagery related to software engineering, a context in which the demographic biases of text-to-image models remain underexamined. The authors analyze 6,720 generated images, produced from prompts that include terms such as "Software Engineer", using human annotation and fine-grained statistical analysis. The results reveal a consistent and significant male bias across all versions. SD 2 and SD XL strongly overrepresent White individuals, SD 3 shows a slight preference for Asian individuals, and Black and Arab individuals are severely underrepresented by every model. The bias persists despite prompt engineering interventions, although prompt wording affects its intensity. The work establishes a methodological foundation for fairness assessment and governance of AI-generated content.

📝 Abstract
Generative models are now widely used to produce graphical content for many purposes, such as web content, art, and advertising. However, it has been shown that the images generated by these models can reinforce societal biases that already exist in specific contexts. In this paper, we investigate whether this is the case when one generates images related to various software engineering tasks. The Software Engineering (SE) community is not immune to gender and ethnicity disparities, and these models could amplify them. Hence, if used uncritically, artificially generated images could reinforce these biases in the SE domain. Specifically, we perform an extensive empirical evaluation of the gender and ethnicity bias exhibited by three versions of Stable Diffusion (SD), a very popular open-source text-to-image model: SD 2, SD XL, and SD 3. We obtain 6,720 images by feeding each model two sets of prompts describing different software-related tasks: one set includes the Software Engineer keyword, and the other does not specify the person performing the task. Next, we evaluate the gender and ethnicity disparities in the generated images. Results show that all models are significantly biased toward male figures when representing software engineers. With respect to ethnicity, SD 2 and SD XL are strongly biased toward White figures, whereas SD 3 is slightly more biased toward Asian figures. Nevertheless, all models significantly under-represent Black and Arab figures, regardless of the prompt style used. The results of our analysis raise severe concerns about adopting these models to generate content for SE tasks and open the way for future research on bias mitigation in this context.
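As a rough illustration of the evaluation step described in the abstract, the sketch below computes each group's share of a set of annotated images and a chi-square goodness-of-fit statistic. It rests on assumptions that are not taken from the paper: the abstract does not name the statistical test used, the even-split baseline is only one possible reference distribution, the function names are hypothetical, and the labels are made up for the example.

```python
from collections import Counter


def group_shares(labels):
    """Return each label's share of all annotated images."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def chi_square_uniform(labels, groups):
    """Chi-square goodness-of-fit statistic against an even split over `groups`.

    The even split is an assumed baseline, not the paper's stated reference.
    """
    counts = Counter(labels)
    total = sum(counts[g] for g in groups)
    expected = total / len(groups)
    return sum((counts[g] - expected) ** 2 / expected for g in groups)


# Made-up annotations for illustration only; these are NOT results from the paper.
example_labels = ["male"] * 90 + ["female"] * 10

shares = group_shares(example_labels)
stat = chi_square_uniform(example_labels, ["male", "female"])

# With two groups there is 1 degree of freedom; the 5% critical value is about 3.841.
exceeds_critical = stat > 3.841
print(shares, stat, exceeds_critical)
```

The same two functions could be applied per model and per prompt set (with and without the Software Engineer keyword) to compare distributions. A different baseline, such as real-world workforce demographics, would require replacing the even-split expectation.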
Problem

Research questions and friction points this paper is trying to address.

Bias Amplification
Diversity Representation
Generative Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Models Bias
Gender and Racial Bias
Fairness in AI