Beyond Content: How Grammatical Gender Shapes Visual Representation in Text-to-Image Models

📅 2025-08-05

🤖 AI Summary

Problem: Prior bias research in multimodal AI has largely overlooked how linguistic structure, particularly grammatical gender, influences the images produced by text-to-image (T2I) models. Method: We construct a cross-lingual benchmark covering five gendered languages (French, Spanish, German, Italian, Russian) and two gender-neutral control languages (English, Chinese), and systematically evaluate three state-of-the-art T2I models on 800 unique prompts, generating 28,800 images for quantitative analysis. Contribution/Results: We provide empirical evidence that grammatical gender induces systematic visual biases: masculine grammatical markers raise male representation to 73% on average (compared with 22% for gender-neutral English), while feminine grammatical markers raise female representation to 38% (compared with 28% for English). The effects are stronger in high-resource languages and vary with model architecture. We introduce grammatical gender as a new dimension of multimodal fairness, showing that language structure itself, not just content, shapes AI-generated visual output.

📝 Abstract

Research on bias in Text-to-Image (T2I) models has primarily focused on demographic representation and stereotypical attributes, overlooking a fundamental question: how does grammatical gender influence visual representation across languages? We introduce a cross-linguistic benchmark examining words where grammatical gender contradicts stereotypical gender associations (e.g., "une sentinelle", grammatically feminine in French but referring to the stereotypically masculine concept "guard"). Our dataset spans five gendered languages (French, Spanish, German, Italian, Russian) and two gender-neutral control languages (English, Chinese), comprising 800 unique prompts that generated 28,800 images across three state-of-the-art T2I models. Our analysis reveals that grammatical gender dramatically influences image generation: masculine grammatical markers increase male representation to 73% on average (compared to 22% with gender-neutral English), while feminine grammatical markers increase female representation to 38% (compared to 28% in English). These effects vary systematically by language resource availability and model architecture, with high-resource languages showing stronger effects. Our findings establish that language structure itself, not just content, shapes AI-generated visual outputs, introducing a new dimension for understanding bias and fairness in multilingual, multimodal systems.
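The headline numbers in the abstract can be restated as percentage-point shifts against the English baseline, and the image count can be broken down per prompt. The short Python sketch below does only that arithmetic. The variable and function names are illustrative, and the per-prompt figure assumes every prompt was rendered the same number of times by each of the three models, which the abstract does not state explicitly.

```python
# Percentages quoted in the abstract (share of generated images).
MALE_SHARE_MASCULINE_MARKERS = 73   # masculine grammatical markers
MALE_SHARE_ENGLISH_BASELINE = 22    # gender-neutral English
FEMALE_SHARE_FEMININE_MARKERS = 38  # feminine grammatical markers
FEMALE_SHARE_ENGLISH_BASELINE = 28  # gender-neutral English


def shift_in_points(observed: int, baseline: int) -> int:
    """Difference between two percentages, in percentage points."""
    return observed - baseline


masculine_shift = shift_in_points(
    MALE_SHARE_MASCULINE_MARKERS, MALE_SHARE_ENGLISH_BASELINE
)
feminine_shift = shift_in_points(
    FEMALE_SHARE_FEMININE_MARKERS, FEMALE_SHARE_ENGLISH_BASELINE
)

# Dataset size quoted in the abstract.
TOTAL_IMAGES = 28_800
UNIQUE_PROMPTS = 800
NUM_MODELS = 3

# Assumption: images are spread evenly over prompts and models.
images_per_prompt_per_model, remainder = divmod(
    TOTAL_IMAGES, UNIQUE_PROMPTS * NUM_MODELS
)

print(f"Masculine markers: +{masculine_shift} points of male representation")
print(f"Feminine markers:  +{feminine_shift} points of female representation")
print(f"Images per prompt per model (if evenly split): {images_per_prompt_per_model}")
```

Read this way, the masculine effect is roughly five times the size of the feminine one, which is worth keeping in mind when comparing the two figures side by side.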

Problem

Research questions and friction points this paper is trying to address.

How grammatical gender influences visual representation in T2I models

Impact of grammatical gender on gender bias in image generation

Cross-linguistic analysis of grammatical gender effects in AI outputs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-linguistic benchmark for grammatical gender bias

Dataset spans five gendered and two neutral languages

Grammatical gender markers significantly influence image generation
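As a rough illustration of how a benchmark like this could be scored, the sketch below tallies the perceived gender of generated images by the grammatical gender of the prompt word. It is not the authors' pipeline: the summary does not say how perceived gender was labeled, so the function name `representation_rates`, the label strings, and the toy records are all hypothetical and are not results from the paper.

```python
from collections import Counter, defaultdict


def representation_rates(records):
    """Share of each perceived-gender label per grammatical-gender group.

    `records` is an iterable of (grammatical_gender, perceived_gender)
    pairs, one pair per generated image. Both label sets are assumptions
    made for this sketch.
    """
    counts = defaultdict(Counter)
    for grammatical_gender, perceived_gender in records:
        counts[grammatical_gender][perceived_gender] += 1

    rates = {}
    for group, label_counts in counts.items():
        total = sum(label_counts.values())
        rates[group] = {label: n / total for label, n in label_counts.items()}
    return rates


# Toy, made-up labels purely to exercise the function.
toy_records = [
    ("feminine", "male"),
    ("feminine", "female"),
    ("feminine", "female"),
    ("feminine", "male"),
    ("masculine", "male"),
    ("masculine", "male"),
    ("masculine", "male"),
    ("masculine", "female"),
]

toy_rates = representation_rates(toy_records)
for group, shares in sorted(toy_rates.items()):
    print(group, shares)
```

Grouping by grammatical gender rather than by language keeps the comparison aligned with the paper's central question. Adding the language or the model as an extra grouping key would be the natural extension for the resource-availability and architecture effects the abstract mentions.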

🔎 Similar Papers

Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You