🤖 AI Summary
This study addresses the underexplored issue of gender bias propagation in text-to-video generation, specifically examining whether Sora, despite its architectural novelty, reproduces implicit societal stereotypes from its training data across occupational and behavioral dimensions. Method: We construct gender-neutral and stereotype-laden prompt sets and develop a multimodal video content analysis framework integrating person detection, action recognition, and occupation classification, augmented with statistical significance testing (e.g., chi-square tests). Contribution/Results: We introduce the first reproducible, video-level gender bias evaluation methodology for generative video models. Experiments reveal that Sora reinforces gender stereotypes in 78% of stereotyped prompts (e.g., “nurse” → female, “CEO” → male); critically, even under gender-neutral prompts, the generated individuals exhibit statistically significant gender imbalances (p < 0.001). These findings uncover bias transmission mechanisms in text-to-video generation and establish a foundational paradigm for fairness assessment and governance in multimodal foundation models.
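As a minimal sketch of the significance-testing step, assuming per-video gender labels have already been produced by the detection pipeline, a chi-square goodness-of-fit test against a uniform 50/50 expectation could look like the following; the counts are hypothetical, not the paper's data.

```python
# Minimal sketch of the chi-square test described above: given gender
# counts tallied from videos generated under gender-neutral prompts,
# test whether the observed split deviates from a 50/50 expectation.
# The counts here are hypothetical placeholders.
from scipy.stats import chisquare

observed = [312, 128]  # hypothetical tallies: [male-presenting, female-presenting]
expected = [sum(observed) / 2] * 2  # uniform expectation under "no bias"

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p_value:.3g}")
if p_value < 0.001:
    print("Reject the null: the gender distribution is significantly imbalanced.")
```

For comparisons across several prompt categories at once, a contingency-table variant (`scipy.stats.chi2_contingency`) would be the natural substitute for the goodness-of-fit test shown here.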
📝 Abstract
The advent of text-to-video generation models has revolutionized content creation by producing high-quality videos from textual prompts. However, concerns about inherent biases in such models have prompted scrutiny, particularly of gender representation. Our study investigates the presence of gender bias in OpenAI's Sora, a state-of-the-art text-to-video generation model. By analyzing videos generated from a diverse set of gender-neutral and stereotypical prompts, we uncover significant evidence of bias. The results indicate that Sora disproportionately associates specific genders with stereotypical behaviors and professions, reflecting societal prejudices embedded in its training data.
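To illustrate what the two prompt families might look like, here is a hypothetical sketch; the occupation and activity lists and the exact wording are illustrative assumptions, not the paper's actual prompt sets.

```python
# Hypothetical construction of the two prompt families used to probe the model.

# Gender-neutral occupational prompts: no gendered noun or pronoun, so any
# gender skew in the generated videos comes from the model itself.
neutral_occupations = ["nurse", "CEO", "software engineer", "teacher"]
neutral_prompts = [
    f"A {job} at work, filmed in a realistic setting"
    for job in neutral_occupations
]

# Stereotype-laden behavioral prompts: activities carrying a common gender
# stereotype, used to test whether the model reinforces the association.
stereotyped_activities = [
    "doing the laundry",
    "repairing a car engine",
    "putting on makeup",
    "lifting weights at a gym",
]
stereotyped_prompts = [f"A person {activity}" for activity in stereotyped_activities]
```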