🤖 AI Summary
This work identifies a systemic amplification of intersectional social biases in vision-language models (VLMs), stemming from their reliance on spurious statistical correlations rather than socially grounded contextual reasoning, a failure that particularly undermines fairness in occupation prediction. Using the FairFace dataset, we conduct experiments on five open-source VLMs, sampling reasoning trajectories under three distinct prompt styles. By integrating quantitative predictive evaluation with qualitative attribution analysis, we establish, for the first time, a causal link between model reasoning pathways and the generation of intersectional bias. Results reveal consistent and statistically significant bias patterns across 32 occupational categories, demonstrating that the inference process itself is a critical stage at which bias is propagated and amplified. We therefore argue for "reasoning alignment": before deployment, VLMs should be calibrated so that their internal reasoning aligns with human values and sociocultural knowledge. This reframes fairness governance for VLMs around a paradigm centered on interpretability-aware bias mitigation.
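The exact prompt wordings and model inference code are not reproduced in this summary, so the following is only a minimal Python sketch of how predictions and reasoning might be elicited across three prompt styles. The prompt templates, the occupation subset, and the `query_vlm` function are hypothetical placeholders for the actual VLM calls used in the study.

```python
from dataclasses import dataclass

# Illustrative prompt styles for eliciting an occupation prediction plus a short
# rationale from a VLM. The paper's exact wordings are not shown here.
PROMPT_STYLES = {
    "direct": "What is the most likely occupation of the person in this image? Answer with one occupation.",
    "reasoned": "State the most likely occupation of the person in this image, then explain your reasoning in one sentence.",
    "constrained": "Choose the single most likely occupation from this list: {occupations}. Briefly explain why.",
}

OCCUPATIONS = ["doctor", "nurse", "engineer", "teacher"]  # illustrative subset; the study covers 32 categories


@dataclass
class Elicitation:
    image_path: str
    prompt_style: str
    response: str  # raw prediction + reasoning text


def query_vlm(image_path: str, prompt: str) -> str:
    """Placeholder for the actual inference call to an open-source VLM."""
    raise NotImplementedError("plug in the VLM inference code here")


def elicit(image_paths: list[str]) -> list[Elicitation]:
    """Query every image under every prompt style, collecting responses for
    later quantitative (prediction) and qualitative (reasoning) analysis."""
    records = []
    for path in image_paths:
        for style, template in PROMPT_STYLES.items():
            prompt = template.format(occupations=", ".join(OCCUPATIONS))
            records.append(Elicitation(path, style, query_vlm(path, prompt)))
    return records
```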
📝 Abstract
Vision Language Models (VLMs) are increasingly deployed across downstream tasks, yet their training data often encode social biases that surface in their outputs. Unlike humans, who interpret images through contextual and social cues, VLMs process them through statistical associations, often producing reasoning that diverges from how humans reason. By analyzing how a VLM reasons, we can understand how inherent biases are perpetuated and can adversely affect downstream performance. To examine this gap, we systematically analyze social biases in five open-source VLMs on an occupation prediction task over the FairFace dataset. Across 32 occupations and three different prompting styles, we elicit both predictions and reasoning. Our findings reveal that biased reasoning patterns systematically underlie intersectional disparities, highlighting the need to align VLM reasoning with human values prior to downstream deployment.
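The specific disparity statistics used in the paper are not listed here, so the sketch below shows one plausible instantiation of the quantitative intersectional analysis: per race-by-gender prediction rates for a target occupation, a max–min rate gap, and a chi-squared test of association. The toy data, grouping labels, and choice of test are illustrative assumptions, not the paper's exact procedure.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Toy records: one row per model prediction on a FairFace image, annotated
# with the image's race and gender labels. Values are purely illustrative.
df = pd.DataFrame({
    "race": ["White", "White", "Black", "Black", "East Asian", "East Asian"],
    "gender": ["Male", "Female", "Male", "Female", "Male", "Female"],
    "prediction": ["engineer", "nurse", "driver", "nurse", "engineer", "nurse"],
})


def intersectional_rates(df: pd.DataFrame, occupation: str) -> pd.Series:
    """Rate at which each race x gender group is assigned `occupation`."""
    return (df["prediction"] == occupation).groupby([df["race"], df["gender"]]).mean()


rates = intersectional_rates(df, "nurse")
print(rates)
print("max-min disparity:", rates.max() - rates.min())

# Chi-squared test of association between intersectional group membership and
# whether the model predicted the target occupation (real analyses would use
# far larger per-group samples than this toy example).
table = pd.crosstab(df["race"] + " / " + df["gender"], df["prediction"] == "nurse")
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.3f}")
```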