🤖 AI Summary
This work addresses the pervasive lack of geographic diversity in text-to-image generation models, whose outputs often reinforce stereotypical portrayals of specific countries. To systematically evaluate and quantify such geographic bias, the authors propose GeoDiv, a novel and interpretable framework that leverages large language models and vision-language models to assess generated images along two dimensions: the Socio-Economic Visual Index (SEVI) and the Visual Diversity Index (VDI). Experiments across 10 entity categories and 16 countries reveal significant misrepresentations, in particular disproportionately impoverished and dilapidated depictions of countries like India, Nigeria, and Colombia, exposing deep-seated socio-economic biases in current models. By providing a structured, quantifiable approach to diagnosing geographic representation disparities, the study offers the first systematic, interpretable framework for fairness evaluation in generative AI.
📝 Abstract
Text-to-image (T2I) models are rapidly gaining popularity, yet their outputs often lack geographical diversity, reinforce stereotypes, and misrepresent regions. Given their broad reach, it is critical to rigorously evaluate how these models portray the world. Existing diversity metrics either rely on curated datasets or focus on surface-level visual similarity, limiting interpretability. We introduce GeoDiv, a framework leveraging large language and vision-language models to assess geographical diversity along two complementary axes: the Socio-Economic Visual Index (SEVI), capturing economic and condition-related cues, and the Visual Diversity Index (VDI), measuring variation in primary entities and backgrounds. Applied to images generated by models such as Stable Diffusion and FLUX.1-dev across 10 entities and 16 countries, GeoDiv reveals a consistent lack of diversity and identifies fine-grained attributes where models default to biased portrayals. Strikingly, depictions of countries like India, Nigeria, and Colombia are disproportionately impoverished and worn, reflecting underlying socio-economic biases. These results highlight the need for greater geographical nuance in generative models. GeoDiv provides the first systematic, interpretable framework for measuring such biases, marking a step toward fairer and more inclusive generative systems. Project page: https://abhipsabasu.github.io/geodiv