🤖 AI Summary
Existing diversity evaluation in text-to-image (T2I) generation suffers from two structural flaws that arise when contextual constraints are ignored: "over-diversification" (e.g., unauthorized alteration of demographic attributes explicitly specified in the prompt) and "under-diversification" (e.g., insufficient demographic representation). Method: We propose DIVBENCH, the first benchmark framework that systematically distinguishes, quantifies, and jointly evaluates both imbalances via context-aware semantic-fidelity constraints, moving beyond conventional one-directional diversity maximization. The framework integrates LLM-guided FairDiffusion and context-aware prompt rewriting, enabling systematic assessment across mainstream T2I models. Contribution/Results: Experiments reveal widespread under-diversification in current models, while existing diversification methods often over-correct; in contrast, the context-aware strategies evaluated with DIVBENCH substantially improve fairness and representational balance while preserving semantic accuracy, achieving a better trade-off between diversity and fidelity.
📝 Abstract
Current diversification strategies for text-to-image (T2I) models often ignore contextual appropriateness, leading to over-diversification: demographic attributes are modified even when they are explicitly specified in the prompt. This paper introduces DIVBENCH, a benchmark and evaluation framework for measuring both under- and over-diversification in T2I generation. Through systematic evaluation of state-of-the-art T2I models, we find that while most models exhibit limited diversity, many diversification approaches overcorrect by inappropriately altering contextually specified attributes. We demonstrate that context-aware methods, particularly LLM-guided FairDiffusion and prompt rewriting, effectively address under-diversity while avoiding over-diversification, achieving a better balance between representation and semantic fidelity.