🤖 AI Summary
This work addresses two problems in current text-to-image (T2I) generation models: the amplification of social bias, and the lack of systematic evaluation across multiple demographic dimensions. The authors propose HoloFair, a comprehensive fairness evaluation framework that combines a large-scale fairness-oriented dataset, a spatial-frequency joint attribute classifier (SpaFreq), and the Multi-attribute, Group-wise Bias Index (MGBI), enabling unified quantification of fairness across multiple demographic attributes in T2I models. Building on this framework, they develop Fair-GRPO, a multi-objective reinforcement learning debiasing algorithm based on GRPO. Experiments on SD3.5-Medium show that Fair-GRPO significantly improves multidimensional fairness while preserving high image quality. The authors also analyze reward hacking and propose mitigation strategies.
📝 Abstract
Text-to-Image (T2I) models have made significant strides in visual realism and semantic consistency, yet they often perpetuate and amplify societal biases. Existing evaluation methods typically address bias along a single dimension and therefore cannot uncover biases at deeper, socially relevant semantic levels. We introduce HoloFair, a comprehensive benchmark framework for multidimensional demographic bias analysis. Built upon our large-scale fairness-oriented dataset and the SpaFreq (Spatial-Frequency) attribute classifier, the framework introduces the Multi-attribute, Group-wise Bias Index (MGBI), a metric designed to assess both intrinsic diversity and conditional biases. Beyond evaluation, we further introduce Fair-GRPO, a reinforcement-learning-based debiasing method that shifts the output distribution of a generative model through a purpose-designed multi-objective reward function. Experiments on the SD3.5-Medium model demonstrate that Fair-GRPO significantly improves multidimensional fairness while maintaining high image quality. We also analyze potential reward hacking phenomena and provide corresponding mitigation strategies. Code and dataset are available at https://github.com/1059684669/HoloFair
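
To make the idea of a multi-objective reward inside a GRPO-style update more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the reward terms or their weights, so the two objectives (a fairness score and a quality score), the weights, the sample values, and the function names `combine_rewards` and `group_relative_advantages` are all illustrative assumptions. The only part taken from GRPO in general is that rewards for samples drawn from the same prompt are standardized within that group to obtain advantages.

```python
from statistics import mean, pstdev


def combine_rewards(fairness, quality, w_fair=0.5, w_quality=0.5):
    """Weighted sum of two per-sample reward lists.

    The two objectives and their weights are hypothetical placeholders,
    not the reward terms used by Fair-GRPO.
    """
    return [w_fair * f + w_quality * q for f, q in zip(fairness, quality)]


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps guards against zero spread
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores for four images generated from the same prompt.
fairness = [0.2, 0.8, 0.5, 0.9]
quality = [0.9, 0.7, 0.8, 0.6]

rewards = combine_rewards(fairness, quality)
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.3f}  advantage={a:+.3f}")
```

Samples with a positive advantage would be reinforced and those with a negative advantage suppressed. In this sketch, an image that scores high on quality but low on the assumed fairness term can still end up below the group mean, which is one way a combined reward can discourage optimizing a single objective at the expense of the other.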