🤖 AI Summary
Existing facial datasets exhibit significant demographic biases, particularly along racial and gender dimensions, which undermine the fairness and robustness of face verification systems. To address this, we propose an identity-constrained generative modeling framework that synthesizes diverse, demographically balanced face images compliant with ID-photo standards. Using this method, we construct DIF-V, a benchmark dataset comprising 926 identities and 27,780 images. DIF-V is the first to systematically expose performance degradation under identity-style variation and cross-group bias in mainstream models. Extensive experiments demonstrate that training on DIF-V substantially mitigates gender and racial bias while improving model fairness and out-of-distribution generalization. This work establishes a new evaluation benchmark, introduces a principled synthesis methodology, and provides empirical evidence to advance ethical AI assessment and inclusive face verification technology.
📝 Abstract
Face verification is a significant component of identity authentication in various applications, including online banking and secure access to personal devices. Most existing face image datasets suffer from notable biases related to race, gender, and other demographic characteristics, which limits the effectiveness and fairness of face verification systems. In response to these challenges, we propose a methodology that integrates advanced generative models to create diverse, high-quality synthetic face images. This methodology emphasizes the representation of a wide range of facial traits while ensuring adherence to the characteristics permissible in identity card photographs. Furthermore, we introduce the Diverse and Inclusive Faces for Verification (DIF-V) dataset, comprising 27,780 images of 926 unique identities, designed as a benchmark for future research in face verification. Our analysis reveals that existing verification models exhibit biases toward certain genders and races and, notably, that applying identity style modifications negatively impacts model performance. By tackling the inherent inequities in existing datasets, this work not only enriches the discussion on diversity and ethics in artificial intelligence but also lays the foundation for developing more inclusive and reliable face verification technologies.