Fairness-Aware Grouping for Continuous Sensitive Variables: Application for Debiasing Face Analysis with respect to Skin Tone

📅 2025-07-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Conventional discrete grouping for continuous sensitive attributes (e.g., skin tone) obscures discrimination against minority subpopulations. Method: We propose a discrimination-driven dynamic grouping framework that identifies critical subgroups by maximizing inter-group discrimination variance, thereby overcoming limitations of predefined groupings. We introduce the first fairness-aware grouping optimization framework for continuous sensitive variables, define a novel grouping criterion, and incorporate a monotonic fairness assumption to enhance stability in fine-grained bias detection. Our approach integrates industrial-grade skin tone prediction, variance-aware clustering, and subgroup-wise post-processing calibration. Results: Evaluated on large-scale face datasets (CelebA, FFHQ), it enables interpretable and robust fairness assessment and debiasing. Experiments show significant fairness improvement (+23.6% on average) with negligible accuracy loss (<0.3% drop), and discovered discrimination patterns exhibit strong cross-dataset consistency—demonstrating practical deployability.

Technology Category

Application Category

📝 Abstract
Within a legal framework, fairness in datasets and models is typically assessed by dividing observations into predefined groups and then computing fairness measures (e.g., Disparate Impact or Equality of Odds with respect to gender). However, when sensitive attributes such as skin color are continuous, dividing into default groups may overlook or obscure the discrimination experienced by certain minority subpopulations. To address this limitation, we propose a fairness-based grouping approach for continuous (possibly multidimensional) sensitive attributes. By grouping data according to observed levels of discrimination, our method identifies the partition that maximizes a novel criterion based on inter-group variance in discrimination, thereby isolating the most critical subgroups. We validate the proposed approach using multiple synthetic datasets and demonstrate its robustness under changing population distributions - revealing how discrimination is manifested within the space of sensitive attributes. Furthermore, we examine a specialized setting of monotonic fairness for the case of skin color. Our empirical results on both CelebA and FFHQ, leveraging the skin tone as predicted by an industrial proprietary algorithm, show that the proposed segmentation uncovers more nuanced patterns of discrimination than previously reported, and that these findings remain stable across datasets for a given model. Finally, we leverage our grouping model for debiasing purpose, aiming at predicting fair scores with group-by-group post-processing. The results demonstrate that our approach improves fairness while having minimal impact on accuracy, thus confirming our partition method and opening the door for industrial deployment.
Problem

Research questions and friction points this paper is trying to address.

Addresses fairness in datasets with continuous sensitive attributes
Proposes grouping method to identify critical discrimination subgroups
Debiases face analysis models while maintaining accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Grouping by observed discrimination levels
Maximizing inter-group variance criterion
Debiasing with group post-processing
🔎 Similar Papers
No similar papers found.
V
Veronika Shilova
Artefact Research Center, Paris, France; Institut de Mathématiques de Toulouse (UMR 5219), CNRS, Université de Toulouse, F-31062 Toulouse, France
E
Emmanuel Malherbe
Artefact Research Center, Paris, France
G
Giovanni Palma
L’Oréal Research and Innovation, Paris, France
Laurent Risser
Laurent Risser
CNRS - Toulouse Mathematics Institute - ANITI
XAIsurrogate modelsbias mitigation in MLimage analysis
Jean-Michel Loubes
Jean-Michel Loubes
INRIA (affiliated to Institut de Mathématiques de Toulouse) & ANITI
StatisticsMachine Learning