Reliable and Reproducible Demographic Inference for Fairness in Face Analysis

📅 2025-10-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Fairness evaluation of facial analysis systems commonly relies on demographic attribute inference (DAI), yet its reliability is constrained by predefined, static demographic categories—introducing systematic bias. This work proposes a modular transfer learning framework: a fixed, pretrained face recognition encoder serves as the backbone, augmented with lightweight nonlinear classification heads; we further introduce a robust evaluation metric grounded in intra-identity consistency, enabling flexible, configurable multi-granularity demographic partitioning. The approach substantially improves DAI stability and cross-dataset generalization, outperforming strong baselines across gender and ethnicity inference tasks—particularly for ethnicity classification. To foster reproducible and trustworthy fairness auditing, we publicly release comprehensive metadata, source code, trained models, and an integrated evaluation toolkit.

📝 Abstract
Fairness evaluation in face analysis systems (FAS) typically depends on automatic demographic attribute inference (DAI), which itself relies on predefined demographic segmentation. However, the validity of fairness auditing hinges on the reliability of the DAI process. We begin by providing a theoretical motivation for this dependency, showing that improved DAI reliability leads to less biased and lower-variance estimates of FAS fairness. To address this, we propose a fully reproducible DAI pipeline that replaces conventional end-to-end training with a modular transfer learning approach. Our design integrates pretrained face recognition encoders with non-linear classification heads. We audit this pipeline across three dimensions: accuracy, fairness, and a newly introduced notion of robustness, defined via intra-identity consistency. The proposed robustness metric is applicable to any demographic segmentation scheme. We benchmark the pipeline on gender and ethnicity inference across multiple datasets and training setups. Our results show that the proposed method outperforms strong baselines, particularly on ethnicity, which is the more challenging attribute. To promote transparency and reproducibility, we will publicly release the training dataset metadata, full codebase, pretrained models, and evaluation toolkit. This work contributes a reliable foundation for demographic inference in fairness auditing.
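The modular design described in the abstract, a frozen pretrained encoder with only a small classification head trained on top, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the random projection stands in for a pretrained face recognition backbone, the head is a single softmax layer rather than the paper's non-linear heads, and all names and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen, pretrained face recognition encoder.
# In the paper's pipeline this would be a pretrained network whose
# weights are never updated; here, a fixed random projection.
ENC_W = rng.normal(size=(64, 16))

def encode(x):
    """Frozen backbone: maps inputs to embeddings (never trained)."""
    return np.tanh(x @ ENC_W)

def train_head(emb, labels, n_classes, lr=0.5, epochs=200):
    """Lightweight classification head trained on frozen embeddings.
    A linear-softmax head for brevity; the modular principle
    (only the head is trained) is the point being illustrated."""
    W = np.zeros((emb.shape[1], n_classes))
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = emb @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * emb.T @ (p - onehot) / len(labels)  # gradient step
    return W

def predict(W, x):
    return (encode(x) @ W).argmax(axis=1)

# Toy data: two synthetic, well-separated classes.
X = np.vstack([rng.normal(loc=-1.0, size=(100, 64)),
               rng.normal(loc=+1.0, size=(100, 64))])
y = np.array([0] * 100 + [1] * 100)

W = train_head(encode(X), y, n_classes=2)
acc = (predict(W, X) == y).mean()
print(f"train accuracy: {acc:.2f}")
```

Because the backbone stays fixed, retraining for a different demographic segmentation scheme only requires fitting a new head, which is what makes the pipeline cheap to audit and reproduce.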
Problem

Research questions and friction points this paper is trying to address.

Improving demographic inference reliability for accurate fairness evaluation
Addressing bias and variance in face analysis system fairness audits
Developing robust demographic attribute inference across multiple segmentation schemes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular transfer learning replaces end-to-end training
Integrates pretrained encoders with non-linear classifiers
Introduces robustness via intra-identity consistency metric
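The intra-identity consistency idea above can be expressed as a simple score: a reliable demographic attribute inference system should assign the same label to every image of the same person. The sketch below is one plausible formulation (average, over identities, of the share of predictions agreeing with that identity's majority label); the paper's exact definition may differ, and it works for any label set, matching the claim that the metric applies to any segmentation scheme.

```python
from collections import Counter

def intra_identity_consistency(identities, predictions):
    """Average, over identities, of the fraction of predictions that
    agree with that identity's majority-voted label. 1.0 means every
    image of each person receives the same demographic label; lower
    values indicate unstable inference."""
    per_id = {}
    for pid, pred in zip(identities, predictions):
        per_id.setdefault(pid, []).append(pred)
    scores = []
    for preds in per_id.values():
        majority_count = Counter(preds).most_common(1)[0][1]
        scores.append(majority_count / len(preds))
    return sum(scores) / len(scores)

# Identity "a" is labelled consistently; identity "b" flips once.
ids   = ["a", "a", "a", "b", "b", "b", "b"]
preds = [ 0,   0,   0,   1,   1,   0,   1 ]
print(intra_identity_consistency(ids, preds))  # a: 3/3, b: 3/4 -> 0.875
```

Note that the score needs only predictions and identity labels, not ground-truth attributes, so it can be computed on any identity-annotated dataset.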
Alexandre Fournier-Montgieux
Université Paris-Saclay, CEA, LIST, F-91120, Palaiseau, France
Hervé Le Borgne
CEA List, France
multimedia content analysis, multimedia information retrieval, zero-shot learning, computer vision
Adrian Popescu
CEA LIST, France
incremental learning, semi-supervised learning, privacy, multimedia information retrieval
Bertrand Luvison
Université Paris-Saclay, CEA, LIST, F-91120, Palaiseau, France