Reliable fairness auditing with semi-supervised inference

📅 2025-05-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Biomedical machine learning models frequently exhibit subgroup bias, and fairness auditing typically requires large labeled datasets that are costly to obtain. This paper proposes Infairness, a semi-supervised fairness auditing framework that leverages a small number of labeled samples alongside abundant unlabeled data. It imputes missing outcomes via regression with carefully chosen nonlinear basis functions and unifies the estimation of multiple fairness metrics, including statistical parity and equal opportunity, within a single modeling framework. The authors establish that the proposed estimator is consistent regardless of whether the imputation model is correctly specified and, when the imputation model is well specified, asymptotically more efficient than fully supervised alternatives. Empirical evaluation demonstrates (i) consistently higher precision in synthetic experiments and (ii) up to a 64% reduction in estimation variance on a real-world electronic health record dataset for depression phenotyping, substantially improving the reliability and precision of fairness assessment under limited labeling budgets.

📝 Abstract
Machine learning (ML) models often exhibit bias that can exacerbate inequities in biomedical applications. Fairness auditing, the process of evaluating a model's performance across subpopulations, is critical for identifying and mitigating these biases. However, such audits typically rely on large volumes of labeled data, which are costly and labor-intensive to obtain. To address this challenge, we introduce *Infairness*, a unified framework for auditing a wide range of fairness criteria using semi-supervised inference. Our approach combines a small labeled dataset with a large unlabeled dataset by imputing missing outcomes via regression with carefully selected nonlinear basis functions. We show that our proposed estimator is (i) consistent regardless of whether the ML or imputation models are correctly specified and (ii) more efficient than standard supervised estimation with the labeled data when the imputation model is correctly specified. Through extensive simulations, we also demonstrate that Infairness consistently achieves higher precision than supervised estimation. In a real-world application of phenotyping depression from electronic health records data, Infairness reduces variance by up to 64% compared to supervised estimation, underscoring its value for reliable fairness auditing with limited labeled data.
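The imputation-based idea in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' exact estimator: it fits a flexible regression (polynomial basis functions standing in for the paper's "carefully selected" basis) on a small labeled set, imputes the missing outcomes on the unlabeled set, and then evaluates a fairness metric (here the true-positive-rate gap, i.e. equal opportunity) on the full sample. All data and variable names are simulated for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Illustrative setup: X features, A binary sensitive attribute,
# Y true outcome (observed only on a small labeled subset),
# S the audited model's binary prediction.
n_lab, n_unlab = 200, 5000
n = n_lab + n_unlab
X = rng.normal(size=(n, 3))
A = rng.integers(0, 2, size=n)
logit = X[:, 0] + 0.5 * A
Y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
S = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)

labeled = np.zeros(n, dtype=bool)
labeled[:n_lab] = True  # only the first n_lab outcomes are observed

# Step 1: impute Y with a nonlinear-basis regression fit on labeled data.
basis = PolynomialFeatures(degree=2, include_bias=False)
Z = basis.fit_transform(np.column_stack([X, A, S]))
imp = LogisticRegression(max_iter=1000).fit(Z[labeled], Y[labeled])
Y_hat = imp.predict_proba(Z)[:, 1]  # imputed outcome probabilities

# Step 2: estimate equal opportunity (TPR gap between groups) using
# imputed outcomes on the full sample vs. labeled data alone.
def tpr_gap(y, s, a):
    tpr = lambda g: np.sum(s * y * (a == g)) / np.sum(y * (a == g))
    return tpr(1) - tpr(0)

semi_sup = tpr_gap(Y_hat, S, A)                          # uses all n samples
sup_only = tpr_gap(Y[labeled], S[labeled], A[labeled])   # labeled subset only
```

The paper's contribution is the theory around this construction: consistency even under imputation-model misspecification and an efficiency gain over the supervised estimate when the imputation model is correct; this sketch shows only the mechanics.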
Problem

Research questions and friction points this paper is trying to address.

Auditing ML model fairness with limited labeled data
Reducing bias in biomedical applications via semi-supervised inference
Improving efficiency and precision in fairness evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semi-supervised inference for fairness auditing
Imputing missing outcomes via nonlinear regression
Higher precision with limited labeled data
Jianhui Gao
University of Toronto
Statistics
Jessica Gronsbell
Department of Statistics, University of Toronto, Toronto, ON, Canada