🤖 AI Summary
This study addresses fairness assessment during the data generation phase, aiming to mitigate discriminatory bias at the data level rather than only at the model level. We propose the first targeted learning statistical inference framework designed specifically for data generating mechanisms, enabling unbiased and doubly robust estimation of key fairness metrics, including demographic parity, equal opportunity, and conditional mutual information. Unlike conventional approaches that intervene only at the algorithmic level, the method systematically integrates causal inference and nonparametric statistics into the quantitative analysis of data fairness. Experiments on synthetic benchmarks and diverse real-world datasets show that the proposed estimators are accurate and robust, making data fairness evaluation more interpretable, testable, and practical to deploy.
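For orientation, one standard way to formalize two of the metrics named above; the notation ($A$ for the sensitive attribute, $Y$ for the outcome, $X$ for covariates) is ours, not necessarily the paper's:

```latex
% Common data-level formalizations; each criterion is satisfied when the
% quantity equals zero. Notation (A, Y, X) is assumed, not taken from the paper.
\[
\Delta_{\mathrm{DP}} = P(Y = 1 \mid A = 1) - P(Y = 1 \mid A = 0)
\quad \text{(demographic parity gap)}
\]
\[
I(A; Y \mid X) = \mathbb{E}\!\left[\log \frac{p(A, Y \mid X)}{p(A \mid X)\, p(Y \mid X)}\right]
\quad \text{(conditional mutual information)}
\]
```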
📝 Abstract
Data and algorithms have the potential to produce and perpetuate discrimination and disparate treatment. As such, significant effort has been invested in developing approaches to defining, detecting, and eliminating unfair outcomes in algorithms. In this paper, we focus on performing statistical inference for fairness. Prior work in fairness inference has largely focused on inferring the fairness properties of a given predictive algorithm. Here, we expand fairness inference by evaluating fairness in the data generating process itself, referred to here as data fairness. We perform inference on data fairness using targeted learning, a flexible framework for nonparametric inference. We derive estimators for demographic parity, equal opportunity, and conditional mutual information. Additionally, we find that our estimators for probabilistic metrics exhibit double robustness. To validate our approach, we perform several simulations and apply our estimators to real data.
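To make the double-robustness claim concrete, below is a minimal sketch of an AIPW-style doubly robust estimator for a covariate-adjusted demographic-parity gap, ψ = E[E[Y | X, A=1] − E[Y | X, A=0]], on simulated data. This illustrates the generic double-robustness idea only; it is not the paper's targeted-learning estimator, and all variable names, data-generating choices, and nuisance models here are assumptions.

```python
# Illustrative AIPW (doubly robust) estimate of an adjusted demographic-parity
# gap psi = E[ E[Y|X,A=1] - E[Y|X,A=0] ]. A generic sketch of the idea, NOT the
# paper's targeted-learning estimator; data and models below are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulated data: covariates X, sensitive attribute A, binary outcome Y.
n = 5000
X = rng.normal(size=(n, 3))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))          # A depends on X
Y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * A + X[:, 1]))))  # Y depends on A, X

# Nuisance 1: outcome regression E[Y | X, A].
out_model = LogisticRegression().fit(np.column_stack([X, A]), Y)
mu1 = out_model.predict_proba(np.column_stack([X, np.ones(n)]))[:, 1]
mu0 = out_model.predict_proba(np.column_stack([X, np.zeros(n)]))[:, 1]

# Nuisance 2: propensity P(A = 1 | X), clipped away from 0 and 1.
prop_model = LogisticRegression().fit(X, A)
g1 = np.clip(prop_model.predict_proba(X)[:, 1], 1e-3, 1 - 1e-3)

# AIPW estimator: consistent if EITHER nuisance model is correctly specified.
psi_hat = np.mean(
    mu1 - mu0
    + A / g1 * (Y - mu1)
    - (1 - A) / (1 - g1) * (Y - mu0)
)
print(f"Adjusted demographic-parity gap estimate: {psi_hat:.3f}")
```

The estimator stays consistent if either the outcome regression or the propensity model is correct, which is what double robustness buys. Targeted learning goes further by refining the initial nuisance fits in a targeting step so that the final estimator also supports valid confidence intervals.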