A pipeline for enabling path-specific causal fairness in observational health data

📅 2026-01-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a general, model-agnostic causal fairness training framework to address the risk that medical machine learning models may replicate or exacerbate biases in healthcare. For the first time in a clinical context, the approach disentangles direct and indirect pathways of bias by leveraging structural causal models to map path-specific causal fairness constraints onto observational health data. The method employs an unconstrained base model to generate downstream predictions, thereby preserving predictive accuracy while enabling generalizable fairness improvements. Furthermore, it systematically characterizes the trade-off between fairness and accuracy, offering a principled way to balance these often competing objectives in real-world medical applications.

📝 Abstract
When training machine learning (ML) models for potential deployment in a healthcare setting, it is essential to ensure that they do not replicate or exacerbate existing healthcare biases. Although many definitions of fairness exist, we focus on path-specific causal fairness, which allows us to better consider the social and medical contexts in which biases occur (e.g., direct discrimination by a clinician or model versus bias due to differential access to the healthcare system) and to characterize how these biases may appear in learned models. In this work, we map the structural fairness model to the observational healthcare setting and create a generalizable pipeline for training causally fair models. The pipeline explicitly considers the specific healthcare context and disparities to define a target "fair" model. Our work fills two major gaps: first, we expand on characterizations of the "fairness-accuracy" tradeoff by disentangling direct and indirect sources of bias and jointly presenting these fairness considerations alongside considerations of accuracy in the context of broadly known biases. Second, we demonstrate how a foundation model trained without fairness constraints on observational health data can be leveraged to generate causally fair downstream predictions in tasks with known social and medical disparities. This work presents a model-agnostic pipeline for training causally fair machine learning models that address both direct and indirect forms of healthcare bias.
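The abstract describes removing the direct pathway of bias while preserving information that flows through mediators. As a toy illustration only (not the paper's pipeline; the structural causal model, variable names, and linear predictor below are all assumptions for exposition), the following sketch shows how a path-specific (direct) effect of a sensitive attribute on a model's predictions can be measured, and how zeroing only the direct pathway leaves the mediated pathway intact:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical structural causal model (illustrative, not from the paper):
# A: sensitive attribute; M: mediator (e.g., access to care); Y: outcome.
A = rng.integers(0, 2, n).astype(float)
M = 0.6 * A + rng.normal(0, 1, n)             # indirect path A -> M -> Y
Y = 1.0 * A + 0.8 * M + rng.normal(0, 1, n)   # direct path A -> Y plus mediated effect

def predict(a, m, w_a, w_m):
    """Linear predictor standing in for a trained model."""
    return w_a * a + w_m * m

def direct_effect(w_a, w_m):
    """Path-specific (direct) effect of A on the prediction:
    flip A on the direct edge while holding M at its factual value."""
    f1 = predict(np.ones(n), M, w_a, w_m)   # A set to 1 on the direct path
    f0 = predict(np.zeros(n), M, w_a, w_m)  # A set to 0 on the direct path
    return float(np.mean(f1 - f0))

# Unconstrained least-squares fit vs. a "fair" predictor that drops the
# direct A coefficient but keeps using the mediator.
X = np.column_stack([A, M])
w_a, w_m = np.linalg.lstsq(X, Y, rcond=None)[0]

print(f"direct effect, unconstrained: {direct_effect(w_a, w_m):.3f}")
print(f"direct effect, fair:          {direct_effect(0.0, w_m):.3f}")
```

In this linear toy case the direct effect of the unconstrained model recovers the direct coefficient (about 1.0), while the constrained predictor's direct effect is exactly zero even though it still uses the mediator; the paper's contribution is doing this kind of pathway accounting for general, model-agnostic predictors on observational health data.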
Problem

Research questions and friction points this paper is trying to address.

causal fairness
healthcare bias
path-specific fairness
observational health data
fairness-accuracy tradeoff
Innovation

Methods, ideas, or system contributions that make the work stand out.

path-specific causal fairness
observational health data
fairness-accuracy tradeoff
causal fairness pipeline
healthcare bias
Aparajita Kashyap
Columbia University Department of Biomedical Informatics, USA
Sara Matijevic
University of Oxford Big Data Institute, UK
Noémie Elhadad
Associate Professor and Chair of Biomedical Informatics, Columbia University
machine learning for healthcare, health informatics, natural language processing, biomedical informatics, women's health
Steven A. Kushner
Columbia University Department of Psychiatry, USA
Shalmali Joshi
Columbia University
Artificial Intelligence, Machine Learning, Biomedical Sciences, Clinical Informatics