🤖 AI Summary
Pulse oximeters tend to overestimate oxygen saturation in patients with darker skin, which can misrepresent supplemental oxygen needs and may distort invasive ventilation decisions in the ICU. This study uses path-specific effects to causally isolate how racial discrepancies in oximetry measurements affect invasive ventilation, in terms of both ventilation rates and ventilation duration. It leverages a doubly robust estimator, proposes a self-normalized variant for improved sample efficiency, and provides novel finite-sample guarantees. The methodology is validated on semi-synthetic data and applied to two large real-world datasets, MIMIC-IV and eICU. Contrary to prior work, the analysis finds minimal impact of racial discrepancies on invasive ventilation rates, whereas path-specific effects mediated by oxygen saturation disparity are more pronounced for ventilation duration, with severity differing by dataset. These findings highlight the necessity of causal methods for robustly assessing fairness in clinical decision-making.
📝 Abstract
Identifying and measuring biases associated with sensitive attributes is a crucial consideration in healthcare to prevent treatment disparities. One prominent issue is inaccurate pulse oximeter readings, which tend to overestimate oxygen saturation for dark-skinned patients and misrepresent supplemental oxygen needs. Most existing research has revealed statistical disparities linking device errors to patient outcomes in intensive care units (ICUs) without causal formalization. In contrast, this study causally investigates how racial discrepancies in oximetry measurements affect invasive ventilation in ICU settings. We employ a causal inference-based approach using path-specific effects to isolate the impact of bias by race on clinical decision-making. To estimate these effects, we leverage a doubly robust estimator, propose its self-normalized variant for improved sample efficiency, and provide novel finite-sample guarantees. Our methodology is validated on semi-synthetic data and applied to two large real-world health datasets: MIMIC-IV and eICU. Contrary to prior work, our analysis reveals minimal impact of racial discrepancies on invasive ventilation rates. However, path-specific effects mediated by oxygen saturation disparity are more pronounced on ventilation duration, and the severity differs by dataset. Our work provides a novel and practical pipeline for investigating potential disparities in the ICU and, more crucially, highlights the necessity of causal methods to robustly assess fairness in decision-making.
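To make the idea of a self-normalized doubly robust estimator concrete, the following is a minimal sketch in a deliberately simplified setting. It is not the paper's path-specific estimator: it assumes a single binary treatment and targets an ordinary average treatment effect, and the function names (`self_normalized_dr`, `simulate`) and the simulated data are hypothetical. The only point illustrated is that the inverse-probability-weighted residual correction is divided by the sum of the weights rather than by the sample size.

```python
import math
import random


def self_normalized_dr(y, a, e, mu1, mu0):
    """Self-normalized doubly robust estimate of E[Y(1)] - E[Y(0)].

    y: outcomes; a: binary treatment indicators (0/1);
    e: estimated propensities P(A=1 | X), assumed strictly in (0, 1);
    mu1, mu0: outcome-model predictions under a=1 and a=0.
    Assumes both arms contain at least one unit.
    """
    n = len(y)
    # Inverse-probability weights for each treatment arm.
    w1 = [ai / ei for ai, ei in zip(a, e)]
    w0 = [(1 - ai) / (1 - ei) for ai, ei in zip(a, e)]
    # Weighted residual corrections, normalized by the sum of weights
    # (self-normalization) instead of by n.
    corr1 = sum(w * (yi - m) for w, yi, m in zip(w1, y, mu1)) / sum(w1)
    corr0 = sum(w * (yi - m) for w, yi, m in zip(w0, y, mu0)) / sum(w0)
    return (sum(mu1) / n + corr1) - (sum(mu0) / n + corr0)


def simulate(n=20000, effect=2.0, seed=0):
    """Hypothetical data: one confounder x, true treatment effect `effect`."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    e = [1 / (1 + math.exp(-0.5 * xi)) for xi in x]  # true propensities
    a = [1 if rng.random() < ei else 0 for ei in e]
    y = [effect * ai + xi + rng.gauss(0, 1) for ai, xi in zip(a, x)]
    return x, e, a, y


x, e, a, y = simulate()
# Deliberately misspecified outcome models (all zeros) paired with correct
# propensities: the weighting term alone should recover the effect.
zeros = [0.0] * len(y)
estimate = self_normalized_dr(y, a, e, zeros, zeros)
print(round(estimate, 2))
```

In this toy setup the estimate should land close to the simulated effect of 2.0 even though the outcome model is wrong, which is the "doubly robust" property: the estimator remains consistent if either the propensity model or the outcome model is correct.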