Bias Delayed is Bias Denied? Assessing the Effect of Reporting Delays on Disparity Assessments

📅 2025-06-16
🤖 AI Summary
This study identifies a systemic bias in healthcare equity assessment arising from delayed reporting of demographic attributes (e.g., race/ethnicity). Leveraging 5 million real-world electronic health records, we first quantify the heterogeneous, spatiotemporally uneven patterns of such delays across population subgroups and develop a missingness mechanism model alongside a multi-level (national/state/clinic) temporal attribution framework. We find substantial average delays with pronounced intergroup disparities, causing directional misclassification in over 32% of state-level and 68% of clinic-level health disparity conclusions. Conventional imputation reduces misclassification by only 11%, underscoring the critical impact of timeliness bias. The paper introduces a novel fairness evaluation paradigm explicitly accounting for data pipeline latency, establishing both a methodological foundation and practical guidelines for real-world equity auditing.

📝 Abstract
Conducting disparity assessments at regular time intervals is critical for surfacing potential biases in decision-making and improving outcomes across demographic groups. Because disparity assessments fundamentally depend on the availability of demographic information, their efficacy is limited by the availability and consistency of demographic identifiers. While prior work has considered the impact of missing data on fairness, little attention has been paid to the role of delayed demographic data. Delayed data, while eventually observed, might be missing at the critical point of monitoring and action -- and delays may be unequally distributed across groups in ways that distort disparity assessments. We characterize such impacts in healthcare, using electronic health records of over 5M patients across primary care practices in all 50 states. Our contributions are threefold. First, we document the high rate of race and ethnicity reporting delays in a healthcare setting and demonstrate widespread variation in rates at which demographics are reported across different groups. Second, through a set of retrospective analyses using real data, we find that such delays impact disparity assessments and hence conclusions made across a range of consequential healthcare outcomes, particularly at more granular levels of state-level and practice-level assessments. Third, we find limited ability of conventional methods that impute missing race in mitigating the effects of reporting delays on the accuracy of timely disparity assessments. Our insights and methods generalize to many domains of algorithmic fairness where delays in the availability of sensitive information may confound audits, thus deserving closer attention within a pipeline-aware machine learning framework.
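To make the mechanism in the abstract concrete, the following is a minimal sketch of how a disparity computed at monitoring time can differ from the one computed after all demographics have arrived. It is not the authors' method: the `Record` fields, the helper names `group_rate` and `disparity`, the 30-day monitoring cutoff, and the toy data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    group: str       # demographic group as eventually reported (hypothetical field)
    report_day: int  # day on which the group was recorded (hypothetical field)
    outcome: int     # 1 if the outcome of interest occurred, else 0


def group_rate(records: List[Record], group: str,
               as_of_day: Optional[int] = None) -> Optional[float]:
    """Outcome rate for `group`, restricted to records whose demographics
    were reported on or before `as_of_day` (None means use all records)."""
    visible = [
        r for r in records
        if r.group == group and (as_of_day is None or r.report_day <= as_of_day)
    ]
    if not visible:
        return None  # no demographics available yet for this group
    return sum(r.outcome for r in visible) / len(visible)


def disparity(records: List[Record], group_a: str, group_b: str,
              as_of_day: Optional[int] = None) -> Optional[float]:
    """Difference in outcome rates (group_a minus group_b)."""
    rate_a = group_rate(records, group_a, as_of_day)
    rate_b = group_rate(records, group_b, as_of_day)
    if rate_a is None or rate_b is None:
        return None
    return rate_a - rate_b


# Toy data (assumed): group A is reported promptly, group B mostly late.
records = [
    Record("A", 0, 1), Record("A", 0, 1), Record("A", 0, 0), Record("A", 0, 0),
    Record("B", 5, 1), Record("B", 90, 0), Record("B", 100, 0), Record("B", 120, 0),
]

timely = disparity(records, "A", "B", as_of_day=30)  # only data visible at day 30
eventual = disparity(records, "A", "B")              # all data, after delays resolve

print(f"timely disparity:   {timely}")
print(f"eventual disparity: {eventual}")
```

In this toy example the sign of the estimated disparity at day 30 is the opposite of the eventual one, because the only group B record visible at the cutoff is not representative of the group. This illustrates why unequally distributed delays can distort a timely assessment even though no data is permanently missing.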
Problem

Research questions and friction points this paper is trying to address.

Assessing impact of delayed demographic data on disparity assessments
Documenting unequal race/ethnicity reporting delays in healthcare settings
Evaluating limitations of conventional methods in mitigating delay effects
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzing race reporting delays in healthcare data
Assessing delay impacts on disparity evaluations
Testing conventional imputation methods' limitations
Jennah Gosciak
Cornell University
Aparna Balagopalan
Massachusetts Institute of Technology
Derek Ouyang
Stanford University
Allison Koenecke
Asst. Prof., Cornell University
Marzyeh Ghassemi
Massachusetts Institute of Technology
Daniel E. Ho
Stanford University
Regulatory policy, artificial intelligence, administrative law, antidiscrimination