What's the Weight? Estimating Controlled Outcome Differences in Complex Surveys for Health Disparities Research

📅 2024-06-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses bias that arises in complex surveys when a sensitive variable such as race is associated with the survey sampling weights. The authors propose an identification framework for estimating the average controlled difference (ACD). Methodologically, the work tackles the problem that propensity score modeling must account for survey weights that differ by group; it integrates inverse probability weighting, stratified or calibrated survey weighting, multiple imputation, and robust variance estimation, and implements these in the R package *svycdiff*. Applied to NHANES data on telomere length differences between Black and White adults, the approach adjusts for socioeconomic confounders and weight-induced bias, and the estimated disparity attenuates. Simulation studies show that the proposed methods outperform conventional approaches in terms of bias, mean squared error, and confidence interval coverage, while weighting for covariate balance and generalizability to the target population.

📝 Abstract
A basic descriptive question in statistics often asks whether there are differences in mean outcomes between groups based on levels of a discrete covariate (e.g., racial disparities in health outcomes). However, when this categorical covariate of interest is correlated with other factors related to the outcome, direct comparisons may lead to biased estimates and invalid inferential conclusions without appropriate adjustment. Propensity score methods are broadly employed with observational data as a tool to achieve covariate balance, but how to implement them in complex surveys is less studied, in particular when the survey weights depend on the group variable under comparison. In this work, we focus on a specific example in which sample selection depends on race. We propose identification formulas to properly estimate the average controlled difference (ACD) in outcomes between Black and White individuals, with appropriate weighting for covariate imbalance across the two racial groups and for generalizability. Via extensive simulation, we show that our proposed methods outperform traditional analytic approaches in terms of bias, mean squared error, and coverage. We are motivated by the interplay between race and social determinants of health when estimating racial differences in telomere length using data from the National Health and Nutrition Examination Survey. We build a propensity model for race to properly adjust for other social determinants while characterizing the controlled effect of race on telomere length. We find that evidence of racial differences in telomere length between Black and White individuals attenuates after accounting for confounding by socioeconomic factors and after utilizing appropriate propensity score and survey weighting techniques. Software to implement these methods can be found in the R package svycdiff at https://github.com/salernos/svycdiff.
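For orientation, the ACD named in the abstract is commonly written in the propensity score literature as a covariate-averaged difference in conditional means. The notation below is an assumption introduced here, not taken from this page: Y is the outcome, A is a binary group indicator, and X is the vector of adjustment covariates. The paper's own identification formulas, which also account for survey weights that depend on the group variable, are not reproduced here.

```latex
\[
\mathrm{ACD}
  = \mathbb{E}_{X}\!\left[
      \mathbb{E}(Y \mid A = 1, X) - \mathbb{E}(Y \mid A = 0, X)
    \right]
\]
```

The outer expectation is taken over the covariate distribution of the target population. This is why both covariate balance and generalizability enter the weighting.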
Problem

Research questions and friction points this paper is trying to address.

Estimating racial disparities in telomere length
Addressing complex survey weighting challenges
Properly adjusting for socioeconomic confounding factors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Propensity score methods for covariate balance
Identification formulas for outcome differences
Combining survey weights with propensity scores
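The idea of combining survey weights with propensity scores can be sketched with a generic Hájek-type weighted difference in means. This is a minimal illustration under stated assumptions, not the authors' estimator and not the svycdiff API: the function names, the toy data, and the rule of multiplying each survey weight by an inverse propensity weight are all hypothetical choices made here, and the sketch does not handle the case the paper focuses on, where the survey weights themselves depend on the group variable.

```python
def weighted_mean(values, weights):
    """Weighted mean, normalized by the sum of the weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total


def controlled_difference(y, group, propensity, survey_weight):
    """Hypothetical helper: weighted mean of y in group 1 minus group 0.

    Each unit's weight is its survey weight times an inverse propensity
    weight: 1 / e for group 1 and 1 / (1 - e) for group 0, where e is the
    estimated probability of belonging to group 1 given covariates.
    """
    y1, w1, y0, w0 = [], [], [], []
    for yi, ai, ei, si in zip(y, group, propensity, survey_weight):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensities must lie strictly between 0 and 1")
        if ai == 1:
            y1.append(yi)
            w1.append(si / ei)
        else:
            y0.append(yi)
            w0.append(si / (1.0 - ei))
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)


# Toy data (made up for illustration only).
y = [2.0, 4.0, 1.0, 1.0]
group = [1, 1, 0, 0]
propensity = [0.5, 0.25, 0.5, 0.5]
survey_weight = [1.0, 1.0, 2.0, 2.0]

estimate = controlled_difference(y, group, propensity, survey_weight)
print(estimate)
```

Normalizing by the sum of the weights within each group keeps the estimate on the scale of the outcome even when the combined weights are large. In practice the propensities would come from a fitted model and the variance would need a design-based or robust estimator, neither of which is shown here.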
Authors
Stephen Salerno
Division of Public Health Sciences, Biostatistics, Fred Hutchinson Cancer Center, Seattle, WA
Emily K. Roberts
Department of Biostatistics, University of Iowa, Iowa City, IA
Belinda L. Needham
Department of Epidemiology, University of Michigan, Ann Arbor, MI
Tyler H. McCormick
University of Washington
statistics, data science, Bayesian modeling, social networks, global health
B. Mukherjee
Department of Biostatistics, Department of Epidemiology, Department of Statistics and Data Science, Yale University, New Haven, CT
Xu Shi
University of Michigan
Electronic Health Record, Causal Inference, Negative Control, Machine Translation