On linkage bias-correction for estimators using iterated bootstraps

📅 2025-11-07

📈 Citations: 0

✨ Influential: 0

career value

271K/year

🤖 AI Summary

Probabilistic record linkage introduces linkage errors due to the absence of error-free linking variables, leading to biased downstream statistical inference. To address this, we propose an Iterative Bootstrap–based framework for linkage bias correction—the first systematic application of iterative bootstrap to bias mitigation in multi-source data integration. Our method comprises probabilistic linkage, iterative resampling, construction of bias-corrected estimators, and associated statistical testing. Crucially, we introduce a novel variance–bias trade-off diagnostic test that automatically detects when further iterations inflate variance without reducing bias, thereby enhancing estimator robustness. Experiments on simulated hormonal data and real-world linked administrative data from the Australian Bureau of Statistics’ Labour Mobility Survey demonstrate that our approach significantly reduces linkage-induced bias while effectively constraining variance inflation.

Technology Category

Application Category

📝 Abstract

By amalgamating data from disparate sources, the resulting integrated dataset becomes a valuable resource for statistical analysis. In probabilistic record linkage, the effectiveness of such integration relies on the availability of linkage variables free from errors. Where this is lacking, the linked data set would suffer from linkage errors and the resultant analyses, linkage bias. This paper proposes a methodology leveraging the bootstrap technique to devise linkage bias-corrected estimators. Additionally, it introduces a test to assess whether increasing the number of bootstrap iterations meaningfully reduces linkage bias or merely inflates variance without further improving accuracy. An application of these methodologies is demonstrated through the analysis of a simulated dataset featuring hormone information, along with a dataset obtained from linking two data sets from the Australian Bureau of Statistics'labour mobility surveys.

Problem

Research questions and friction points this paper is trying to address.

Correcting linkage bias in estimators using bootstrap techniques

Testing if more bootstrap iterations meaningfully reduce bias

Addressing linkage errors when integrating disparate data sources

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses bootstrap technique for bias correction

Introduces test for bootstrap iteration effectiveness

Applies methods to simulated and real datasets

🔎 Similar Papers

No similar papers found.