Refining capture-recapture methods to estimate case counts in a finite population setting

📅 2025-10-31

📈 Citations: 0

✨ Influential: 0

career value

172K/year

🤖 AI Summary

Capture-recapture estimation in finite, closed populations—such as disease surveillance data containing only positive test results—is severely biased due to non-representative capture sources. Method: We propose two finite-population correction strategies and construct corrected Bayesian credible intervals by integrating a non-representative primary data stream with a representative random-sample anchor dataset, incorporating theoretical modeling and finite-population adjustments to improve unbiasedness and precision. Contribution/Results: This work is the first to systematically address non-representative capture-source bias in finite populations, providing computationally tractable analytical correction formulas and a unified Bayesian inference framework. Simulation studies show that our method reduces mean squared error by an average of 37%. Applied to real-world breast cancer recurrence surveillance data, it yields case estimates closer to independent validation benchmarks, and improves credible interval coverage from 68% to 94%.

Technology Category

Application Category

📝 Abstract

In this paper, we expand upon and refine a monitoring strategy proposed for surveillance of diseases in finite, closed populations. This monitoring strategy consists of augmenting an arbitrarily non-representative data stream (such as a voluntary flu testing program) with a random sample (referred to as an "anchor stream"). This design allows for the use of traditional capture-recapture (CRC) estimators, as well as recently proposed anchor stream estimators that more efficiently utilize the data. Here, we focus on a particularly common situation in which the first data stream only records positive test results, while the anchor stream documents both positives and negatives. Due to the non-representative nature of the first data stream, along with the fact that inference is being performed on a finite, closed population, there are standard and non-standard finite population effects at play. Here, we propose two methods of incorporating finite population corrections (FPCs) for inference, along with an FPC-adjusted Bayesian credible interval. We compare these approaches with existing methods through simulation and demonstrate that the FPC adjustments can lead to considerable gains in precision. Finally, we provide a real data example by applying these methods to estimating the breast cancer recurrence count among Metro Atlanta-area patients in the Georgia Cancer Registry-based Cancer Recurrence Information and Surveillance Program (CRISP) database.

Problem

Research questions and friction points this paper is trying to address.

Refining capture-recapture methods for finite population disease surveillance

Developing finite population corrections for non-representative data streams

Improving precision in estimating disease counts through anchor sampling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Augmenting non-representative data with random anchor stream

Applying finite population corrections for improved inference

Developing FPC-adjusted Bayesian credible intervals for precision

🔎 Similar Papers

No similar papers found.