Statistical method for pooling categorical biomarkers from multi-center matched/nested case-control studies

📅 2025-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In multicenter matched/nested case-control studies, directly pooling categorical biomarker measurements can bias regression estimates, because centers differ in assay platform, laboratory, and measurement procedure. To address this, we propose a likelihood-based method that builds study-specific calibration into the biomarker-disease model, together with a sandwich variance estimator that accounts for the additional uncertainty from the calibration step and yields valid asymptotic variances for the estimated regression parameters. Simulation studies with varying sample sizes and biomarker-disease associations evaluate the finite-sample performance of the method. We illustrate it on data from a vitamin D pooling project for colorectal cancer.

📝 Abstract

Pooled analyses that aggregate data from multiple studies are becoming increasingly common in collaborative epidemiologic research as a way to increase the size and diversity of the study population. However, biomarker measurements from different studies are subject to systematic measurement errors, and directly pooling them for analysis may lead to biased estimates of the regression parameters. Therefore, study-specific calibration processes must be incorporated in the statistical analyses to address between-study/assay/laboratory variability in the biomarker measurements. We propose a likelihood-based method to evaluate biomarker-disease relationships for categorical biomarkers in matched/nested case-control studies. To account for the additional uncertainties from the calibration processes, we propose a sandwich variance estimator to obtain valid asymptotic variances of the estimated regression parameters. Extensive simulation studies with varying sample sizes and biomarker-disease associations are used to evaluate the finite-sample performance of our proposed methods. As an illustration, we apply the methods to a vitamin D pooling project of colorectal cancer to evaluate the effect of categorical vitamin D levels on colorectal cancer risk.
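
For orientation, the two building blocks named in the abstract can be written in their standard textbook forms. This is a sketch under assumed notation, not the paper's own derivation: matched set i has one case (j = 0) and M_i controls, x_ij is a vector of biomarker category indicators, β holds the log odds ratios, and θ stacks β with the calibration parameters. The paper's likelihood replaces x_ij with calibrated quantities, and its calibration data need not be indexed by the same n; neither detail is reproduced here.

```latex
% Standard conditional likelihood for n matched sets;
% j = 0 indexes the case, j = 1..M_i the matched controls.
L(\beta) = \prod_{i=1}^{n}
  \frac{\exp\!\left(x_{i0}^{\top}\beta\right)}
       {\sum_{j=0}^{M_i} \exp\!\left(x_{ij}^{\top}\beta\right)}

% Generic sandwich variance for stacked estimating equations
% \sum_i \psi_i(\theta) = 0, with \theta = (\beta, calibration parameters).
\widehat{\mathrm{Var}}(\hat\theta)
  = \frac{1}{n}\,\hat A^{-1}\,\hat B\,\hat A^{-\top},
\qquad
\hat A = -\frac{1}{n}\sum_{i=1}^{n}
  \frac{\partial \psi_i(\hat\theta)}{\partial \theta^{\top}},
\qquad
\hat B = \frac{1}{n}\sum_{i=1}^{n}
  \psi_i(\hat\theta)\,\psi_i(\hat\theta)^{\top}
```

Stacking the calibration equations with the disease-model score is what lets the "meat" term B carry the calibration uncertainty into the variance of the estimated β.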

Problem

Research questions and friction points this paper is trying to address.

Addressing systematic errors in multi-center biomarker pooling

Developing calibration methods for between-study assay variability

Evaluating categorical biomarker-disease relationships in case-control studies
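
To make the calibration idea concrete, here is a minimal Python sketch of one common two-step approach: fit a study-specific line that maps local assay values to a reference-lab scale, then categorize the calibrated values. It rests on assumptions not stated in the summary: that each study re-assays a subset of samples at a reference laboratory, that the relationship is linear, and that the cutpoints are fixed in advance. The function names `fit_calibration` and `calibrate_and_categorize` are hypothetical, and the paper's method embeds calibration in the likelihood rather than plugging in calibrated categories like this.

```python
import numpy as np

def fit_calibration(local, reference):
    """Least-squares line reference ~ a + b * local for one study's calibration subset."""
    b, a = np.polyfit(local, reference, deg=1)  # polyfit returns highest degree first
    return a, b

def calibrate_and_categorize(local, a, b, cutpoints):
    """Map local measurements to the reference scale, then bin them.

    Returns 0 for values below the first cutpoint, 1 for the next bin, and so on.
    """
    calibrated = a + b * np.asarray(local, dtype=float)
    return np.digitize(calibrated, cutpoints)

# Illustrative numbers only (not from the paper): an exactly linear calibration subset.
a, b = fit_calibration([10, 20, 30, 40], [16, 27, 38, 49])
categories = calibrate_and_categorize([20, 50, 70], a, b, cutpoints=[50, 75])
print(categories)  # [0 1 2]
```

A plug-in approach like this treats the estimated `a` and `b` as known, which is the source of the extra uncertainty that a calibration-aware variance estimator is meant to capture.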

Innovation

Methods, ideas, or system contributions that make the work stand out.

Likelihood-based method for categorical biomarkers

Sandwich variance estimator for calibration uncertainties

Study-specific calibration for between-study variability
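
The sandwich idea listed above can be illustrated with a generic Huber-White estimator for an ordinary logistic regression on category indicators. This is a sketch of the general technique only: it uses an unconditional (unmatched) logistic model on simulated data and omits the calibration estimating equations, so it is not the estimator proposed in the paper. All names (`fit_logistic`, `sandwich_variance`) and the simulated coefficients are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of a logistic regression; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)
        info = X.T @ (X * (p * (1 - p))[:, None])
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def sandwich_variance(X, y, beta):
    """Huber-White variance: bread @ meat @ bread."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    bread = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))  # inverse information
    scores = X * (y - p)[:, None]                              # per-observation scores
    meat = scores.T @ scores                                   # sum of score outer products
    return bread @ meat @ bread

# Simulated three-level categorical exposure (category 0 is the reference level).
rng = np.random.default_rng(1)
n = 3000
category = rng.integers(0, 3, size=n)
X = np.column_stack([np.ones(n), category == 1, category == 2]).astype(float)
true_beta = np.array([-0.5, 0.4, 0.8])  # assumed values, for illustration only
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

beta_hat = fit_logistic(X, y)
V = sandwich_variance(X, y, beta_hat)
robust_se = np.sqrt(np.diag(V))
print(beta_hat, robust_se)
```

In a calibration-aware version, the score vector for each unit would be extended with the calibration model's estimating functions before forming the bread and meat, so that the resulting standard errors for the disease-model coefficients reflect both sources of variability.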

🔎 Similar Papers

Bayesian Meta-Learning for Improving Generalizability of Health Prediction Models With Similar Causal Mechanisms