🤖 AI Summary
This study addresses the systematic, country-specific misclassification bias of computer-coded verbal autopsy (CCVA) algorithms when estimating cause-specific mortality fractions (CSMFs). To correct this, the authors introduce the R package 'vacalibration', which implements a modular Bayesian calibration framework. It uses country-specific misclassification matrices with quantified uncertainty, derived from CHAMPS project data, to adjust CCVA outputs. The approach accommodates multiple algorithms (InSilicoVA, InterVA, and EAVA), supports ensemble calibration, and is compatible with the openVA ecosystem. Applications to real-world data from COMSA-Mozambique and CA CODE demonstrate the package's flexibility as a tool for global health monitoring.
📝 Abstract
Accurate estimation of cause-specific mortality fractions (CSMFs), the percentage of deaths attributable to each cause in a population, is essential for global health monitoring. A challenge arises because computer-coded verbal autopsy (CCVA) algorithms, commonly used to estimate CSMFs, frequently misclassify the cause of death (COD). The misclassification follows structured patterns and varies substantially across countries, which makes it harder to correct. To address this, we introduce the R package 'vacalibration'. It implements a modular Bayesian framework that corrects for the misclassification, thereby yielding more accurate CSMF estimates from verbal autopsy (VA) questionnaire data.
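The following is a minimal Python sketch of why misclassification biases CSMFs and how a known misclassification matrix can undo the bias. It is not the package's Bayesian implementation. It assumes a hypothetical two-cause setting with a fixed matrix `M`, where `M[i][j]` is the probability that a death from true cause `i` is assigned cause `j`. All function names and numbers are illustrative assumptions.

```python
def observed_csmf(p, M):
    """Expected CCVA-assigned fractions: q[j] = sum_i p[i] * M[i][j]."""
    n = len(p)
    return [sum(p[i] * M[i][j] for i in range(n)) for j in range(n)]


def calibrate_two_cause(q, M):
    """Recover the true CSMF p from the uncalibrated CSMF q for two causes.

    Uses q[0] = p0 * M[0][0] + (1 - p0) * M[1][0], solved for p0.
    """
    denom = M[0][0] - M[1][0]
    if abs(denom) < 1e-12:
        raise ValueError("classifier carries no information; cannot calibrate")
    p0 = (q[0] - M[1][0]) / denom
    p0 = min(1.0, max(0.0, p0))  # keep the estimate a valid fraction
    return [p0, 1.0 - p0]


# Hypothetical misclassification matrix (rows: true cause, columns: assigned cause).
M = [[0.8, 0.2],
     [0.3, 0.7]]
true_p = [0.3, 0.7]                   # hypothetical true CSMF

q = observed_csmf(true_p, M)          # biased, uncalibrated CSMF
calibrated = calibrate_two_cause(q, M)

print("uncalibrated:", q)
print("calibrated:  ", calibrated)
```

With these numbers the uncalibrated fraction for the first cause is about 0.45 although the true value is 0.30, and inverting the matrix recovers the true value. In practice the matrix is itself estimated with error, which is why the package treats it probabilistically instead of plugging in a single point estimate.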
The package utilizes uncertainty-quantified CCVA misclassification matrix estimates derived from data collected in the CHAMPS project and available on the 'CCVA-Misclassification-Matrices' GitHub repository. Currently, these matrices cover three CCVA algorithms (EAVA, InSilicoVA, and InterVA) and two age groups (neonates aged 0-27 days, and children aged 1-59 months) across countries (specific estimates for Bangladesh, Ethiopia, Kenya, Mali, Mozambique, Sierra Leone, and South Africa, and a combined estimate for all other countries), enabling global calibration. The 'vacalibration' package also supports ensemble calibration when multiple algorithms are available.
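To illustrate what an uncertainty-quantified misclassification matrix adds, the sketch below propagates matrix uncertainty into the calibrated estimate by simple Monte Carlo. This is a simplified stand-in, not the package's model: it assumes a hypothetical two-cause validation table, places an independent Dirichlet posterior with a uniform prior on each matrix row, and treats the uncalibrated fraction `q0` as fixed, ignoring its sampling error. The counts and names are invented for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def calibrate_two_cause(q0, M):
    """Solve q0 = p0 * M[0][0] + (1 - p0) * M[1][0] for p0, clipped to [0, 1]."""
    denom = M[0][0] - M[1][0]
    if abs(denom) < 1e-12:
        return None  # this draw of M is uninformative; skip it
    return min(1.0, max(0.0, (q0 - M[1][0]) / denom))


# Hypothetical validation counts (rows: true cause, columns: CCVA-assigned cause).
counts = [[80, 20],
          [30, 70]]
q0 = 0.45  # hypothetical uncalibrated fraction for the first cause

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = []
for _ in range(2000):
    # Row-wise Dirichlet posterior under a uniform prior: alpha = counts + 1.
    M = [sample_dirichlet([c + 1 for c in row], rng) for row in counts]
    p0 = calibrate_two_cause(q0, M)
    if p0 is not None:
        draws.append(p0)

draws.sort()
lower = draws[int(0.025 * len(draws))]
median = draws[len(draws) // 2]
upper = draws[int(0.975 * len(draws)) - 1]
print(f"calibrated fraction: median {median:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

The width of the resulting interval reflects how precisely the validation data pin down the matrix. Smaller validation samples, as may be the case for a single country, would give wider intervals.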
Implemented using 'RStan', the package offers rapid computation, uncertainty quantification, and seamless compatibility with openVA, a leading COD analysis software ecosystem. We demonstrate the package's flexibility with two real-world applications in COMSA-Mozambique and CA CODE. The package and its foundational methodology apply more broadly and can calibrate any discrete classifier or ensemble of classifiers.