LA-VA: Language Model Assisted Verbal Autopsy for Cause-of-Death Determination

📅 2025-09-11
🤖 AI Summary
In resource-limited settings where medical certification is unavailable, verbal autopsy (VA) is a critical tool for estimating causes of death. We propose LA-VA, a proof-of-concept pipeline that combines an off-the-shelf LLM (GPT-5), text embedding models, the LCVA baseline, and meta-learner ensembles for cause-of-death prediction. Evaluated on the PHMRC gold-standard dataset, GPT-5 is the strongest individual approach, with average test-site accuracies of 48.6%, 50.5%, and 53.5% for the adult, child, and neonatal subgroups, respectively, outperforming traditional statistical machine learning baselines by 5-10%. These results suggest that simple, off-the-shelf LLM-assisted approaches could substantially improve VA accuracy for public health surveillance in low-resource contexts.

📝 Abstract
Verbal autopsy (VA) is a critical tool for estimating causes of death in resource-limited settings where medical certification is unavailable. This study presents LA-VA, a proof-of-concept pipeline that combines Large Language Models (LLMs) with traditional algorithmic approaches and embedding-based classification for improved cause-of-death prediction. Using the Population Health Metrics Research Consortium (PHMRC) dataset across three age categories (Adult: 7,580; Child: 1,960; Neonate: 2,438), we evaluate multiple approaches: GPT-5 predictions, LCVA baseline, text embeddings, and meta-learner ensembles. Our results demonstrate that GPT-5 achieves the highest individual performance with average test site accuracies of 48.6% (Adult), 50.5% (Child), and 53.5% (Neonate), outperforming traditional statistical machine learning baselines by 5-10%. Our findings suggest that simple off-the-shelf LLM-assisted approaches could substantially improve verbal autopsy accuracy, with important implications for global health surveillance in low-resource settings.
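The abstract reports "average test site accuracies". A minimal sketch of one plausible reading follows: accuracy is computed separately for each held-out site and then averaged without weighting. The exact averaging scheme is an assumption (it is not spelled out here), and the function name, site names, and toy records are hypothetical, made up purely for illustration.

```python
from collections import defaultdict


def mean_site_accuracy(records):
    """Return (unweighted mean of per-site accuracy, per-site accuracy dict).

    records: iterable of (site, true_cause, predicted_cause) tuples.
    Assumption: every site counts equally, regardless of its sample size.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for site, truth, pred in records:
        total[site] += 1
        correct[site] += int(truth == pred)
    per_site = {site: correct[site] / total[site] for site in total}
    return sum(per_site.values()) / len(per_site), per_site


# Hypothetical toy records, not taken from PHMRC.
toy_records = [
    ("site_a", "stroke", "stroke"),
    ("site_a", "malaria", "pneumonia"),
    ("site_b", "malaria", "malaria"),
    ("site_b", "stroke", "stroke"),
    ("site_b", "pneumonia", "pneumonia"),
    ("site_b", "stroke", "malaria"),
]

avg_accuracy, per_site_accuracy = mean_site_accuracy(toy_records)
print(per_site_accuracy)
print(avg_accuracy)
```

Under this reading, a small site influences the headline number as much as a large one, so the site average can differ from the pooled accuracy over all test records.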
Problem

Research questions and friction points this paper is trying to address.

Improving verbal autopsy accuracy for cause-of-death determination
Combining LLMs with traditional methods for death prediction
Enhancing global health surveillance in low-resource settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining LLMs with traditional algorithms
Using text embeddings for classification
Employing a meta-learner ensemble approach
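The idea of combining several predictors can be sketched as follows. This is not the paper's meta-learner, whose details are not given on this page; it is accuracy-weighted soft voting, a simpler stand-in in which each base model's cause probabilities are weighted by that model's accuracy on a validation set. The model names (`llm`, `lcva`, `embedding`), function names, and all probabilities are hypothetical.

```python
def argmax_label(probs):
    """Return the cause with the highest probability in a {cause: prob} dict."""
    return max(probs, key=probs.get)


def fit_weights(val_probs_by_model, val_truths):
    """Weight each base model by its validation accuracy, normalized to sum to 1."""
    weights = {}
    for name, prob_rows in val_probs_by_model.items():
        preds = [argmax_label(row) for row in prob_rows]
        hits = sum(p == t for p, t in zip(preds, val_truths))
        weights[name] = hits / len(val_truths)
    total = sum(weights.values())
    if total == 0:
        # No model got anything right: fall back to uniform weights.
        return {name: 1 / len(weights) for name in weights}
    return {name: w / total for name, w in weights.items()}


def ensemble_predict(probs_by_model, weights):
    """Combine per-model cause probabilities with the learned weights."""
    combined = {}
    for name, probs in probs_by_model.items():
        for cause, p in probs.items():
            combined[cause] = combined.get(cause, 0.0) + weights[name] * p
    return argmax_label(combined)


# Hypothetical validation outputs for two deaths with known causes.
val_truths = ["stroke", "malaria"]
val_probs = {
    "llm": [{"stroke": 0.7, "malaria": 0.3}, {"stroke": 0.2, "malaria": 0.8}],
    "lcva": [{"stroke": 0.6, "malaria": 0.4}, {"stroke": 0.6, "malaria": 0.4}],
    "embedding": [{"stroke": 0.4, "malaria": 0.6}, {"stroke": 0.3, "malaria": 0.7}],
}
weights = fit_weights(val_probs, val_truths)

# Hypothetical outputs for one new death.
new_case = {
    "llm": {"stroke": 0.8, "malaria": 0.2},
    "lcva": {"stroke": 0.3, "malaria": 0.7},
    "embedding": {"stroke": 0.4, "malaria": 0.6},
}
prediction = ensemble_predict(new_case, weights)
print(weights, prediction)
```

In this toy case two of the three base models favor malaria, but the more accurate model's confident vote for stroke dominates once the weights are applied. A learned meta-learner (for example, a classifier trained on the stacked base-model outputs) would replace the fixed accuracy weights with fitted parameters.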
Yiqun T. Chen
Departments of Biostatistics and Computer Science, Johns Hopkins University
Tyler H. McCormick
University of Washington
statistics, data science, Bayesian modeling, social networks, global health
Li Liu
Departments of Population, Family and Reproductive Health and International Health, Johns Hopkins University
Abhirup Datta
Professor, Biostatistics, Bloomberg School of Public Health, Johns Hopkins University
Spatial statistics, Gaussian Processes, Bayesian hierarchical modeling, Air Pollution, Global Health