A Closer Look at Mortality Risk Prediction from Electrocardiograms

📅 2024-06-24

🤖 AI Summary

This study addresses all-cause mortality prediction from ECGs, framed as survival analysis. Method: The study benchmarks over 500 AI-ECG models, including deep survival models (e.g., DeepHit, PCHazard) and conventional classifiers, on two public clinical datasets, Code-15 and MIMIC-IV, and extends the highest-performing approach to a third dataset from Boston Children's Hospital (BCH). It analyzes how the classifier time horizon, demographic covariates (age/sex), and cross-dataset evaluation affect performance. Contribution/Results: Deep survival models outperform traditional classifiers (C-index gain: 0.04–0.08). Classifier performance shows significant, site-dependent sensitivity to the chosen time horizon. The best-performing models, trained on ECG plus demographics, achieve median C-indices of 0.82, 0.78, and 0.76 on Code-15, MIMIC-IV, and BCH, respectively. Notably, demographic-only models perform surprisingly well on Code-15. On external validation the C-index drops by 0.03 to 0.24, which supports fine-tuning models on site-specific data.
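The C-index reported above measures how often a model ranks the patient who died earlier as higher risk. A minimal sketch of Harrell's concordance index follows, assuming right-censored data and a scalar risk score per ECG; the function name, the toy data, and the choice to skip pairs with tied times are illustrative assumptions, not the paper's evaluation code.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times  -- follow-up time per subject
    events -- 1 if death was observed, 0 if censored
    risks  -- model risk score (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the earlier subject actually died.
        # Pairs with tied times are skipped here (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four ECGs, two observed deaths, two censored.
toy_times = [2, 5, 7, 9]
toy_events = [1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported medians (0.76 to 0.82) should be read.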

📝 Abstract

Several recent studies combine large private ECG databases with AI to predict patient mortality. These studies typically use a few, highly variable, modeling approaches. While benchmarking these approaches has historically been limited by a lack of public ECG datasets, this changed with the 2023 release of MIMIC-IV, containing 795,546 ECGs from a U.S. hospital system, and the 2020 release of Code-15, containing 345,779 ECGs collected during routine care in Brazil. We benchmark over 500 AI-ECG survival models predicting all-cause mortality on Code-15 and MIMIC-IV with 2 neural architectures, 4 Deep-Survival-Analysis approaches, and classifiers predicting mortality at 4 time horizons. We extend the highest-performing approach to a dataset from Boston Children's Hospital (BCH, 225,379 ECGs). Models train with and without demographics (age/sex) and evaluate across datasets. The best performing Deep-Survival-Analysis models trained with ECG and demographics yield good median Concordance Indices (Code-15: 0.82, MIMIC-IV: 0.78, BCH: 0.76) and AUPRC scores (median 1-yr/5-yr, Code-15: 0.07/0.15; MIMIC-IV: 0.45/0.55; BCH: 0.04/0.13) considering the percentage of ECGs linked to mortality (1-yr/5-yr, Code-15: 1.2%/3.4%; MIMIC-IV: 14.8%/24.5%; BCH: 0.9%/4.8%). Contrasting with Deep-Survival-Analysis models, classifier-based AI-ECG models exhibit significant, site-dependent sensitivity to the choice of time horizon (median Pearson's R, Code-15: 0.69, p<1E-5; MIMIC-IV: -0.80, p<1E-5). Demographic-only models perform surprisingly well on Code-15. Concordance drops 0.03-0.24 on external validation. We recommend Deep-Survival-Analysis over Classifier-Cox approaches and the inclusion of demographic covariates in ECG survival modeling. Comparisons to demographic-only and baseline models are crucial. External evaluations support fine-tuning models on site-specific data.
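The abstract asks the reader to judge AUPRC against the share of ECGs linked to mortality. This matters because a classifier that ranks at random has an expected AUPRC roughly equal to the positive prevalence. The sketch below divides each reported median AUPRC by the corresponding reported percentage; the "lift" ratio and all names are illustrative assumptions, and it treats the dataset-level percentages as stand-ins for test-set prevalence, which the abstract does not confirm.

```python
# Median AUPRC and percentage of ECGs linked to mortality, copied from the
# abstract. Each tuple is (1-yr, 5-yr).
reported = {
    "Code-15": {"auprc": (0.07, 0.15), "prevalence": (0.012, 0.034)},
    "MIMIC-IV": {"auprc": (0.45, 0.55), "prevalence": (0.148, 0.245)},
    "BCH": {"auprc": (0.04, 0.13), "prevalence": (0.009, 0.048)},
}


def auprc_lift(auprc, prevalence):
    """Ratio of observed AUPRC to the approximate chance-level AUPRC.

    Assumption: chance-level AUPRC is approximated by the prevalence.
    """
    if prevalence <= 0:
        raise ValueError("prevalence must be positive")
    return auprc / prevalence


# Lift per dataset and horizon (hypothetical summary, not from the paper).
lifts = {
    name: tuple(auprc_lift(a, p) for a, p in zip(v["auprc"], v["prevalence"]))
    for name, v in reported.items()
}

for name, (one_year, five_year) in lifts.items():
    print(f"{name}: 1-yr lift {one_year:.1f}x, 5-yr lift {five_year:.1f}x")
```

Read this way, a low absolute AUPRC on a dataset with rare mortality can still sit several times above chance, while a higher AUPRC on MIMIC-IV partly reflects its higher mortality rate.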

Problem

Research questions and friction points this paper is trying to address.

Benchmark AI-ECG models for mortality prediction

Compare Deep-Survival-Analysis vs. Classifier approaches

Evaluate model performance with demographic data inclusion
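One way to see why classifier-based models depend on the time horizon is to look at how a binary label can be derived from survival data. The sketch below is a common construction and an assumption here, not the paper's documented preprocessing: deaths within the horizon are positives, subjects followed past the horizon are negatives, and subjects censored before the horizon are excluded. The function name and the 1-year and 5-year horizons are illustrative.

```python
def horizon_label(time_years, event, horizon_years):
    """Binary mortality label at a fixed horizon, or None if unknown.

    time_years    -- follow-up time from the ECG to death or censoring
    event         -- True if death was observed, False if censored
    horizon_years -- prediction horizon for the classifier
    """
    if event and time_years <= horizon_years:
        return 1  # died within the horizon
    if time_years >= horizon_years:
        return 0  # known to be alive at the horizon
    return None  # censored before the horizon: outcome unknown


# Hypothetical patients: (follow-up in years, death observed?)
patients = [(0.5, True), (3.0, True), (2.0, False), (8.0, False)]

for horizon in (1, 5):
    labels = [horizon_label(t, e, horizon) for t, e in patients]
    print(f"{horizon}-yr labels: {labels}")
```

The same patient can change label, or drop out of the training set, as the horizon moves, so each horizon yields a different dataset. Survival models consume the (time, event) pairs directly and avoid this choice.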

Innovation

Methods, ideas, or system contributions that make the work stand out.

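Benchmark of over 500 AI-ECG survival models on the public Code-15 and MIMIC-IV datasets

Four Deep-Survival-Analysis approaches compared with classifiers at four time horizons

Demographics (age/sex) added to ECG inputs, with demographic-only baselines for comparison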
