An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains

📅 2024-10-05

🏛️ arXiv.org

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

ECGFounder addresses core challenges in building an ECG foundation model: insufficient database sample sizes, inadequate cross-domain generalization, and a notable performance gap between single-lead and multi-lead ECG analysis. It is a general-purpose model trained on over 10 million real-world clinical ECGs from the Harvard-Emory ECG Database, annotated by cardiology experts across 150 label categories. The model is designed to work out of the box and to be fine-tuned for downstream tasks, and it extends to lower-rank ECGs, in particular arbitrary single-lead ECGs, which supports mobile monitoring scenarios. On internal validation sets, AUROC exceeds 0.95 for 80 diagnoses; external validation sets show strong classification performance and generalization across diagnoses. When fine-tuned, ECGFounder outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis. The authors state that the trained model and data will be released through bdsp.io upon publication; the code is available on GitHub.

📝 Abstract

Artificial intelligence (AI) has demonstrated significant potential in ECG analysis and cardiovascular disease assessment. Recently, foundation models have played a remarkable role in advancing medical AI. The development of an ECG foundation model holds the promise of elevating AI-ECG research to new heights. However, building such a model faces several challenges, including insufficient database sample sizes and inadequate generalization across multiple domains. Additionally, there is a notable performance gap between single-lead and multi-lead ECG analyses. We introduce an ECG Foundation Model (ECGFounder), a general-purpose model that leverages real-world ECG annotations from cardiology experts to broaden the diagnostic capabilities of ECG analysis. ECGFounder was trained on over 10 million ECGs with 150 label categories from the Harvard-Emory ECG Database, enabling comprehensive cardiovascular disease diagnosis through ECG analysis. The model is designed to be both an effective out-of-the-box solution and fine-tunable for downstream tasks, maximizing usability. Importantly, we extended its application to lower-rank ECGs, and arbitrary single-lead ECGs in particular. ECGFounder can therefore support various downstream tasks in mobile monitoring scenarios. Experimental results demonstrate that ECGFounder achieves expert-level performance on internal validation sets, with AUROC exceeding 0.95 for eighty diagnoses. It also shows strong classification performance and generalization across various diagnoses on external validation sets. When fine-tuned, ECGFounder outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis. The trained model and data will be publicly released upon publication through bdsp.io. Our code is available at https://github.com/PKUDigitalHealth/ECGFounder
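The headline internal-validation result is a count of diagnoses whose AUROC exceeds 0.95. As a rough illustration of how such a count can be produced for a multi-label classifier, the sketch below computes AUROC per label and tallies those above a threshold. This is not the authors' evaluation code: the function names (`auroc`, `per_label_auroc`, `count_above`) and the toy arrays are hypothetical, and AUROC is computed here from its pairwise-ranking definition rather than with any particular library.

```python
import numpy as np

def auroc(y_true, y_score):
    """AUROC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    Returns NaN when either class is absent, since AUROC is undefined.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    if pos.size == 0 or neg.size == 0:
        return float("nan")
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def per_label_auroc(Y_true, Y_score):
    """One AUROC per column of an (n_samples, n_labels) multi-label matrix."""
    Y_true = np.asarray(Y_true)
    Y_score = np.asarray(Y_score)
    return np.array([auroc(Y_true[:, j], Y_score[:, j])
                     for j in range(Y_true.shape[1])])

def count_above(aurocs, threshold=0.95):
    """Number of labels whose AUROC exceeds the threshold (NaN never counts)."""
    return int(np.sum(np.asarray(aurocs) > threshold))

# Toy data (hypothetical): 4 recordings, 2 diagnostic labels.
Y_true = [[1, 0], [0, 1], [1, 0], [0, 1]]
Y_score = [[0.9, 0.2], [0.1, 0.4], [0.8, 0.6], [0.3, 0.5]]

scores = per_label_auroc(Y_true, Y_score)
n_strong = count_above(scores, threshold=0.95)
```

The pairwise comparison costs memory proportional to the number of positives times the number of negatives, so at the scale of millions of recordings a rank-based or library implementation would be the practical choice.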

Problem

Research questions and friction points this paper is trying to address.

Building a general-purpose ECG model for diverse cardiovascular diagnoses

Addressing limited sample sizes and cross-domain generalization in ECG AI

Bridging the performance gap between single-lead and multi-lead ECG analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

ECG foundation model trained on 10M+ recordings

Supports single-lead and multi-lead ECG analysis

Fine-tunable for diverse downstream diagnostic tasks

🔎 Similar Papers

ECG-FM: An Open Electrocardiogram Foundation Model