An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains

📅 2024-10-05
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
ECGFounder addresses three challenges in building an ECG foundation model: insufficient database sample sizes, inadequate generalization across multiple domains, and a notable performance gap between single-lead and multi-lead ECG analysis. It is a general-purpose model trained on over 10 million real-world clinical ECGs from the Harvard-Emory ECG Database, annotated by cardiology experts across 150 label categories. The model is designed to work out of the box and to be fine-tunable for downstream tasks. It also extends to lower rank ECGs, and to arbitrary single-lead ECGs in particular, which supports downstream tasks in mobile monitoring scenarios. On internal validation sets, AUROC exceeds 0.95 for 80 diagnoses; on external validation sets, the model shows strong classification performance and generalization across various diagnoses. When fine-tuned, it outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis. The authors state that the trained model and data will be released through bdsp.io upon publication; the code is available on GitHub.

📝 Abstract
Artificial intelligence (AI) has demonstrated significant potential in ECG analysis and cardiovascular disease assessment. Recently, foundation models have played a remarkable role in advancing medical AI. The development of an ECG foundation model holds the promise of elevating AI-ECG research to new heights. However, building such a model faces several challenges, including insufficient database sample sizes and inadequate generalization across multiple domains. Additionally, there is a notable performance gap between single-lead and multi-lead ECG analyses. We introduce an ECG Foundation Model (ECGFounder), a general-purpose model that leverages real-world ECG annotations from cardiology experts to broaden the diagnostic capabilities of ECG analysis. ECGFounder was trained on over 10 million ECGs with 150 label categories from the Harvard-Emory ECG Database, enabling comprehensive cardiovascular disease diagnosis through ECG analysis. The model is designed to be both an effective out-of-the-box solution and fine-tunable for downstream tasks, maximizing usability. Importantly, we extended its application to lower rank ECGs, and arbitrary single-lead ECGs in particular. ECGFounder can therefore support various downstream tasks in mobile monitoring scenarios. Experimental results demonstrate that ECGFounder achieves expert-level performance on internal validation sets, with AUROC exceeding 0.95 for 80 diagnoses. It also shows strong classification performance and generalization across various diagnoses on external validation sets. When fine-tuned, ECGFounder outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis. The trained model and data will be publicly released upon publication through bdsp.io. Our code is available at https://github.com/PKUDigitalHealth/ECGFounder.
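
The headline metric in the abstract (AUROC above 0.95 for 80 diagnoses) is a per-label statistic over a multi-label classifier. The sketch below shows one common way such a count could be computed; it is not taken from the ECGFounder code. The function names `auroc`, `per_label_auroc`, and `count_above` are hypothetical, and the rank-based (Mann-Whitney) formulation is an assumption about how AUROC is evaluated.

```python
from typing import List, Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, y_true) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def per_label_auroc(labels: List[List[int]], scores: List[List[float]]) -> List[float]:
    """labels and scores are [n_recordings][n_labels]; returns one AUROC per label."""
    n_labels = len(labels[0])
    return [
        auroc([row[c] for row in labels], [row[c] for row in scores])
        for c in range(n_labels)
    ]


def count_above(aurocs: Sequence[float], threshold: float = 0.95) -> int:
    """Number of labels whose AUROC strictly exceeds the threshold."""
    return sum(1 for a in aurocs if a > threshold)


# Toy data (made up for illustration): 4 recordings, 2 labels.
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.2, 0.7]]
toy_aurocs = per_label_auroc(toy_labels, toy_scores)
```

Computing AUROC separately per label, rather than pooling all labels, is what makes a statement such as "exceeds 0.95 for 80 diagnoses" meaningful when 150 label categories are evaluated.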
Problem

Research questions and friction points this paper is trying to address.

Building a general-purpose ECG model for diverse cardiovascular diagnoses
Addressing limited sample sizes and cross-domain generalization in ECG AI
Bridging performance gap between single-lead and multi-lead ECG analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

ECG foundation model trained on 10M+ recordings
Supports single-lead and multi-lead ECG analysis
Fine-tunable for diverse downstream diagnostic tasks
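
To illustrate what single-lead and reduced-lead inputs mean at the data level, here is a minimal sketch of selecting leads from a 12-lead recording. The lead ordering in `STANDARD_LEADS`, the `[n_leads][n_samples]` layout, and the helper names `select_lead` and `select_leads` are assumptions made for illustration; they are not the ECGFounder repository's API.

```python
from typing import Dict, List

# Assumed channel order for a standard 12-lead recording (an illustration only).
STANDARD_LEADS: List[str] = [
    "I", "II", "III", "aVR", "aVL", "aVF",
    "V1", "V2", "V3", "V4", "V5", "V6",
]
LEAD_INDEX: Dict[str, int] = {name: i for i, name in enumerate(STANDARD_LEADS)}


def select_leads(recording: List[List[float]], leads: List[str]) -> List[List[float]]:
    """Return the requested leads, in the requested order, as [n_selected][n_samples]."""
    if len(recording) != len(STANDARD_LEADS):
        raise ValueError("expected a 12-lead recording shaped [12][n_samples]")
    unknown = [name for name in leads if name not in LEAD_INDEX]
    if unknown:
        raise KeyError(f"unknown lead name(s): {unknown}")
    return [list(recording[LEAD_INDEX[name]]) for name in leads]


def select_lead(recording: List[List[float]], lead: str) -> List[List[float]]:
    """Single-lead view that keeps a channel dimension of size one."""
    return select_leads(recording, [lead])


# Toy recording (made up): channel i holds the constant value i for 3 samples.
toy_recording = [[float(i)] * 3 for i in range(12)]
lead_ii = select_lead(toy_recording, "II")
subset = select_leads(toy_recording, ["I", "V1"])
```

Keeping a channel dimension of size one for the single-lead case means downstream code can treat single-lead and multi-lead inputs with the same array layout.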
Authors

Jun Li
Beth Israel Deaconess Medical Center, Boston, MA, USA

Aaron Aguirre
Department of Cardiology, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA

Junior Moura
Department of Cardiology, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA

Che Liu
Imperial College London
Multimodal learning, AI4Medicine

Lanhai Zhong
Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China

Chenxi Sun
Harvard Medical School, Boston, MA, USA; Department of Neurology, Beth Israel Deaconess Medical Center, Boston, MA, USA

Gari Clifford
Professor of Biomed Eng & Biomed Inform, Emory University and Georgia Institute of Technology
Signal Processing, Machine Learning, mHealth, Affordable Healthcare

Brandon Westover
Harvard Medical School, Boston, MA, USA; Department of Neurology, Beth Israel Deaconess Medical Center, Boston, MA, USA

Shenda Hong
Assistant Professor, Peking University
AI ECG, Biosignal, AI for Digital Health, Health Data Science, AI for Healthcare