🤖 AI Summary
This study addresses the research question: "Are Foundation Models Useful for ECG Analysis?" To this end, it presents a multi-task benchmark for electrocardiography that systematically evaluates large language models, general-purpose time-series foundation models, and ECG-specific foundation models against time-series deep learning baselines on clinically relevant tasks, including 12-lead classification, arrhythmia detection, and waveform reconstruction. The methodology combines self-supervised pretraining, multi-task fine-tuning, and cross-modal time-series modeling under a standardized evaluation protocol. Results show that general-purpose time-series and ECG-specific foundation models achieve a top performance rate of 80%, which suggests that foundation models can acquire useful domain knowledge with less reliance on human expertise. All code and datasets are publicly released to advance trustworthy, transparent AI for physiological signal analysis.
📝 Abstract
In patient diagnosis, non-invasive measurements are widely used because they carry low risk and deliver quick results. The electrocardiogram (ECG), a non-invasive recording of the heart's electrical activity, is used to diagnose cardiac conditions. Analyzing an ECG typically requires domain expertise, which is a roadblock to applying artificial intelligence (AI) in healthcare. Through advances in self-supervised learning and foundation models, AI systems can now acquire and leverage domain knowledge without relying solely on human expertise. However, comprehensive analyses of foundation models' performance on ECG are lacking. This study aims to answer the research question: "Are Foundation Models Useful for ECG Analysis?" To address it, we evaluate language, general time-series, and ECG foundation models in comparison with time-series deep learning models. The experimental results show that general time-series and ECG foundation models achieve a top performance rate of 80%, indicating their effectiveness in ECG analysis. We provide in-depth analyses and insights along with comprehensive experimental results. This study highlights both the limitations and the potential of foundation models in advancing physiological waveform analysis. The code and data for this benchmark are publicly available at https://github.com/yuhaoxu99/ECGMultitasks-Benchmark.