An Electrocardiogram Multi-Task Benchmark with Comprehensive Evaluations and Insightful Findings.

📅 2025-08-07

🏛️ Studies in Health Technology and Informatics

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study asks whether foundation models are useful for ECG analysis. It presents a multi-task ECG benchmark that evaluates language foundation models, general time-series foundation models, and ECG-specific foundation models against time-series deep learning models. General time-series and ECG foundation models achieve a top performance rate of 80%, which indicates that they are effective for ECG analysis. The study also provides in-depth analyses of the limitations and potential of foundation models for physiological waveform analysis. The code and data are publicly released.

📝 Abstract

Non-invasive measurements are widely used in patient diagnosis because they carry low risk and give quick results. The electrocardiogram (ECG), a non-invasive method for recording heart activity, is used to diagnose cardiac conditions. Analyzing ECGs typically requires domain expertise, which is a roadblock to applying artificial intelligence (AI) in healthcare. With advances in self-supervised learning and foundation models, AI systems can now acquire and leverage domain knowledge without relying solely on human expertise. However, comprehensive analyses of foundation models' performance on ECG are lacking. This study aims to answer the research question: "Are Foundation Models Useful for ECG Analysis?" To address it, we evaluate language, general time-series, and ECG foundation models against time-series deep learning models. The experimental results show that general time-series and ECG foundation models achieve a top performance rate of 80%, indicating their effectiveness in ECG analysis. In-depth analyses and insights accompany the comprehensive experimental results. This study highlights the limitations and potential of foundation models in advancing physiological waveform analysis. The code and data for this benchmark are publicly available at https://github.com/yuhaoxu99/ECGMultitasks-Benchmark.
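
The abstract does not define "top performance rate". One plausible reading is the fraction of benchmark tasks on which the best-scoring model belongs to a given group of model families. The sketch below illustrates that reading only. The task names, family names, and scores are made-up placeholders, not results from the paper, and the function `top_performance_rate` is a hypothetical helper rather than part of the released benchmark code.

```python
# Assumed definition: share of tasks whose best score comes from one of the
# chosen model families. All numbers below are placeholders, NOT paper results.
scores = {
    "task_1": {"language_fm": 0.61, "time_series_fm": 0.83, "ecg_fm": 0.80, "ts_deep_learning": 0.79},
    "task_2": {"language_fm": 0.55, "time_series_fm": 0.72, "ecg_fm": 0.77, "ts_deep_learning": 0.74},
    "task_3": {"language_fm": 0.48, "time_series_fm": 0.69, "ecg_fm": 0.66, "ts_deep_learning": 0.71},
    "task_4": {"language_fm": 0.70, "time_series_fm": 0.88, "ecg_fm": 0.85, "ts_deep_learning": 0.86},
    "task_5": {"language_fm": 0.52, "time_series_fm": 0.64, "ecg_fm": 0.68, "ts_deep_learning": 0.60},
}


def top_performance_rate(scores, families):
    """Return the fraction of tasks whose best-scoring model is in `families`.

    Assumes higher scores are better. Ties go to the first model listed for
    the task, because max() returns the first maximal item it encounters.
    """
    if not scores:
        raise ValueError("scores must contain at least one task")
    wins = 0
    for by_model in scores.values():
        best_model = max(by_model, key=by_model.get)
        if best_model in families:
            wins += 1
    return wins / len(scores)


rate = top_performance_rate(scores, {"time_series_fm", "ecg_fm"})
print(f"Top performance rate: {rate:.0%}")
```

With these placeholder scores, the two foundation-model families win four of the five tasks. A metric where lower is better (for example, an error measure) would need `min` in place of `max`.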

Problem

Research questions and friction points this paper is trying to address.

Evaluating whether foundation models are effective for ECG analysis

Comparing foundation models against time-series deep learning models

Assessing limitations and potential of foundation models in ECG

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates foundation models for ECG analysis

Compares language, general time-series, and ECG foundation models

Provides benchmark with public data and code
