🤖 AI Summary
Existing ECG foundation models neglect clinical metadata during unsupervised pretraining, limiting their diagnostic capability.
Method: We propose a clinical risk score–guided contrastive learning framework, the first to use clinical risk scores for adaptive negative-sample weighting, together with an explicit mechanism for modeling missing metadata. Integrated with a multi-scale Transformer architecture, our method enables self-supervised pretraining on large-scale 12-lead and single-lead ECG data without per-sample annotations.
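The risk-score-guided negative weighting can be sketched as an InfoNCE-style loss. The specific weighting form (1 + α·|risk difference|), the fallback to unit weight when metadata is missing, and all names below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def risk_weighted_nce(z1, z2, risk, tau=0.1, alpha=1.0):
    """Hypothetical InfoNCE-style loss with risk-score-weighted negatives.

    z1, z2 : (N, D) L2-normalised embeddings of two views of N subjects.
    risk   : (N,) clinical risk scores; np.nan marks missing metadata.
    Negative pair (i, j) gets weight 1 + alpha * |risk_i - risk_j|;
    pairs with a missing score fall back to weight 1 (illustrative choice).
    """
    sim = (z1 @ z2.T) / tau                          # scaled cosine similarities
    diff = np.abs(risk[:, None] - risk[None, :])     # pairwise risk gaps (nan if missing)
    w = 1.0 + alpha * np.nan_to_num(diff, nan=0.0)   # missing metadata -> unit weight
    np.fill_diagonal(w, 1.0)                         # positives are never reweighted
    m = sim.max(axis=1, keepdims=True)               # for numerical stability
    e = np.exp(sim - m)
    denom = (w * e).sum(axis=1)
    pos = np.exp(np.diag(sim) - m[:, 0])
    return float(np.mean(-np.log(pos / denom)))

# Toy demo: larger alpha pushes clinically dissimilar negatives harder apart.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 16)); z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = rng.normal(size=(8, 16)); z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
risk = np.arange(8, dtype=float)
plain = risk_weighted_nce(z1, z2, risk, alpha=0.0)     # reduces to standard InfoNCE
weighted = risk_weighted_nce(z1, z2, risk, alpha=2.0)  # risk-guided weighting
```

With all risk scores missing, every weight collapses to 1 and the loss reduces to the unweighted InfoNCE; this mirrors (in spirit only) the paper's explicit handling of absent metadata.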
Results: Evaluated across seven independent datasets and 18 tasks, our medium-sized CLEF model achieves an average AUROC gain of ≥2.6% over self-supervised baselines for classification and an average MAE reduction of 3.2% for regression. Notably, the single-lead pretraining variant matches or exceeds the performance of fully supervised state-of-the-art methods (e.g., ECGFounder) on multiple tasks, significantly improving accuracy and clinical generalizability in remote ECG analysis.
📝 Abstract
The electrocardiogram (ECG) is a key diagnostic tool in cardiovascular health. Single-lead ECG recording is integrated into both clinical-grade and consumer wearables. While self-supervised pretraining of foundation models on unlabeled ECGs improves diagnostic performance, existing approaches do not incorporate domain knowledge from clinical metadata. We introduce a novel contrastive learning approach that uses an established clinical risk score to adaptively weight negative pairs: clinically guided contrastive learning. It aligns the similarities of ECG embeddings with clinically meaningful differences between subjects and includes an explicit mechanism to handle missing metadata. On 12-lead ECGs from 161K patients in the MIMIC-IV dataset, we pretrain single-lead ECG foundation models at three scales, collectively called CLEF, using only routinely collected metadata and no per-sample ECG annotations. We evaluate CLEF on 18 clinical classification and regression tasks across 7 held-out datasets, benchmarking against 5 foundation model baselines and 3 self-supervised algorithms. When pretrained on 12-lead ECG data and tested on lead-I data, CLEF outperforms self-supervised foundation model baselines: the medium-sized CLEF achieves average AUROC improvements of at least 2.6% in classification and average MAE reductions of at least 3.2% in regression. Compared with existing self-supervised learning algorithms, CLEF improves the average AUROC by at least 1.8%. Moreover, when pretrained only on lead-I data, CLEF performs comparably on classification tasks to the state-of-the-art ECGFounder, which was trained in a supervised manner. Overall, CLEF enables more accurate and scalable single-lead ECG analysis, advancing remote health monitoring. Code and pretrained CLEF models are available at: github.com/Nokia-Bell-Labs/ecg-foundation-model.