Fine-Grained ECG-Text Contrastive Learning via Waveform Understanding Enhancement

📅 2025-05-17

🤖 AI Summary

Existing ECG-text contrastive learning methods struggle to capture fine-grained waveform features and the diagnostic reasoning built on them, because clinical reports rarely record those waveform features explicitly. To address this, we propose FG-CLEP (Fine-Grained Contrastive Language ECG Pre-training), a contrastive learning framework with three components: (1) using large language models (LLMs) to recover the waveform features missing from reports; (2) building a semantic similarity matrix that guides contrastive learning and reduces false negatives caused by common diagnoses; and (3) adopting a sigmoid-based loss function suited to the multi-label nature of ECG tasks. Evaluated on six datasets, the method outperforms state-of-the-art approaches in both zero-shot prediction and linear probing.

📝 Abstract

Electrocardiograms (ECGs) are essential for diagnosing cardiovascular diseases. While previous ECG-text contrastive learning methods have shown promising results, they often overlook the incompleteness of the reports. Given an ECG, the report is generated by first identifying key waveform features and then inferring the final diagnosis through these features. Despite their importance, these waveform features are often not recorded in the report as intermediate results. Aligning ECGs with such incomplete reports impedes the model's ability to capture the ECG's waveform features and limits its understanding of diagnostic reasoning based on those features. To address this, we propose FG-CLEP (Fine-Grained Contrastive Language ECG Pre-training), which aims to recover these waveform features from incomplete reports with the help of large language models (LLMs), under the challenges of hallucinations and the non-bijective relationship between waveform features and diagnoses. Additionally, considering the frequent false negatives due to the prevalence of common diagnoses in ECGs, we introduce a semantic similarity matrix to guide contrastive learning. Furthermore, we adopt a sigmoid-based loss function to accommodate the multi-label nature of ECG-related tasks. Experiments on six datasets demonstrate that FG-CLEP outperforms state-of-the-art methods in both zero-shot prediction and linear probing across these datasets.

Problem

Research questions and friction points this paper is trying to address.

Enhancing ECG-text alignment by recovering missing waveform features from reports

Handling LLM hallucinations and the non-bijective relationship between waveform features and diagnoses when recovering those features

Reducing false negatives in contrastive learning caused by common diagnoses, and accommodating the multi-label nature of ECG tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Recover waveform features using large language models

Introduce semantic similarity matrix for contrastive learning

Adopt sigmoid-based loss for multi-label ECG tasks
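
The last two ideas above can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not give the exact loss, so the code assumes a SigLIP-style pairwise sigmoid loss and assumes that report pairs whose text similarity exceeds a threshold are treated as additional positives. The function names, `threshold`, `scale`, and `bias` are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def targets_from_similarity(text_emb, threshold=0.9):
    """Assumed rule: reports i and j count as a positive pair when their
    text embeddings are at least `threshold` similar. The diagonal is
    always positive because each report matches its own ECG."""
    n = len(text_emb)
    return [
        [1.0 if i == j or cosine(text_emb[i], text_emb[j]) >= threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def sigmoid_pairwise_loss(ecg_emb, text_emb, targets, scale=10.0, bias=-5.0):
    """Binary cross-entropy over every ECG-text pair, so one ECG can have
    several positive texts (multi-label) instead of exactly one."""
    total = 0.0
    for i, ecg in enumerate(ecg_emb):
        for j, text in enumerate(text_emb):
            logit = scale * cosine(ecg, text) + bias
            # BCE with logits: softplus(z) - y * z
            total += softplus(logit) - targets[i][j] * logit
    return total / len(ecg_emb)
```

With identity targets, two ECGs that share the same common diagnosis are pushed apart as if they were negatives. Building the targets from text similarity marks them as positives instead, which is the false-negative problem the semantic similarity matrix is meant to reduce.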
