🤖 AI Summary
Existing self-supervised approaches treat electrocardiograms (ECGs) as generic time series, overlooking their physiological rhythms and semantic structure, which limits disease detection performance. This work proposes RhythmBERT, a generative model that treats the ECG as a structured language. An autoencoder maps P, QRS, and T wave segments to discrete symbolic tokens that capture rhythm semantics, while complementary continuous embeddings retain morphological detail. The model is pretrained with a masked prediction objective on approximately 800,000 unlabeled ECG recordings. Using only a single lead, RhythmBERT matches or exceeds strong 12-lead baselines across diagnostic tasks that include atrial fibrillation, subtle ST-T abnormalities, and myocardial infarction.
📝 Abstract
Electrocardiogram (ECG) analysis is crucial for diagnosing heart disease, but most self-supervised learning methods treat the ECG as a generic time series, overlooking physiologic semantics and rhythm-level structure. Existing contrastive methods rely on augmentations that distort morphology, whereas generative approaches use fixed-window segmentation, which misaligns cardiac cycles. To address these limitations, we propose RhythmBERT, a generative ECG language model that treats the ECG as a language by encoding P, QRS, and T segments into symbolic tokens via autoencoder-based latent representations. These discrete tokens capture rhythm semantics, while complementary continuous embeddings retain fine-grained morphology, enabling a unified view of waveform structure and rhythm. RhythmBERT is pretrained on approximately 800,000 unlabeled ECG recordings with a masked prediction objective, allowing it to learn contextual representations in a label-efficient manner. Evaluations show that, despite using only a single lead, RhythmBERT achieves comparable or superior performance to strong 12-lead baselines. This generalization extends from prevalent conditions such as atrial fibrillation to clinically challenging cases such as subtle ST-T abnormalities and myocardial infarction. Our results suggest that treating the ECG as a structured language offers a scalable and physiologically aligned pathway for advancing cardiac analysis.
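
To make the tokenize-then-mask idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not say how autoencoder latents become discrete symbols, so the nearest-codebook lookup in `quantize` is an assumption, as are the names `MASK_ID`, `quantize`, and `mask_tokens`, the toy codebook, and the 30% masking rate.

```python
import numpy as np

MASK_ID = 0  # hypothetical id reserved for the mask symbol


def quantize(latents, codebook):
    """Map each latent vector to the id of its nearest codebook row (ids start at 1)."""
    # Squared Euclidean distance between every latent and every codebook entry.
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1) + 1  # shift by 1 so that 0 stays free for MASK_ID


def mask_tokens(tokens, mask_prob, rng):
    """Hide a random subset of tokens; return model inputs and prediction targets."""
    tokens = np.asarray(tokens)
    masked = rng.random(tokens.shape) < mask_prob
    inputs = np.where(masked, MASK_ID, tokens)
    # Targets are -1 at unmasked positions so that a loss function can skip them.
    targets = np.where(masked, tokens, -1)
    return inputs, targets


# Toy codebook: 8 hypothetical symbols in a 4-dimensional latent space.
codebook = np.arange(8, dtype=float)[:, None] * np.ones((1, 4))

# Stand-in latents for the P, QRS, and T segments of two consecutive beats.
# In practice these would come from a trained autoencoder's encoder.
latents = codebook[[2, 5, 1, 2, 5, 1]] + 0.1

tokens = quantize(latents, codebook)
rng = np.random.default_rng(0)
inputs, targets = mask_tokens(tokens, mask_prob=0.3, rng=rng)
```

A sequence model would then be trained to recover `targets` from `inputs`. The continuous morphology embeddings mentioned in the abstract are left out of this sketch.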