ECG-NAT: A Self-supervised Neighborhood Attention Transformer for Multi-lead Electrocardiogram Classification

📅 2026-05-13

🤖 AI Summary

This work addresses the challenges in electrocardiogram (ECG) arrhythmia classification—namely, high signal variability, strong noise interference, scarce labeled data, and the trade-off between accuracy and efficiency—by proposing ECG-NAT, a novel two-stage self-supervised learning framework. The method first employs a masked autoencoder for generative pretraining on multi-source ECG data, followed by discriminative fine-tuning that jointly optimizes supervised contrastive and cross-entropy losses. A key innovation is the introduction of a hierarchical neighborhood attention mechanism, which efficiently captures multiscale temporal features ranging from individual heartbeat morphology to global rhythm patterns. Experimental results demonstrate that ECG-NAT achieves 88.1% accuracy on standard benchmarks using only 1% of labeled data, maintaining superior classification performance while significantly reducing computational overhead, thereby making it well-suited for real-time ECG diagnostics.
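To make the neighborhood attention idea concrete, here is a minimal single-head, one-dimensional sketch in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function name `neighborhood_attention_1d`, the projection matrices, and the default `kernel_size` are hypothetical, and the summary does not specify how ECG-NAT arranges its hierarchy. Each time step attends only to its `kernel_size` nearest time steps, so the cost grows with `T * kernel_size` rather than `T * T`.

```python
import numpy as np

def neighborhood_attention_1d(x, w_q, w_k, w_v, kernel_size=7):
    """Single-head 1D neighborhood attention (illustrative sketch).

    x: array of shape (T, d), one feature vector per time step.
    Each position attends only to its `kernel_size` nearest positions.
    """
    T = x.shape[0]
    if kernel_size % 2 == 0 or kernel_size > T:
        raise ValueError("kernel_size must be odd and no larger than T")

    queries, keys, values = x @ w_q, x @ w_k, x @ w_v

    # Window start for every position, clamped so that border positions
    # still see exactly `kernel_size` neighbours inside [0, T).
    half = kernel_size // 2
    starts = np.clip(np.arange(T) - half, 0, T - kernel_size)
    idx = starts[:, None] + np.arange(kernel_size)[None, :]  # (T, k)

    keys_nb = keys[idx]      # (T, k, d): keys of each position's neighbours
    values_nb = values[idx]  # (T, k, d): values of each position's neighbours

    # Scaled dot-product scores, restricted to the local window.
    scores = np.einsum("td,tkd->tk", queries, keys_nb)
    scores = scores / np.sqrt(queries.shape[-1])

    # Numerically stable softmax over the k neighbours.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    return np.einsum("tk,tkd->td", weights, values_nb)
```

One plausible way to obtain multi-scale features from such a layer is to stack several of them with downsampling in between, so that a fixed window covers a single beat in early layers and several beats in later ones. That arrangement is an assumption here, not a detail confirmed by the summary.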

📝 Abstract

Electrocardiogram (ECG) arrhythmia classification remains challenging due to signal variability, noise, limited labeled data, and the difficulty of achieving both accuracy and efficiency in a single model. While self-supervised learning reduces label dependency, most methods target either global contextual features or local morphological patterns and rarely implement hierarchical multi-scale feature extraction. ECG signals require architectures that capture fine-grained beat-level morphology and broader rhythm-level dependencies simultaneously and with computational efficiency. To overcome this limitation, this paper proposes the Electrocardiogram Neighborhood Attention Transformer (ECG-NAT), a novel self-supervised learning approach tailored for multi-lead ECG classification. Our two-stage approach begins with generative pretraining, in which a masked autoencoder reconstructs partially masked ECG signals drawn from multiple diverse datasets, enabling the model to learn robust, domain-invariant representations from unlabeled data. This is followed by discriminative fine-tuning with a dual-loss function that combines supervised contrastive and cross-entropy losses, aligning representation learning with label prediction. The hierarchical attention mechanism efficiently captures multi-scale temporal features, from localized beat morphology to broader rhythm patterns, at low computational cost. ECG-NAT achieves robust performance on benchmark datasets, with 88.1% accuracy using only 1% labeled data, demonstrating strong efficacy in low-resource settings. The framework combines superior classification performance with computational efficiency, making it practical for real-time ECG diagnosis. The code will be made available upon acceptance at: https://github.com/Mahsagazeran/ECG-NAT.
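The dual-loss fine-tuning objective described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function names, the mixing coefficient `weight`, and the `temperature` value are assumptions, since the abstract does not state how the two losses are balanced. The supervised contrastive term pulls together embeddings that share a label, while the cross-entropy term trains the classifier head on the logits.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy from raw logits (N, C) and integer labels (N,)."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def supervised_contrastive(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of embeddings (N, D)."""
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # Positives: other samples in the batch with the same label.
    positives = (labels[:, None] == labels[None, :]) & ~self_mask
    n_pos = positives.sum(axis=1)
    valid = n_pos > 0  # anchors with at least one positive
    if not valid.any():
        return 0.0

    # Cosine similarity, with self-similarity excluded from the softmax.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    sim = np.where(self_mask, -np.inf, sim)

    # Log-softmax of each anchor over all other samples.
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))

    # Average log-probability assigned to each anchor's positives.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return -(pos_log_prob[valid] / n_pos[valid]).mean()

def dual_loss(logits, embeddings, labels, weight=0.5, temperature=0.1):
    """Weighted sum of both terms; `weight` is a hypothetical mixing factor."""
    contrastive = supervised_contrastive(embeddings, labels, temperature)
    classification = cross_entropy(logits, labels)
    return weight * contrastive + (1.0 - weight) * classification
```

In a training loop, `embeddings` would typically come from the encoder (or a projection head on top of it) and `logits` from the classification head, both computed on the same labeled batch.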

Problem

Research questions and friction points this paper is trying to address.

ECG classification

arrhythmia detection

self-supervised learning

multi-scale feature extraction

limited labeled data

Innovation

Methods, ideas, or system contributions that make the work stand out.

self-supervised learning

neighborhood attention

multi-scale feature extraction