Multimodal Lead-Specific Modeling of ECG for Low-Cost Pulmonary Hypertension Assessment

📅 2025-03-03
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Early screening for pulmonary hypertension (PH) in resource-limited settings is hindered by reliance on centralized 12-lead electrocardiography (12L-ECG). Portable 6-lead ECG (6L-ECG) is more accessible, but labeled 6L-ECG data are scarce, which impedes robust model development. Method: The authors propose the Lead-Specific Electrocardiogram Multimodal Variational Autoencoder (LS-EMVAE), which treats each ECG lead as a separate modality. It hierarchically fuses lead-specific and shared representations via mixture-of-experts and product-of-experts mechanisms within a transfer learning paradigm: pre-training on 12L-ECG, then fine-tuning on task-specific 12L-ECG or 6L-ECG data. Contribution/Results: Evaluated on in-house datasets of 892 subjects (PH detection) and 691 subjects (PH subtyping), LS-EMVAE outperforms existing baselines in both ECG settings. Its 6L-ECG performance is comparable to its 12L-ECG performance, which supports the feasibility of low-cost, portable PH screening and subtyping in underserved regions.

📝 Abstract
Pulmonary hypertension (PH) is frequently underdiagnosed in low- and middle-income countries (LMICs), primarily due to the scarcity of advanced diagnostic tools. Several PH studies have applied machine learning to low-cost diagnostic tools such as 12-lead ECG (12L-ECG), but they mainly target settings with limited resources and overlook settings with no diagnostic tools at all, such as rural primary healthcare in LMICs. Recent studies have shown that 6-lead ECG (6L-ECG), a cheaper and more portable alternative, is effective in detecting various cardiac conditions, but its clinical value for PH detection has not been well established. Furthermore, existing methods treat 12L-/6L-ECG as a single modality, capturing only shared features while overlooking the lead-specific features essential for identifying complex cardiac hemodynamic changes. In this paper, we propose the Lead-Specific Electrocardiogram Multimodal Variational Autoencoder (LS-EMVAE), a model pre-trained on large-population 12L-ECG data and fine-tuned on task-specific data (12L-ECG or 6L-ECG). LS-EMVAE models each 12L-ECG lead as a separate modality and introduces a hierarchical expert composition using Mixture and Product of Experts for adaptive latent feature fusion between lead-specific and shared features. Compared with existing approaches, LS-EMVAE makes better predictions on both 12L-ECG and 6L-ECG at inference, making it an equitable solution for areas with limited or no diagnostic tools. We pre-trained LS-EMVAE on 800,000 publicly available 12L-ECG samples and fine-tuned it for two tasks: 1) PH detection and 2) phenotyping pre-/post-capillary PH, on in-house datasets of 892 and 691 subjects across 12L-ECG and 6L-ECG settings. Extensive experiments show that LS-EMVAE outperforms existing baselines in both ECG settings, while 6L-ECG achieves performance comparable to 12L-ECG, unlocking its potential for global PH screening in areas without diagnostic tools.
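As a rough illustration of the two fusion operators named in the abstract, the sketch below combines per-lead Gaussian posteriors in a single latent dimension. It is a minimal sketch, not the authors' implementation: the function names, the 1-D Gaussian setting, and the uniform mixture weights are all assumptions, and this page does not say how LS-EMVAE arranges the two operators hierarchically.

```python
def product_of_experts(means, variances):
    """Fuse 1-D Gaussian experts by multiplying their densities.

    Precisions (1 / variance) add up, and the fused mean is the
    precision-weighted average of the expert means.
    """
    precisions = [1.0 / v for v in variances]
    fused_var = 1.0 / sum(precisions)
    fused_mean = fused_var * sum(m * p for m, p in zip(means, precisions))
    return fused_mean, fused_var


def mixture_of_experts_moments(means, variances, weights=None):
    """Return the mean and variance of a weighted mixture of 1-D Gaussians.

    Uniform weights are an assumption made for this illustration.
    """
    n = len(means)
    if weights is None:
        weights = [1.0 / n] * n
    mix_mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(
        w * (v + m * m) for w, m, v in zip(weights, means, variances)
    )
    return mix_mean, second_moment - mix_mean ** 2


# Two hypothetical lead-specific posteriors that disagree on the latent value.
lead_means = [0.0, 2.0]
lead_variances = [1.0, 1.0]

poe_mean, poe_var = product_of_experts(lead_means, lead_variances)
moe_mean, moe_var = mixture_of_experts_moments(lead_means, lead_variances)
```

With these inputs both operators agree on the mean, but the product yields a narrower posterior than either expert, while the mixture yields a wider one that keeps the disagreement between leads visible.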
Problem

Research questions and friction points this paper is trying to address.

Early pulmonary hypertension assessment in decentralized clinical settings
Overcoming data scarcity for 6-lead ECG model development
Improving multimodal fusion of ECG leads for PH detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical expert fusion mechanism for adaptive latent fusion
Transfer learning from 12-lead to 6-lead ECG data
Latent representation alignment loss for coherence improvement
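The 12-lead to 6-lead transfer listed above can be pictured as keeping only the per-lead inputs that a portable device records. The sketch below is a hedged illustration under stated assumptions: it assumes the six leads are the limb leads (I, II, III, aVR, aVL, aVF), which this page does not state, and `select_leads` and the dictionary-based record layout are hypothetical names chosen for the example.

```python
# Conventional ordering of the standard 12 ECG leads.
LEADS_12 = ("I", "II", "III", "aVR", "aVL", "aVF",
            "V1", "V2", "V3", "V4", "V5", "V6")

# Assumption: the 6-lead setting uses the six limb leads.
LIMB_LEADS_6 = LEADS_12[:6]


def select_leads(record, lead_names=LIMB_LEADS_6):
    """Return a new record holding only the requested leads.

    `record` maps a lead name to its sequence of samples. Treating each
    lead as its own modality means a model can be fed any subset of leads.
    """
    missing = [name for name in lead_names if name not in record]
    if missing:
        raise KeyError(f"record is missing leads: {missing}")
    return {name: record[name] for name in lead_names}


# Toy 12-lead record with four samples per lead (placeholder values only).
record_12 = {name: [float(i)] * 4 for i, name in enumerate(LEADS_12)}
record_6 = select_leads(record_12)
```

Because every lead has its own entry, the same record layout serves both the 12-lead pre-training data and the 6-lead fine-tuning data.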
Mohammod N. I. Suvon
School of Computer Science, University of Sheffield, S1 4DP, Sheffield, U.K.
Shuo Zhou
School of Computer Science, University of Sheffield, S1 4DP, Sheffield, U.K.
Prasun C Tripathi
School of Computer Science, University of Sheffield, S1 4DP, Sheffield, U.K.
Wenrui Fan
AI Research Engineer, The University of Sheffield
Multi-modal AI, Self-supervised learning, Computer Vision
Samer Alabed
Professor of Biomedical Engineering, German Jordanian University, Jordan
Signal Processing, Wireless communications, Biomedical Engineering, IOT/IOMT
Bishesh Khanal
NAAMII: Nepal Applied Mathematics and Informatics Institute for research
Artificial Intelligence, Medical Imaging Informatics, Computer Vision, NLP low-resource languages
Venet Osmani
Professor of Clinical AI and Machine Learning, Queen Mary University of London
machine learning, medicine
Andrew J Swift
School of Medicine and Population Health, University of Sheffield, S10 2TN, Sheffield, U.K.; Department of Clinical Radiology, Sheffield Teaching Hospitals, S10 2JF, Sheffield, U.K.; and National Institute for Health and Care Research (NIHR), Sheffield Biomedical Research Centre, S10 2JF, Sheffield, U.K.
Chen Chen
School of Computer Science, University of Sheffield, S1 4DP, Sheffield, U.K. and Department of Computing, Imperial College London, SW7 2AZ, London, U.K.
Haiping Lu
Professor of Machine Learning, University of Sheffield
Machine learning, Multimodal AI, AI4Health, AI4Science, Open-source software