🤖 AI Summary
Electrocardiography (ECG) and phonocardiography (PCG) capture complementary electrical and mechanical cardiac activity, yet what the two modalities share versus what is specific to each, how well biomarkers transfer between them, and how well models generalize across subjects remain poorly understood.
Method: We propose the first systematic disentanglement of shared and private representations between ECG and PCG. Using instantaneous amplitude envelope modeling, we design a non-causal LSTM framework that reconstructs ECG waveforms, including clinically critical features such as the QT interval, from PCG signals across physiological states (rest and exercise) and across subjects.
Contribution/Results: Our method achieves significantly higher reconstruction fidelity than conventional approaches, with a QT interval estimation error below 15 ms, and reconstructing ECG from PCG proves more tractable than reconstructing PCG from ECG. Crucially, the results show that core electrophysiological metrics can be robustly inferred from PCG alone, enabling contactless, low-burden cardiac functional assessment and establishing a novel paradigm for wearable and remote cardiovascular monitoring.
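As a rough illustration of the instantaneous amplitude envelope mentioned in the summary, the sketch below computes the magnitude of the analytic signal with an FFT-based Hilbert transform. This is a generic textbook construction, not the authors' pipeline: the function names, the 1 kHz sampling rate, and the amplitude-modulated toy signal standing in for a heart sound are all assumptions made for the example.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via FFT: keep DC, double positive frequencies, zero negative ones."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin is kept once for even lengths
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_amplitude(x):
    """Envelope of x, defined as the magnitude of its analytic signal."""
    return np.abs(analytic_signal(x))

# Hypothetical toy signal: a 50 Hz carrier with a slow 2 Hz amplitude modulation.
fs = 1000                               # assumed sampling rate in Hz
t = np.arange(fs) / fs                  # one second of samples
true_env = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)
pcg_like = true_env * np.sin(2 * np.pi * 50 * t)

env = instantaneous_amplitude(pcg_like)
```

The envelope discards the fast oscillation and keeps only the slowly varying amplitude, which is plausibly why such features vary less from one subject to another than raw waveforms do.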
📝 Abstract
Simultaneous electrocardiography (ECG) and phonocardiography (PCG) provide a comprehensive, multimodal perspective on cardiac function by capturing the heart's electrical and mechanical activities, respectively. However, the distinct and overlapping information content of these signals, as well as their potential for mutual reconstruction and biomarker extraction, remains incompletely understood, especially under varying physiological conditions and across individuals. In this study, we systematically investigate the common and exclusive characteristics of ECG and PCG using the EPHNOGRAM dataset of simultaneous ECG-PCG recordings during rest and exercise. We employ a suite of linear and nonlinear machine learning models, including non-causal LSTM networks, to reconstruct each modality from the other and analyze the influence of causality, physiological state, and cross-subject variability. Our results demonstrate that nonlinear models, particularly the non-causal LSTM, provide superior reconstruction performance, and that reconstructing ECG from PCG is more tractable than the reverse. Exercise and cross-subject scenarios present significant challenges, but envelope-based modeling that uses instantaneous amplitude features substantially improves cross-subject generalizability for cross-modal learning. Furthermore, we demonstrate that clinically relevant ECG biomarkers, such as fiducial points and QT intervals, can be estimated from PCG in cross-subject settings. These findings advance our understanding of the relationship between electromechanical cardiac modalities, in terms of both waveform characteristics and the timing of cardiac events, with potential applications in novel multimodal cardiac monitoring technologies.
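To make the role of causality concrete, the following sketch contrasts a causal and a non-causal linear reconstruction on synthetic data. It is only a toy under stated assumptions: a least-squares FIR model stands in for the linear end of the model suite, the source is white noise rather than a real PCG, and the target is defined to lead the source by 5 samples, loosely mimicking the fact that electrical activation precedes the mechanical response it triggers. The helper names and lag ranges are hypothetical.

```python
import numpy as np

def lagged_matrix(x, lags):
    """Stack shifted copies of x; column k holds x[n - lags[k]], zero-padded at the edges."""
    n = x.size
    cols = []
    for lag in lags:
        col = np.zeros(n)
        if lag >= 0:
            col[lag:] = x[:n - lag]     # past (or present) samples
        else:
            col[:lag] = x[-lag:]        # future samples
        cols.append(col)
    return np.column_stack(cols)

def reconstruction_mse(source, target, lags):
    """Fit a least-squares FIR filter over the given lags and return the training MSE."""
    X = lagged_matrix(source, lags)
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return float(np.mean((target - X @ w) ** 2))

rng = np.random.default_rng(0)
source = rng.standard_normal(2000)                 # stand-in for the observed modality
target = lagged_matrix(source, [-5])[:, 0]         # target[n] = source[n + 5]

# Causal model: may only use the present and past 20 samples of the source.
mse_causal = reconstruction_mse(source, target, range(0, 21))
# Non-causal model: may also look up to 10 samples into the future.
mse_noncausal = reconstruction_mse(source, target, range(-10, 11))

print(f"causal MSE: {mse_causal:.3f}, non-causal MSE: {mse_noncausal:.3e}")
```

In this toy setup the causal filter has no access to the samples that determine the target, so only the non-causal filter can recover it. A non-causal (for example, bidirectional) LSTM relies on the same idea of using context from both sides of the current sample.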