🤖 AI Summary
To fuse echocardiographic time-series data with clinical tabular data for hypertension diagnosis, this paper proposes a decoupled, asymmetric multimodal fusion framework. Echocardiography serves as the primary modality, and clinical tabular data serve as the auxiliary modality. A feature disentanglement mechanism explicitly separates shared (cross-modal) representations from modality-specific features, and a cross-modal alignment strategy enables joint modeling of the heterogeneous data. Evaluated on a cohort of 239 patients, the method achieves an AUC of 90.3% and outperforms existing multimodal fusion approaches. It also offers interpretability, via disentangled feature attribution, and generalizes across diverse clinical subgroups. The work offers a principled approach to multimodal decision support in cardiovascular diagnostics.
📝 Abstract
Multimodal data fusion is a key approach for enhancing diagnosis in medical applications. We propose an asymmetric fusion strategy starting from a primary modality and integrating secondary modalities by disentangling shared and modality-specific information. Validated on a dataset of 239 patients with echocardiographic time series and tabular records, our model outperforms existing methods, achieving an AUC over 90%. This improvement marks a crucial benchmark for clinical use.
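The asymmetric, disentangled fusion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the random linear encoders, the mean-squared alignment term, the per-sample orthogonality penalty, the concatenation-based fusion, and all names and dimensions (`asymmetric_fuse`, `d_latent`, and so on) are assumptions chosen for illustration. Only the overall idea comes from the text: a primary modality, an auxiliary modality, and a split into shared and modality-specific parts.

```python
import random


def random_matrix(rng, rows, cols):
    """Gaussian matrix, used for both stand-in data and hypothetical encoder weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linear(x, w):
    """Apply weights w (d_in x d_out) to a batch x (n x d_in)."""
    return [[sum(xi * wi for xi, wi in zip(row, col)) for col in zip(*w)] for row in x]


def alignment_loss(a, b):
    """Mean squared difference between two shared embeddings (assumed alignment term)."""
    count = len(a) * len(a[0])
    return sum((u - v) ** 2 for ra, rb in zip(a, b) for u, v in zip(ra, rb)) / count


def orthogonality_loss(shared, specific):
    """Mean squared per-sample dot product between shared and specific parts
    (assumed disentanglement penalty; zero when the parts are orthogonal)."""
    dots = (sum(u * v for u, v in zip(rs, rp)) for rs, rp in zip(shared, specific))
    return sum(d ** 2 for d in dots) / len(shared)


def asymmetric_fuse(primary_shared, primary_specific, auxiliary_specific):
    """Keep the full primary representation and append only the auxiliary
    modality's specific part, since its shared part is assumed redundant."""
    return [ps + pp + ax for ps, pp, ax in
            zip(primary_shared, primary_specific, auxiliary_specific)]


rng = random.Random(0)
n_patients, d_echo, d_tab, d_latent = 8, 16, 6, 4  # illustrative sizes only

echo = random_matrix(rng, n_patients, d_echo)  # stand-in for pooled echo features
tab = random_matrix(rng, n_patients, d_tab)    # stand-in for tabular features

echo_shared = linear(echo, random_matrix(rng, d_echo, d_latent))
echo_specific = linear(echo, random_matrix(rng, d_echo, d_latent))
tab_shared = linear(tab, random_matrix(rng, d_tab, d_latent))
tab_specific = linear(tab, random_matrix(rng, d_tab, d_latent))

fused = asymmetric_fuse(echo_shared, echo_specific, tab_specific)
aux_loss = (alignment_loss(echo_shared, tab_shared)
            + orthogonality_loss(echo_shared, echo_specific)
            + orthogonality_loss(tab_shared, tab_specific))
print(len(fused), len(fused[0]), aux_loss)
```

In a trained model, `aux_loss` would presumably be added to the diagnostic classification loss, and `fused` would feed a classifier head. Both steps are omitted here because the abstract does not specify them.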