AI Summary
This work addresses the multi-level incompleteness of multimodal electronic health records (EHRs), which arises from irregular sampling, missing modalities, and sparse labels, by introducing a clinical point cloud paradigm. It unifies heterogeneous clinical events as points in a continuous 4D space defined by content, time, modality, and patient identity. A low-rank relational attention mechanism captures high-order dependencies among arbitrary events, and hierarchical interactions with an efficient sampling strategy enable fine-grained self-supervised learning without rigid alignment. The proposed method substantially outperforms existing models on large-scale EHR-based risk prediction tasks and remains robust, particularly under high rates of missing data.
Abstract
Deep learning-based modeling of multimodal Electronic Health Records (EHRs) has become an important approach for clinical diagnosis and risk prediction. However, due to diverse clinical workflows and privacy constraints, raw EHRs are inherently multi-level incomplete, including irregular sampling, missing modalities, and sparse labels. These issues cause temporal misalignment, modality imbalance, and limited supervision. Most existing multimodal methods assume relatively complete data, and even methods designed for incompleteness usually address only one or two of these issues in isolation. As a result, they often rely on rigid temporal/modal alignment or discard incomplete data, which may distort raw clinical semantics. To address this problem, we propose HealthPoint (HP), a unified clinical point cloud paradigm for multi-level incomplete EHRs. HP represents heterogeneous clinical events as points in a continuous 4D space defined by content, time, modality, and case. To model interactions between arbitrary point pairs, we introduce a Low-Rank Relational Attention mechanism that efficiently captures high-order dependencies across these four dimensions. We further develop a hierarchical interaction and sampling strategy to balance fine-grained modeling and computational efficiency. Built on this framework, HP enables flexible event-level interaction and fine-grained self-supervision, supporting robust modality recovery and effective use of unlabeled data. Experiments on large-scale EHR datasets for risk prediction show that HP consistently achieves state-of-the-art performance and strong robustness under varying degrees of incompleteness.
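To make the point representation concrete, the following is a minimal sketch, not HP's actual implementation. It assumes a hypothetical nested record layout (`{case: {modality: [(time, features), ...]}}`) and flattens it into points carrying content, time, modality, and case, with no imputation or resampling. The `low_rank_score` function is a generic rank-r bilinear score over the content vectors only; it illustrates the idea of low-rank factorization for pairwise scoring and should not be read as the paper's Low-Rank Relational Attention. All names (`ClinicalPoint`, `to_point_cloud`, `low_rank_score`, `U`, `V`) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class ClinicalPoint:
    """One clinical event as a point: (content, time, modality, case)."""
    content: Tuple[float, ...]  # hypothetical feature vector for the event
    time: float                 # raw timestamp in hours, never resampled
    modality: str               # e.g. "lab", "vitals", "note"
    case: str                   # patient or admission identifier


# Assumed input layout: {case: {modality: [(time, features), ...]}}
Records = Dict[str, Dict[str, List[Tuple[float, Sequence[float]]]]]


def to_point_cloud(records: Records) -> List[ClinicalPoint]:
    """Flatten nested records into a flat list of points.

    An absent modality simply contributes no points; nothing is imputed
    or aligned to a shared time grid.
    """
    points = []
    for case, by_modality in records.items():
        for modality, events in by_modality.items():
            for time, features in events:
                points.append(ClinicalPoint(tuple(features), time, modality, case))
    return points


def low_rank_score(x_i, x_j, U, V):
    """Bilinear score x_i^T (U^T V) x_j computed through rank-r projections.

    U and V are r x d matrices given as lists of rows, so each pair costs
    O(r * d) rather than O(d * d) for a full d x d relation matrix.
    """
    proj_i = [sum(u * x for u, x in zip(row, x_i)) for row in U]
    proj_j = [sum(v * x for v, x in zip(row, x_j)) for row in V]
    return sum(a * b for a, b in zip(proj_i, proj_j))


# Toy data: irregular timestamps, and case_B has no vitals at all.
records = {
    "case_A": {
        "lab": [(0.5, [1.0, 0.0]), (7.25, [0.0, 2.0])],
        "vitals": [(1.0, [3.0, 1.0])],
    },
    "case_B": {"lab": [(2.0, [1.0, 1.0])]},
}
cloud = to_point_cloud(records)

# Rank r = 1 factors for feature dimension d = 2 (arbitrary toy values).
U = [[1.0, 0.0]]
V = [[0.0, 1.0]]
score = low_rank_score(cloud[0].content, cloud[1].content, U, V)
```

In this layout a missing modality needs no placeholder, and any two points can be scored against each other regardless of which case or modality they came from. A fuller model would presumably also condition the score on time, modality, and case, which this sketch leaves out.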