🤖 AI Summary
Existing EHR prediction models struggle to model the cross-modal interactions, redundancy, and temporal dynamics among unstructured clinical text, structured laboratory test results, and longitudinal visit sequences, which limits their performance in multi-label chronic disease prediction. To address this, the authors propose CURENet, a unified multimodal representation framework. It uses large language models to encode clinical notes and textual lab tests, Transformer encoders to model longitudinal visit sequences, and a cross-modal fusion mechanism to learn jointly from these three heterogeneous modalities. The framework thereby models clinical text, structured clinical indicators, and longitudinal patient trajectories end to end, improving both cross-modal interaction learning and the capture of temporal patterns. Evaluated on the public MIMIC-III dataset and the private FEMH dataset, it achieves over 94% accuracy in multi-label prediction of the top 10 chronic conditions.
📝 Abstract
Electronic health records (EHRs) are designed to synthesize diverse data types, including unstructured clinical notes, structured lab tests, and time-series visit data. Physicians draw on these multimodal and temporal sources of EHR data to form a comprehensive view of a patient's health, which is crucial for informed therapeutic decision-making. Yet most predictive models fail to fully capture the interactions, redundancies, and temporal patterns across data modalities, often focusing on a single data type or overlooking these complexities. In this paper, we present CURENet (Combining Unified Representations for Efficient chronic disease prediction), a multimodal model that integrates unstructured clinical notes, lab tests, and patients' time-series data. It uses large language models (LLMs) to process clinical text and textual lab tests, and transformer encoders to model longitudinal sequences of visits. CURENet captures the intricate interactions between different forms of clinical data, yielding a more reliable predictive model for chronic illnesses. We evaluated CURENet on the public MIMIC-III and private FEMH datasets, where it achieved over 94% accuracy in predicting the top 10 chronic conditions in a multi-label framework. Our findings highlight the potential of multimodal EHR integration to enhance clinical decision-making and improve patient outcomes.