🤖 AI Summary
Traditional EHR models struggle to integrate clinical data with measures of genetic risk, such as polygenic risk scores (PRS), which limits accurate disease prediction and personalized care. To address this, we propose the first multimodal EHR foundation model that natively embeds PRS as an independent modality, trained on the All of Us cohort to learn joint genomic–clinical representations. Methodologically, we extend advances in generative AI to EHR foundation models and apply transfer learning to custom classification tasks, supporting interpretable analysis and fine-grained risk stratification. Experiments show performance gains over baselines in predicting the onset of key conditions, including type 2 diabetes (AUC improvement ≥0.04), and highlight potential for real-world evidence generation and personalized health management. Our core contribution is the native representation of PRS as its own modality and its joint representation learning with EHR data, which provides a basis for genetically informed clinical modeling.
📝 Abstract
This paper introduces an Electronic Health Record (EHR) foundation model that integrates Polygenic Risk Scores (PRS) as a foundational data modality, moving beyond traditional EHR-only approaches to build more holistic health profiles. Leveraging the extensive and diverse data from the All of Us (AoU) Research Program, this multimodal framework aims to learn complex relationships between clinical data and genetic predispositions. The methodology extends advancements in generative AI to the EHR foundation model space, enhancing predictive capabilities and interpretability. Evaluation on AoU data demonstrates the model's predictive value for the onset of various conditions, particularly Type 2 Diabetes (T2D), and illustrates the interplay between PRS and EHR data. The work also explores transfer learning for custom classification tasks, showcasing the architecture's versatility and efficiency. This approach supports new insights into disease prediction, proactive health management, risk stratification, and personalized treatment strategies, laying the groundwork for more equitable and actionable real-world evidence generation in healthcare.