From EHRs to Patient Pathways: Scalable Modeling of Longitudinal Health Trajectories with LLMs

📅 2025-06-05
🤖 AI Summary
Integrating heterogeneous, multi-source electronic health records (EHRs) to model long-term patient health trajectories remains a major challenge in healthcare systems. This paper introduces EHR2Path, the first framework that transforms raw EHR data into structured, interpretable patient pathway representations, enabling long-horizon modeling and multi-step longitudinal simulation. Its key contributions are: (1) a topic-aware, long-term temporal summarization mechanism that captures global contextual dependencies with high token efficiency; and (2) a unified architecture integrating large language model–based sequence modeling, temporal structural encoding, topic-driven dynamic summary token generation, and multimodal EHR alignment. Evaluated on next-time-step prediction and multi-week longitudinal simulation tasks, EHR2Path consistently outperforms state-of-the-art baselines across diverse clinical outcomes, including vital signs, laboratory test results, and length of hospital stay, with up to a 12.3% improvement in AUC.

📝 Abstract
Healthcare systems face significant challenges in managing and interpreting vast, heterogeneous patient data for personalized care. Existing approaches often focus on narrow use cases with a limited feature space, overlooking the complex, longitudinal interactions needed for a holistic understanding of patient health. In this work, we propose a novel approach to patient pathway modeling by transforming diverse electronic health record (EHR) data into a structured representation and designing a holistic pathway prediction model, EHR2Path, optimized to predict future health trajectories. Further, we introduce a novel summary mechanism that embeds long-term temporal context into topic-specific summary tokens, improving performance over text-only models, while being much more token-efficient. EHR2Path demonstrates strong performance in both next time-step prediction and longitudinal simulation, outperforming competitive baselines. It enables detailed simulations of patient trajectories, inherently targeting diverse evaluation tasks, such as forecasting vital signs, lab test results, or length-of-stay, opening a path towards predictive and personalized healthcare.

Problem

Research questions and friction points this paper is trying to address.

Modeling longitudinal health trajectories from diverse EHR data
Predicting future patient pathways with structured representations
Improving efficiency and accuracy in health trajectory simulations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transforms EHR data into structured representations
Uses topic-specific summary tokens for efficiency
Predicts future health trajectories holistically
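
To make the idea of a structured pathway representation more concrete, here is a minimal sketch of one way raw EHR events could be grouped into time steps and topics and then serialized as text for a language model. This is not the paper's implementation: the `Event` fields, the 4-hour step width, the topic names, and the `[t=...] topic: name=value` text format are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    hour: int      # hours since admission (hypothetical time unit)
    topic: str     # e.g. "vitals" or "labs" (hypothetical topic names)
    name: str      # name of the measurement
    value: float   # measured value


def to_pathway(events, step_hours=4):
    """Group events into fixed-width time steps, then by topic."""
    steps = defaultdict(lambda: defaultdict(list))
    for e in events:
        steps[e.hour // step_hours][e.topic].append((e.name, e.value))
    # Convert to plain dicts, ordered by time step.
    return {step: dict(topics) for step, topics in sorted(steps.items())}


def serialize(pathway):
    """Render the pathway as one text line per (time step, topic)."""
    lines = []
    for step, topics in pathway.items():
        for topic in sorted(topics):
            items = ", ".join(f"{n}={v}" for n, v in topics[topic])
            lines.append(f"[t={step}] {topic}: {items}")
    return "\n".join(lines)


# Illustrative, made-up events (not real patient data).
events = [
    Event(0, "vitals", "heart_rate", 88.0),
    Event(1, "labs", "lactate", 2.1),
    Event(5, "vitals", "heart_rate", 94.0),
]
pathway = to_pathway(events)
text = serialize(pathway)
print(text)
```

In a setup like this, next time-step prediction would amount to generating the lines for step `t+1` given the text up to step `t`, and longitudinal simulation would repeat that step on the model's own outputs. How EHR2Path actually encodes time, topics, and summary tokens is described in the paper itself.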