🤖 AI Summary
This work addresses two challenges in applying large language models (LLMs) to longitudinal clinical reasoning: preserving the temporal structure and code semantics of electronic health records (EHRs), and compensating for the lack of population-level patterns when an LLM processes each patient in isolation. The authors propose RePrompT, a time-aware recurrent prompt tuning framework that integrates a structured EHR encoder with an LLM without modifying the LLM’s architecture. The method recurrently incorporates latent states from prior visits and injects population-level representations through trainable prompt tokens derived from a cohort-trained, task-aligned EHR encoder, so that it captures both individual patient dynamics and knowledge shared across patients. Experiments on MIMIC-III and MIMIC-IV show that this approach consistently outperforms both EHR-based and LLM-based baselines across multiple clinical prediction tasks.
📝 Abstract
Large Language Models (LLMs) have shown strong promise for mining Electronic Health Records (EHRs) by reasoning over longitudinal clinical information to capture context-rich patient trajectories. However, leveraging LLMs for structured EHRs (e.g., standardized diagnosis and medication codes) presents two key challenges. First, translating time-stamped EHR sequences into plain text can obscure both temporal structure and code identities, weakening the ability to capture code co-occurrence and longitudinal regularities. Second, unlike cohort-trained predictive models that learn a shared, task-aligned representation space across patients, LLMs are often applied in a case-isolated inference setting where each patient is processed independently without leveraging population-level patterns. To address these challenges, we introduce RePrompT, a time-aware LLM framework that integrates structured EHR encoders through prompt tuning, without modifying underlying architectures. Specifically, RePrompT recurrently incorporates latent states from prior visits to preserve longitudinal information, and injects population-level information through trainable prompt tokens derived from a cohort-trained, task-aligned EHR encoder. Experiments on MIMIC-III and MIMIC-IV demonstrate that RePrompT consistently outperforms both EHR-based and LLM-based baselines across multiple clinical prediction tasks.
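The two mechanisms named in the abstract, a recurrent latent state carried across visits and prompt tokens derived from a cohort-trained encoder, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `RecurrentPromptBuilder`, the projections `W_pop` and `W_rec`, the mean-pooled code embeddings, and the `tanh` state update are all hypothetical stand-ins, and random weights replace anything that would actually be learned or produced by a frozen LLM.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class RecurrentPromptBuilder:
    """Hypothetical sketch: builds two soft-prompt vectors per visit."""

    def __init__(self, n_codes, enc_dim, llm_dim, seed=0):
        rng = random.Random(seed)
        # Stand-in for a cohort-trained, task-aligned EHR encoder.
        self.code_emb = rand_matrix(n_codes, enc_dim, rng)
        # Projects the encoder output into the LLM embedding space.
        self.W_pop = rand_matrix(llm_dim, enc_dim, rng)
        # Projects the previous visit's latent state into a prompt token.
        self.W_rec = rand_matrix(llm_dim, llm_dim, rng)
        self.llm_dim = llm_dim

    def encode_visit(self, codes):
        """Mean-pool the embeddings of a non-empty list of code indices."""
        dim = len(self.code_emb[0])
        pooled = [0.0] * dim
        for c in codes:
            for j in range(dim):
                pooled[j] += self.code_emb[c][j]
        return [v / len(codes) for v in pooled]

    def step(self, codes, prev_state):
        """Return ([recurrent_token, population_token], new_state)."""
        pop_token = matvec(self.W_pop, self.encode_visit(codes))
        rec_token = matvec(self.W_rec, prev_state)
        # Assumption: tanh of the summed tokens stands in for the latent
        # state a frozen LLM would produce after reading this visit.
        new_state = [math.tanh(p + r) for p, r in zip(pop_token, rec_token)]
        return [rec_token, pop_token], new_state


def run_patient(builder, visits):
    """Walk visits in time order, carrying the latent state forward."""
    state = [0.0] * builder.llm_dim
    prompts = []
    for codes in visits:
        tokens, state = builder.step(codes, state)
        prompts.append(tokens)
    return prompts, state


builder = RecurrentPromptBuilder(n_codes=5, enc_dim=4, llm_dim=3)
prompts, final_state = run_patient(builder, [[0, 1], [2], [3, 4]])
```

In this sketch each visit contributes two prompt vectors: one computed from the previous latent state and one from the encoder. In a real prompt-tuning setup such vectors would be prepended to the LLM's input embeddings while the LLM weights stay frozen; only the projections and the encoder would be trained.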