🤖 AI Summary
This study investigates the zero-shot clinical event prediction capability of GPT-style generative pre-trained models on longitudinal electronic health record (EHR) data, without fine-tuning or labeled supervision. Methodologically, patient-specific temporal sequences of medical concepts are framed as a generative modeling task; next-event prediction is performed via zero-shot prompting and probabilistic decoding, leveraging an EHR-specific temporal concept encoding scheme. Key contributions include: (1) the first systematic validation of zero-shot prediction feasibility using EHR foundation models; and (2) the introduction of a "prediction-as-generation" paradigm that eliminates reliance on task-specific fine-tuning and annotated labels. Experiments demonstrate strong performance: an average top-1 precision of 0.614 and recall of 0.524 across diverse clinical events; notably, high true positive rates and low false positive rates are achieved for 12 critical diagnostic categories, confirming the model's capacity to capture complex patient trajectories and long-range temporal dependencies.
📝 Abstract
Longitudinal data in electronic health records (EHRs) represent an individual's clinical history through a sequence of codified concepts, including diagnoses, procedures, medications, and laboratory tests. Foundation models, such as generative pre-trained transformers (GPT), can leverage these data to predict future events. While fine-tuning such models enhances task-specific performance, it is costly, complex, and unsustainable for every target. We show that a foundation model trained on EHRs can perform predictive tasks in a zero-shot manner, eliminating the need for fine-tuning. This study presents the first comprehensive analysis of zero-shot forecasting with GPT-based foundation models in EHRs, introducing a novel pipeline that formulates medical concept prediction as a generative modeling task. Unlike supervised approaches that require extensive labeled data, our method enables the model to forecast the next medical event purely from its pretraining knowledge. We evaluate performance across multiple time horizons and clinical categories, demonstrating the model's ability to capture latent temporal dependencies and complex patient trajectories without task supervision. Model performance for predicting the next medical concept was evaluated using precision and recall, achieving an average top-1 precision of 0.614 and recall of 0.524. For 12 major diagnostic conditions, the model demonstrated strong zero-shot performance, achieving high true positive rates while maintaining low false positive rates. These results demonstrate the power of a foundation EHR GPT model in capturing diverse phenotypes and enabling robust, zero-shot forecasting of clinical outcomes. This capability enhances the versatility of predictive healthcare models and reduces the need for task-specific training, enabling more scalable applications in clinical settings.
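The "prediction-as-generation" framing can be illustrated with a minimal sketch: a patient's history is a sequence of concept codes, and the next event is whichever candidate the generative model assigns the highest conditional probability. The toy bigram "model", the concept codes, and the synthetic trajectories below are illustrative assumptions, not the paper's actual EHR foundation model or vocabulary; a real GPT would condition on the full history rather than only the last code.

```python
from collections import Counter, defaultdict

# Toy stand-in for a generative model: bigram counts over concept codes.
# All codes and trajectories are synthetic, for illustration only.
def train_bigram(trajectories):
    counts = defaultdict(Counter)
    for seq in trajectories:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, history, k=1):
    """Next-concept prediction as generation: rank candidate next events
    by their conditional frequency given the last observed code."""
    last = history[-1]
    return [code for code, _ in counts[last].most_common(k)]

# Synthetic patient trajectories (diagnosis / medication / lab codes).
trajectories = [
    ["DX:E11", "RX:metformin", "LAB:HbA1c"],
    ["DX:E11", "RX:metformin", "LAB:glucose"],
    ["DX:I10", "RX:lisinopril", "LAB:creatinine"],
    ["DX:E11", "RX:metformin", "LAB:HbA1c"],
]

model = train_bigram(trajectories)
print(predict_next(model, ["DX:E11", "RX:metformin"]))  # → ['LAB:HbA1c']
```

The top-1 precision and recall reported in the abstract correspond to evaluating such top-k predictions against the event that actually occurred next in each held-out trajectory.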