Large Language Models are Powerful EHR Encoders

📅 2025-02-24
📈 Citations: 0
✹ Influential: 0
đŸ€– AI Summary
Electronic health record (EHR) data are highly heterogeneous, and existing specialized models rely heavily on private, institution-specific medical datasets, limiting generalizability and scalability. Method: We propose a paradigm that uses general-purpose large language models (LLMs) as lightweight EHR encoders. Patient records are serialized into structured Markdown text, and clinical codes are mapped to semantically rich natural language descriptions. This enables zero-shot or few-shot clinical prediction by harnessing the broad generalization capabilities LLMs acquire from public corpora. Contribution/Results: We conduct the first systematic evaluation on the EHRSHOT benchmark across 15 tasks, demonstrating that general LLM embedding models (e.g., GTE-Qwen2-7B and LLM2Vec-Llama3.1-8B) match or surpass specialized foundation models such as CLMBR-T-Base, with superior robustness particularly in few-shot settings. We further identify positive correlations between LLM parameter count, context length, and predictive performance. Our findings establish that general LLMs can effectively replace domain-specific EHR encoders, significantly improving cross-institutional generalizability and deployment scalability.
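The serialization step described above can be illustrated with a minimal sketch. The field names, record schema, and code descriptions here are illustrative assumptions, not the authors' exact format; the idea is simply that structured records become Markdown text with clinical codes replaced by human-readable descriptions before being fed to an LLM embedding model.

```python
# Hedged sketch: serialize a toy patient record into structured Markdown,
# mapping clinical codes to natural language descriptions.
# Schema and descriptions are illustrative, not the paper's exact format.

def serialize_record(patient: dict, code_descriptions: dict) -> str:
    """Render a patient record as Markdown, with codes replaced by
    human-readable descriptors (falling back to the raw code)."""
    lines = [f"# Patient {patient['id']}", ""]
    for visit in patient["visits"]:
        lines.append(f"## Visit on {visit['date']}")
        for code in visit["codes"]:
            lines.append(f"- {code_descriptions.get(code, code)}")
        lines.append("")
    return "\n".join(lines)

record = {
    "id": "0001",
    "visits": [{"date": "2024-05-01", "codes": ["E11.9", "I10"]}],
}
descriptions = {
    "E11.9": "Type 2 diabetes mellitus without complications",
    "I10": "Essential (primary) hypertension",
}
print(serialize_record(record, descriptions))
```

The resulting Markdown string would then be passed to an embedding model (e.g., GTE-Qwen2-7B-Instruct) to obtain a fixed-length patient representation.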

📝 Abstract
Electronic Health Records (EHRs) offer rich potential for clinical prediction, yet their inherent complexity and heterogeneity pose significant challenges for traditional machine learning approaches. Domain-specific EHR foundation models trained on large collections of unlabeled EHR data have demonstrated promising improvements in predictive accuracy and generalization; however, their training is constrained by limited access to diverse, high-quality datasets and by inconsistencies in coding standards and healthcare practices. In this study, we explore the possibility of using general-purpose Large Language Model (LLM)-based embedding methods as EHR encoders. By serializing patient records into structured Markdown text and transforming codes into human-readable descriptors, we leverage the extensive generalization capabilities of LLMs pretrained on vast public corpora, thereby bypassing the need for proprietary medical datasets. We systematically evaluate two state-of-the-art LLM-embedding models, GTE-Qwen2-7B-Instruct and LLM2Vec-Llama3.1-8B-Instruct, across 15 diverse clinical prediction tasks from the EHRSHOT benchmark, comparing their performance to an EHR-specific foundation model, CLMBR-T-Base, and to traditional machine learning baselines. Our results demonstrate that LLM-based embeddings frequently match or exceed the performance of specialized models, even in few-shot settings, and that their effectiveness scales with the size of the underlying LLM and the available context window. Overall, our findings demonstrate that repurposing LLMs for EHR encoding offers a scalable and effective approach to clinical prediction, capable of overcoming the limitations of traditional EHR modeling and facilitating more interoperable and generalizable healthcare applications.
Problem

Research questions and friction points this paper is trying to address.

LLMs as EHR encoders
Overcoming EHR complexity
Improving clinical prediction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs as EHR encoders
Serializing records into Markdown
Leveraging pretrained LLMs' capabilities
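Once patient records are embedded as fixed vectors, few-shot prediction reduces to a lightweight classifier over those embeddings. The sketch below uses a nearest-centroid rule over hand-made 2-D vectors standing in for real LLM embeddings; the label names and vectors are illustrative assumptions, not the paper's actual probe or data.

```python
# Hedged sketch of few-shot prediction on frozen LLM embeddings:
# a nearest-centroid classifier using cosine similarity.
# The tiny 2-D vectors stand in for real embedding outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def predict(query, support):
    """support: dict mapping label -> list of embedding vectors (the few shots).
    Returns the label whose centroid is most similar to the query embedding."""
    centroids = {label: centroid(vecs) for label, vecs in support.items()}
    return max(centroids, key=lambda lbl: cosine(query, centroids[lbl]))

# Hypothetical few-shot support set for a binary readmission task.
support = {
    "readmitted": [[0.9, 0.1], [0.8, 0.2]],
    "not_readmitted": [[0.1, 0.9], [0.2, 0.8]],
}
print(predict([0.85, 0.15], support))  # → readmitted
```

In practice, a logistic-regression head trained on the few labeled embeddings would play the same role; the point is that the EHR encoder itself stays frozen.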
Stefan Hegselmann
Research associate, Berlin Institute of Health at Charité
Medicine · Machine Learning · NLP · LLM · Medical Informatics
Georg von Arnim
Center for Digital Health, Berlin Institute of Health (BIH), Charité - University Medicine Berlin, Berlin, Germany; German Heart Center of the Charité, Berlin, Germany
Tillmann Rheude
Center for Digital Health, Berlin Institute of Health (BIH), Charité - University Medicine Berlin, Berlin, Germany
Noel Kronenberg
Center for Digital Health, Berlin Institute of Health (BIH), Charité - University Medicine Berlin, Berlin, Germany
David Sontag
Professor, Massachusetts Institute of Technology
Machine Learning · Healthcare · Artificial Intelligence · Large Language Models · Approximate Inference
Gerhard Hindricks
German Heart Center of the Charité, Berlin, Germany
Roland Eils
Professor for Digital Health, Charité-UniversitÀtsmedizin Berlin and Berlin Institute of Health
Digital Health · Systems Biology · Cancer Research · Medical Informatics · Data Sciences
Benjamin Wild
Group Leader, Berlin Institute of Health
Machine Learning · Healthcare · Unsupervised Learning · Social networks · Apis mellifera