A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited generalization and safety of large medical language models trained on single-institution data, as well as the challenges posed by high communication costs and clinical data heterogeneity in conventional federated learning. To overcome these issues, the authors propose Fed-MedLoRA and its enhanced variant Fed-MedLoRA+, a model-agnostic, parameter-efficient federated learning framework that transmits only low-rank adapter (LoRA) parameters. The approach incorporates a data-aware adaptive aggregation mechanism to mitigate cross-institutional heterogeneity. Experimental results across five patient cohorts demonstrate that the proposed method substantially reduces communication and computational overhead while consistently outperforming baseline models—including BERT, LLaMA-3, and DeepSeek-R1—in in-domain evaluation, external validation, and adaptation to low-resource new sites.

📝 Abstract
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis. To enable their use in clinical settings, LLMs are typically further adapted through continued pretraining or post-training on clinical data. However, most medical LLMs are trained on data from a single institution, which limits their generalizability and safety across heterogeneous health systems. Federated learning (FL) is a promising solution for enabling collaborative model development across healthcare institutions. Yet applying FL to LLMs in medicine remains fundamentally limited. First, conventional FL requires transmitting the full model during each communication round, which is impractical for multi-billion-parameter LLMs given limited computational resources. Second, many FL algorithms implicitly assume data homogeneity, whereas real-world clinical data are highly heterogeneous across patients, diseases, and institutional practices. We introduce Fed-MedLoRA, a model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications. Fed-MedLoRA transmits only low-rank adapter parameters, reducing communication and computation overhead, while Fed-MedLoRA+ further incorporates adaptive, data-aware aggregation to improve convergence under cross-site heterogeneity. We apply the framework to clinical information extraction (IE), which transforms patient narratives into structured medical entities and relations. Accuracy was assessed across five patient cohorts through comparisons with BERT, LLaMA-3, DeepSeek-R1, and GPT-4o models. Evaluation settings included (1) in-domain training and testing, (2) external validation on independent cohorts, and (3) a low-resource new-site adaptation scenario using real-world clinical notes from the Yale New Haven Health System.
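The core mechanism in the abstract, clients exchanging only low-rank adapter matrices and the server combining them with data-aware weights, can be sketched as below. This is a minimal illustration, not the paper's implementation: the exact aggregation rule of Fed-MedLoRA+ is not given here, so simple data-size weighting (FedAvg-style) stands in for "data-aware" aggregation, and the function name and shapes are illustrative.

```python
import numpy as np

def aggregate_lora(client_adapters, client_sizes):
    """Server-side aggregation over LoRA adapter matrices only.

    Hypothetical sketch: each client uploads its low-rank pair (A, B)
    instead of full model weights; the server averages the pairs,
    weighted here by local data volume as one plausible form of
    data-aware aggregation (the paper's actual rule may differ).
    """
    weights = np.array(client_sizes, dtype=float)
    weights /= weights.sum()  # normalize so weights sum to 1
    agg_A = sum(w * A for w, (A, _) in zip(weights, client_adapters))
    agg_B = sum(w * B for w, (_, B) in zip(weights, client_adapters))
    return agg_A, agg_B

# Toy round: rank-4 adapters for a 16x16 weight matrix from 3 sites
# with heterogeneous cohort sizes.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(4, 16)), rng.normal(size=(16, 4)))
           for _ in range(3)]
sizes = [1000, 250, 250]
A, B = aggregate_lora(clients, sizes)
# Each client transmits 2 * 16 * 4 = 128 floats per round, versus
# 16 * 16 = 256 for this single full weight matrix; for real LLM
# layers the ratio is far more favorable to the adapter.
```

Note that averaging A and B separately, as above, is the common practical shortcut; it is not identical to averaging the products B·A, which is one of the subtleties that motivates more careful aggregation schemes under heterogeneity.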
Problem

Research questions and friction points this paper is trying to address.

federated learning
large language models
medical data heterogeneity
parameter efficiency
clinical information extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning
Parameter-Efficient Fine-Tuning
Medical Large Language Models
Low-Rank Adaptation
Data Heterogeneity
Anran Li
Yale University
Trustworthy AI · medical LLMs · federated learning
Yuanyuan Chen
Nanyang Technological University
Wenjun Long
College of Computing and Data Science, Nanyang Technological University, Singapore
Yu Yin
Department of Earth Science and Engineering, Imperial College London, London, United Kingdom
Yan Hu
University of Texas Health Science Center at Houston
Natural Language Processing
Hyunjae Kim
Yale University
Natural Language Processing · Biomedical Informatics · Healthcare
Weipeng Zhou
Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
Yujia Zhou
Yale University
Hongyi Peng
College of Computing and Data Science, Nanyang Technological University, Singapore
Yang Ren
Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
Xuguang Ai
Biomedical Informatics & Data Science, Yale University
AI in Healthcare · Data Science · NLP · Biomedical Informatics
Zhenyue Qin
Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, USA
Ming Hu
School of Computing and Information Systems, Singapore Management University, Singapore
Xiaoxiao Li
Assistant Professor, UBC; Vector Institute; CIFAR AI Chair; Canada Research Chair
Deep Learning · Trustworthy AI · AI for Healthcare
Han Yu
Associate Professor, CCDS, Nanyang Technological University, Singapore
Federated Learning · Collaborative Learning · Trustworthy Machine Learning · AI Ethics
Y. Tham
Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
Lucila Ohno-Machado
University of California San Diego
Biomedical Informatics · Predictive Modeling
Hua Xu
Robert T. McCluskey Professor, Section of Biomedical Informatics and Data Science, Yale University
natural language processing · text mining
Qingyu Chen
Biomedical Informatics & Data Science, Yale University; NCBI-NLM, National Institutes of Health
Text mining · Machine learning · Data curation · BioNLP · Medical Imaging Analysis