LLM-driven Medical Report Generation via Communication-efficient Heterogeneous Federated Learning

📅 2025-06-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Medical report generation (MRG) faces critical challenges under multi-center federated learning (FL), including privacy sensitivity, heterogeneous data modalities, and severe communication constraints. Method: We propose FedMRG—the first privacy-preserving FL framework for MRG—featuring (i) low-rank gradient decomposition to drastically reduce communication overhead; (ii) client-aware contrastive learning and diagnosis-driven prompt encoding to mitigate statistical heterogeneity; and (iii) a dual-adapter co-adaptive decoding mechanism to jointly model cross-center imaging feature discrepancies and report stylistic variations. Results: Evaluated on our newly established FL-MRG benchmark, FedMRG achieves state-of-the-art clinical accuracy, significantly improves cross-center generalization, reduces communication cost by 62%, and enables deployment in bandwidth-constrained real-world clinical environments.

Technology Category

Application Category

📝 Abstract
LLMs have demonstrated significant potential in Medical Report Generation (MRG), yet their development requires large amounts of medical image-report pairs, which are commonly scattered across multiple centers. Centralizing these data is exceptionally challenging due to privacy regulations, thereby impeding model development and broader adoption of LLM-driven MRG models. To address this challenge, we present FedMRG, the first framework that leverages Federated Learning (FL) to enable privacy-preserving, multi-center development of LLM-driven MRG models, specifically designed to overcome the critical challenge of communication-efficient LLM training under multi-modal data heterogeneity. To start with, our framework tackles the fundamental challenge of communication overhead in FL-LLM tuning by employing low-rank factorization to efficiently decompose parameter updates, significantly reducing gradient transmission costs and making LLM-driven MRG feasible in bandwidth-constrained FL settings. Furthermore, we observed the dual heterogeneity in MRG under the FL scenario: varying image characteristics across medical centers, as well as diverse reporting styles and terminology preferences. To address this, we further enhance FedMRG with (1) client-aware contrastive learning in the MRG encoder, coupled with diagnosis-driven prompts, which capture both globally generalizable and locally distinctive features while maintaining diagnostic accuracy; and (2) a dual-adapter mutual boosting mechanism in the MRG decoder that harmonizes generic and specialized adapters to address variations in reporting styles and terminology. Through extensive evaluation of our established FL-MRG benchmark, we demonstrate the generalizability and adaptability of FedMRG, underscoring its potential in harnessing multi-center data and generating clinically accurate reports while maintaining communication efficiency.
Problem

Research questions and friction points this paper is trying to address.

Privacy-preserving multi-center LLM training for medical reports
Reducing communication costs in federated learning for MRG
Addressing data and reporting heterogeneity across medical centers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning for privacy-preserving multi-center MRG
Low-rank factorization reduces communication overhead
Client-aware contrastive learning addresses data heterogeneity
🔎 Similar Papers
Haoxuan Che
Haoxuan Che
Hong Kong University of Science and Technology
Interactive Video GenerationModel Generalization
Haibo Jin
Haibo Jin
HKUST
Computer VisionMedical Image AnalysisVision-Language Modeling
Z
Zhengrui Guo
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong SAR, China
Y
Yi Lin
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong SAR, China
C
Cheng Jin
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong SAR, China
H
Hao Chen
Department of Computer Science and Engineering, Department of Chemical and Biological Engineering and Division of Life Science, Hong Kong University of Science and Technology, Hong Kong, China