🤖 AI Summary
Federated learning on medical data faces three obstacles: heterogeneous EHR systems across institutions, substantial communication overhead, and the difficulty of customizing a model for a specific client. This paper proposes EHRFL, a client-centric federated learning framework that targets all three. EHRFL uses textual EHR representation, patient-level embedding aggregation, and heterogeneous data alignment to train a model tailored to a single institution. It also includes a lightweight client selection mechanism based on averaged patient embeddings, which makes participant matching more efficient while preserving privacy and regulatory compliance. Experiments on multiple open-source EHR datasets show that EHRFL matches or exceeds the performance of the global model while using only 30%–50% of participating institutions. It also reduces communication and computational overhead, which supports scalable, practical deployment in real-world healthcare federations.
📝 Abstract
The increasing volume of electronic health records (EHRs) presents an opportunity to improve the accuracy and robustness of models for clinical prediction tasks. Unlike traditional centralized approaches, federated learning enables training on data from multiple institutions while preserving patient privacy and complying with regulatory constraints. However, most federated learning research focuses on building a global model that serves multiple clients, overlooking the practical need for a client-specific model. In this work, we introduce EHRFL, a federated learning framework for EHRs designed to develop a model tailored to a single client (i.e., a healthcare institution). Our framework addresses two key challenges: (1) enabling federated learning across clients with heterogeneous EHR systems through text-based EHR modeling, and (2) reducing the cost of federated learning by selecting suitable participating clients using averaged patient embeddings. Our experimental results on multiple open-source EHR datasets demonstrate the effectiveness of EHRFL in addressing both challenges, establishing it as a practical solution for building a client-specific model in federated learning.
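The client selection idea in challenge (2) can be illustrated with a minimal sketch. The abstract states only that selection uses averaged patient embeddings; the cosine similarity measure, the top-k cutoff, and the function names `average_embedding`, `cosine_similarity`, and `select_clients` below are assumptions made for illustration, not the paper's stated procedure.

```python
import math
from typing import Dict, List, Sequence


def average_embedding(patient_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of one client's patient embeddings.

    Only this single averaged vector would leave the institution,
    not the per-patient embeddings.
    """
    n = len(patient_embeddings)
    dim = len(patient_embeddings[0])
    return [sum(vec[i] for vec in patient_embeddings) / n for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_clients(
    host_avg: Sequence[float],
    candidate_avgs: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Return the k candidates whose averaged embedding is closest to the host's."""
    ranked = sorted(
        candidate_avgs,
        key=lambda name: cosine_similarity(host_avg, candidate_avgs[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-dimensional embeddings; real patient embeddings would come from
# the shared text-based EHR model and have far more dimensions.
host_avg = average_embedding([[1.0, 0.0], [0.8, 0.2]])
candidate_avgs = {
    "A": average_embedding([[1.0, 0.1], [0.9, 0.1]]),
    "B": average_embedding([[0.0, 1.0], [0.1, 0.9]]),
    "C": average_embedding([[0.7, 0.3], [0.5, 0.5]]),
}
selected = select_clients(host_avg, candidate_avgs, k=2)
print(selected)
```

In this toy setup the host institution would invite clients A and C and skip B, whose averaged embedding points in a very different direction. Exchanging one vector per institution keeps the selection step cheap compared with running a full round of federated training with every candidate.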