🤖 AI Summary
Federated learning (FL) faces two major challenges: high communication overhead and poor generalization under non-independent and identically distributed (non-IID) client data. To address these, we propose a prototype-guided lightweight adapter framework. It mitigates statistical heterogeneity by aligning local feature representations with globally shared class prototypes, while replacing full-model transmission with low-rank adapters to drastically reduce communication costs. The method inherently offers interpretability, since prototypes serve as semantic class centers, and enables efficient collaborative training. Experiments on a real-world retinal fundus image dataset demonstrate that our approach outperforms mainstream baselines in classification accuracy. Moreover, it supports prototype-based attribution analysis of model behavior, enabling transparent diagnosis of model decisions. This work establishes a paradigm for medical FL that jointly optimizes performance, communication efficiency, and interpretability.
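The prototype-based attribution mentioned above can be illustrated with a minimal sketch: a prediction is explained by ranking the globally shared class prototypes by their distance to a sample's embedding. The function name, the two-dimensional toy embeddings, and the class labels below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def prototype_attribution(embedding, prototypes, class_names):
    """Rank classes by the distance from an embedding to each class prototype.

    The nearest prototype acts as a semantic explanation of the prediction.
    """
    dists = np.linalg.norm(prototypes - embedding, axis=1)
    order = np.argsort(dists)
    return [(class_names[i], float(dists[i])) for i in order]

# Toy 2-D prototypes for three hypothetical fundus classes (illustrative only).
prototypes = np.array([[1.0, 0.0],    # "normal"
                       [0.0, 1.0],    # "glaucoma"
                       [-1.0, 0.0]])  # "dr" (diabetic retinopathy)
classes = ["normal", "glaucoma", "dr"]

# An embedding close to the "normal" prototype is attributed to that class.
ranking = prototype_attribution(np.array([0.9, 0.1]), prototypes, classes)
```

Because the explanation is a distance ranking over shared prototypes, the same attribution can be computed at any client without exchanging raw data.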
📝 Abstract
Federated learning (FL) provides a promising paradigm for collaboratively training machine learning models across distributed data sources while preserving privacy. Nevertheless, real-world FL often faces major challenges, including the communication overhead of transferring large model parameters and statistical heterogeneity arising from non-independent and identically distributed (non-IID) data across clients. In this work, we propose an FL framework that 1) provides inherent interpretations using prototypes, and 2) tackles statistical heterogeneity by utilising lightweight adapter modules that act as compressed surrogates of local models and guide clients towards generalisation despite varying client distributions. Each client locally refines its model by aligning class embeddings toward prototype representations and simultaneously adjusts the lightweight adapter. Our approach replaces the communication of entire model weights with the exchange of prototypes and lightweight adapters. This design ensures that each client's model aligns with a globally shared structure while minimising communication load and providing inherent interpretations. We conduct our experiments on a real-world retinal fundus image dataset that provides clinical-site information, demonstrate the framework's inherent interpretability, and show improvements in classification accuracy over baseline algorithms.
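The two communicated components described above can be sketched as follows: a low-rank adapter applied as a residual transform of local features, and a loss term that pulls each class embedding toward its globally shared prototype. All names, shapes, and the exact loss form here are assumptions for illustration; the paper's concrete architecture may differ.

```python
import numpy as np

def lowrank_adapter(x, W_down, W_up):
    """Residual low-rank adapter: project d -> r -> d, then add back the input.

    Only W_down (d x r) and W_up (r x d) would be communicated, which is far
    smaller than the full model when the rank r is small.
    """
    return x + x @ W_down @ W_up

def prototype_alignment_loss(embeddings, labels, prototypes):
    """Mean squared distance between each embedding and its class prototype."""
    diffs = embeddings - prototypes[labels]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

rng = np.random.default_rng(0)
d, r, n_classes = 8, 2, 3                     # assumed feature dim / rank / classes
W_down = rng.normal(scale=0.01, size=(d, r))
W_up = rng.normal(scale=0.01, size=(r, d))

x = rng.normal(size=(5, d))                   # a mini-batch of local features
z = lowrank_adapter(x, W_down, W_up)          # adapted embeddings

prototypes = rng.normal(size=(n_classes, d))  # globally shared class prototypes
labels = np.array([0, 1, 2, 0, 1])
loss = prototype_alignment_loss(z, labels, prototypes)
```

In a full training loop, each client would minimise this alignment term alongside its classification loss and then upload only the adapter weights (and local prototype statistics) to the server.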