AI Summary
Multi-center fMRI studies cannot pool raw data because of privacy constraints, and site-specific heterogeneity makes the data non-IID; together these factors severely undermine model generalizability. To address this, we propose Personalized Federated Dictionary Learning (PFedDL), which decomposes each site's dictionary into shared global atoms and site-specific local atoms. Global atoms are updated collaboratively via federated aggregation to enhance cross-site consistency, while local atoms are refined independently to preserve site-specific neurobiological variability. PFedDL operates without sharing raw data, supporting privacy-preserving distributed learning and interpretable feature extraction. Experiments on the multi-center ABIDE dataset show that PFedDL improves classification accuracy by 3.2–5.7% under non-IID conditions and is more robust than existing federated learning and dictionary learning methods. This work offers a paradigm for privacy-aware, interpretable multi-center neuroimaging analysis.
Abstract
Data privacy constraints pose significant challenges for large-scale neuroimaging analysis, especially in multi-site functional magnetic resonance imaging (fMRI) studies, where site-specific heterogeneity leads to non-independent and identically distributed (non-IID) data. These factors hinder the development of generalizable models. To address these challenges, we propose Personalized Federated Dictionary Learning (PFedDL), a novel federated learning framework that enables collaborative modeling across sites without sharing raw data. PFedDL performs independent dictionary learning at each site, decomposing each site-specific dictionary into a shared global component and a personalized local component. The global atoms are updated via federated aggregation to promote cross-site consistency, while the local atoms are refined independently to capture site-specific variability, thereby enhancing downstream analysis. Experiments on the ABIDE dataset demonstrate that PFedDL outperforms existing methods in accuracy and robustness across non-IID datasets.
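The global/local split described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the sparse coder, the dictionary update rule, or the aggregation weights, so the ISTA solver, the single gradient step per round, the sample-count-weighted averaging, and all function names and hyperparameters (`lam`, `lr`) below are assumptions chosen for illustration. Only dictionary atoms leave a site; the data matrices never do.

```python
import numpy as np

def normalize_columns(D):
    """Scale every atom (column) to unit L2 norm."""
    norms = np.linalg.norm(D, axis=0, keepdims=True)
    return D / np.maximum(norms, 1e-12)

def sparse_code(X, D, lam, n_iter=50):
    """ISTA for min_A 0.5*||X - D A||_F^2 + lam*||A||_1 (assumed solver)."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        Z = A - D.T @ (D @ A - X) / L
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft threshold
    return A

def local_update(X, D_global, D_local, lam=0.1, lr=0.1):
    """One site-side step on the concatenated dictionary [global | local]."""
    D = np.hstack([D_global, D_local])
    A = sparse_code(X, D, lam)
    grad = (D @ A - X) @ A.T / X.shape[1]  # gradient of the reconstruction loss
    D = normalize_columns(D - lr * grad)
    k = D_global.shape[1]
    return D[:, :k], D[:, k:]  # (global atoms to send, local atoms to keep)

def federated_round(site_data, D_global, local_dicts, lam=0.1, lr=0.1):
    """Each site updates both parts; the server averages only the global atoms."""
    sent, kept, weights = [], [], []
    for X, D_local in zip(site_data, local_dicts):
        Dg_i, Dl_i = local_update(X, D_global, D_local, lam, lr)
        sent.append(Dg_i)
        kept.append(Dl_i)
        weights.append(X.shape[1])  # assumed: weight by number of samples
    w = np.asarray(weights, dtype=float) / sum(weights)
    D_global = normalize_columns(sum(wi * Dg for wi, Dg in zip(w, sent)))
    return D_global, kept

# Synthetic stand-in for three sites (features x samples); not ABIDE data.
rng = np.random.default_rng(0)
n_features, k_global, k_local = 20, 4, 2
site_data = [rng.standard_normal((n_features, n)) for n in (30, 50, 40)]
D_global = normalize_columns(rng.standard_normal((n_features, k_global)))
local_dicts = [normalize_columns(rng.standard_normal((n_features, k_local)))
               for _ in site_data]
for _ in range(5):
    D_global, local_dicts = federated_round(site_data, D_global, local_dicts)
```

After the rounds, every site holds the same `D_global` but its own entry in `local_dicts`, which mirrors the shared and personalized components named in the abstract. A downstream classifier would presumably use the sparse codes from `sparse_code` as features, although the abstract does not say how the codes are used.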