AI Summary
This work addresses the performance degradation in federated learning caused by client data heterogeneity by proposing FedGMI, a framework that effectively balances personalization and generalization. FedGMI introduces, for the first time, a probabilistic mixture modeling paradigm, representing each client's data distribution as a convex combination of multiple shared latent distributions. A variational autoencoder (VAE) is employed as a generative density estimator to jointly infer both the mixture components and the shared distributions. Experimental results demonstrate that FedGMI accurately identifies intrinsic data distributions, estimates mixture weights, and maintains robust performance under communication constraints, significantly enhancing personalized model effectiveness and collaborative learning efficiency.
Abstract
Federated Learning (FL) facilitates collaborative model training across decentralized clients while preserving data privacy by avoiding raw data exchange. Despite its potential, FL performance is often compromised by data heterogeneity across clients. To address this, Clustered Federated Learning (CFL) groups clients with similar data distributions to improve model performance, but it is constrained by intra-cluster heterogeneity. Conversely, Personalized Federated Learning (PFL) tailors models to individual clients, but usually neglects the underlying structural similarities among clients. In this work, we investigate a probabilistic mixture (PM) scenario, where each client's local data distribution is modeled as a convex combination of several shared inherent distributions. To effectively model this structure, we propose FedGMI, a framework that utilizes Variational Autoencoders (VAEs) as generative density estimators to represent these inherent distributions and infer the mixture components of clients' local data distributions. This approach enables structured personalization without sacrificing the benefits of collaborative learning. Extensive experiments demonstrate that FedGMI effectively characterizes and discriminates the inherent distributions and accurately estimates mixture proportions. Furthermore, FedGMI maintains robust performance even under communication cost constraints.
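To make the probabilistic mixture scenario concrete, the sketch below treats one client's data as drawn from a convex combination of two shared distributions and recovers the mixture proportions. It is an illustration under stated assumptions, not FedGMI's actual procedure: fixed 1-D Gaussians stand in for the VAE density estimators, the weights are fitted with plain EM while the component densities are held fixed, and every name (`gaussian_logpdf`, `estimate_mixture_weights`, `true_weights`) is hypothetical.

```python
import numpy as np


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1-D Gaussian (assumed stand-in for a learned density estimator)."""
    return -0.5 * np.log(2 * np.pi * std**2) - (x - mean) ** 2 / (2 * std**2)


def estimate_mixture_weights(log_dens, n_iter=200):
    """EM for mixture weights with the component densities held fixed.

    log_dens: array of shape (n_samples, n_components) holding log p_k(x_n).
    Returns a weight vector on the probability simplex.
    """
    _, n_components = log_dens.shape
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities proportional to weights[k] * p_k(x_n),
        # computed in log space for numerical stability.
        log_r = log_dens + np.log(weights)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: each new weight is the mean responsibility of its component.
        weights = r.mean(axis=0)
    return weights


# Simulate one client whose data mixes two shared distributions (assumed values).
rng = np.random.default_rng(0)
true_weights = np.array([0.7, 0.3])
means = np.array([-3.0, 3.0])
component = rng.choice(2, size=5000, p=true_weights)
x = rng.normal(means[component], 1.0)

# Evaluate every sample under every shared component, then fit the weights.
log_dens = np.stack([gaussian_logpdf(x, m, 1.0) for m in means], axis=1)
est_weights = estimate_mixture_weights(log_dens)
print(est_weights)
```

In this toy setting the estimated weights stay non-negative and sum to one by construction, which is the convex-combination constraint described in the abstract. In a federated setting the shared densities would themselves have to be learned across clients, a step this sketch omits.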