🤖 AI Summary
This work addresses two limitations of existing federated learning simulations: unrealistic data partitions, and user selection strategies that neglect data correlation among clients, both of which can degrade model performance and convergence. To overcome these issues, the authors propose a metadata-driven federated learning framework whose data partition model is based on a homogeneous Poisson point process (HPPP) and captures both heterogeneous data quantities and natural overlap among client datasets. By leveraging metadata such as client location, the framework employs a clustering-based user selection mechanism to reduce inter-client data correlation and enhance label diversity across training rounds. Experimental results on FMNIST and CIFAR-10 show that the proposed approach improves model accuracy, stability, and convergence in non-IID scenarios while remaining comparable under IID settings, with particularly pronounced gains when few users are selected per round.
📝 Abstract
Federated learning (FL) enables collaborative model training without sharing raw user data, but conventional simulations often rely on unrealistic data partitioning, and current user selection methods ignore data correlation among users. To address these challenges, this paper proposes a metadata-driven FL framework. We first introduce a novel data partition model based on a homogeneous Poisson point process (HPPP), capturing both heterogeneity in data quantity and natural overlap among user datasets. Building on this model, we develop a clustering-based user selection strategy that leverages metadata, such as user location, to reduce data correlation and enhance label diversity across training rounds. Extensive experiments on FMNIST and CIFAR-10 demonstrate that the proposed framework improves model performance, stability, and convergence in non-IID scenarios, while maintaining comparable performance under IID settings. Furthermore, the method shows pronounced advantages when the number of selected users per round is small. These findings highlight the framework's potential for enhancing FL performance in realistic deployments and guiding future standardization.
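The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how HPPP points map to datasets or which clustering algorithm is used. The sketch assumes that users are placed by an HPPP on a square region, that each data sample also has a random position, that a user holds every sample within a fixed radius (which yields unequal dataset sizes and overlapping datasets), and that selection picks one user per k-means cluster of user locations. All function names and parameters (`sample_hppp`, `partition_by_coverage`, `radius`, and so on) are hypothetical.

```python
import numpy as np


def sample_hppp(intensity, side, rng):
    """Draw one realization of a homogeneous Poisson point process on [0, side]^2."""
    # The number of points is Poisson with mean intensity * area ...
    n_points = rng.poisson(intensity * side * side)
    # ... and, given that count, the points are i.i.d. uniform over the region.
    return rng.uniform(0.0, side, size=(n_points, 2))


def partition_by_coverage(user_xy, sample_xy, radius):
    """ASSUMED rule: each user holds every sample within `radius` of its location.

    Samples may fall in several users' disks (overlap) or in none of them.
    """
    dists = np.linalg.norm(user_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    return [np.flatnonzero(row <= radius) for row in dists]


def _nearest(points, centers):
    """Index of the closest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means on 2-D metadata (requires k <= len(points))."""
    # Fancy indexing copies, so updating centers never modifies `points`.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = _nearest(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's center where it is
                centers[j] = members.mean(axis=0)
    return _nearest(points, centers)


def select_one_per_cluster(labels, rng):
    """ASSUMED selection rule: one uniformly random user from each non-empty cluster."""
    return [int(rng.choice(np.flatnonzero(labels == c))) for c in np.unique(labels)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    side = 10.0
    users = sample_hppp(intensity=0.5, side=side, rng=rng)  # about 50 users on average
    samples = rng.uniform(0.0, side, size=(2000, 2))        # assumed sample positions
    parts = partition_by_coverage(users, samples, radius=1.5)
    labels = kmeans_labels(users, k=min(5, len(users)), rng=rng)
    chosen = select_one_per_cluster(labels, rng)
    print("users:", len(users), "dataset sizes:", [len(p) for p in parts][:10])
    print("selected this round:", chosen)
```

Under these assumptions, users that are close together share many samples, so drawing the round's participants from different location clusters is one plausible way to reduce correlation between the selected datasets. Whether this matches the paper's exact partition and selection rules cannot be determined from the abstract alone.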