🤖 AI Summary
Existing federated learning (FL) algorithms suffer significant performance degradation under geographically induced non-IID data, such as regional dialects or localized traffic patterns, yet mainstream FL evaluation relies predominantly on random non-IID partitioning that ignores this spatial structure. To address this gap, the authors propose ProFed, an FL benchmark designed for geography-driven non-IID settings. ProFed simulates regional data skew at multiple granularities and standardizes evaluation across well-known datasets, including MNIST, FashionMNIST, CIFAR-10, and CIFAR-100. Building on pathological and quantity-based partitioning schemes from the literature, it adds region-aware splits and a reproducible data partitioning toolkit, improving the comparability and generalizability assessment of FL algorithms in realistic edge and urban computing scenarios.
📝 Abstract
In recent years, federated learning (FL) has gained significant attention within the machine learning community. Although various FL algorithms have been proposed in the literature, their performance often degrades when data across clients is non-independently and identically distributed (non-IID). This skewness in data distribution often emerges from geographic patterns, with notable examples including regional linguistic variations in text data or localized traffic patterns in urban environments. Such scenarios result in IID data within specific regions but non-IID data across regions. However, existing FL algorithms are typically evaluated by randomly splitting non-IID data across devices, disregarding their spatial distribution. To address this gap, we introduce ProFed, a benchmark that simulates data splits with varying degrees of skewness across different regions. We incorporate several skewness methods from the literature and apply them to well-known datasets, including MNIST, FashionMNIST, CIFAR-10, and CIFAR-100. Our goal is to provide researchers with a standardized framework to evaluate FL algorithms more effectively and consistently against established baselines.
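The core idea (IID within a region, non-IID across regions) can be illustrated with a minimal partitioning sketch. This is an assumption-laden toy, not ProFed's actual API: every name below (`spatial_noniid_partition`, the `skew` knob, the round-robin class-to-region "home" assignment) is invented for illustration. Each class is anchored to a home region, a `skew` fraction of its samples stay there (a pathological-style label skew across regions), and each region's pool is dealt round-robin to its clients so clients within a region see the same distribution.

```python
import random
from collections import defaultdict

def spatial_noniid_partition(labels, num_regions, clients_per_region,
                             skew=0.8, seed=0):
    """Toy geography-driven split (illustrative, not ProFed's API).

    Returns {(region, client): [sample indices]} such that clients in
    the same region share a label distribution (IID within a region),
    while regions favor different classes (non-IID across regions).
    `skew` in [0, 1] is the fraction of a class's samples kept in its
    "home" region; the remainder spill to uniformly random regions.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    # Assign each class a home region round-robin (pathological-style skew).
    home = {c: i % num_regions for i, c in enumerate(classes)}

    # Route each sample: stay home with probability `skew`, else spill.
    region_pools = defaultdict(list)
    for idx, y in enumerate(labels):
        if rng.random() < skew:
            region_pools[home[y]].append(idx)
        else:
            region_pools[rng.randrange(num_regions)].append(idx)

    # Deal each region's pool round-robin so its clients are IID.
    clients = {}
    for r in range(num_regions):
        pool = region_pools[r]
        rng.shuffle(pool)
        for k in range(clients_per_region):
            clients[(r, k)] = pool[k::clients_per_region]
    return clients
```

Setting `skew=1.0` recovers a hard pathological split (each region sees only its home classes), while `skew` near `1/num_regions` approaches a uniform random partition, giving the "varying degrees of skewness" the abstract describes.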