Clustering-Based User Selection in Federated Learning: Metadata Exploitation for 3GPP Networks

📅 2026-01-15
🤖 AI Summary
This work addresses the limitations of existing federated learning simulations, which often rely on unrealistic data partitions and user selection strategies that neglect data correlations, thereby degrading model performance and convergence. To overcome these issues, the authors propose a metadata-aware federated learning framework that introduces a homogeneous Poisson point process (HPPP) to model more realistic non-IID data distributions. By leveraging metadata such as client location, the framework employs a clustering-based user selection mechanism to reduce inter-client data correlation and enhance label diversity. Experimental results on FMNIST and CIFAR-10 demonstrate that the proposed approach significantly improves model accuracy, stability, and convergence speed, with particularly pronounced gains in scenarios involving limited participant numbers.
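
The clustering-based selection idea can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the metadata is a 2-D user location, uses plain k-means with one cluster per selection slot, and draws one user at random from each cluster so that the selected users are geographically spread out. The function names `kmeans` and `select_users` are hypothetical.

```python
import numpy as np

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means; returns a cluster label for every point."""
    # Initialise centers at k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster empties out
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def select_users(user_xy, num_selected, rng):
    """Assumed selection rule: one randomly chosen user per location cluster."""
    labels = kmeans(user_xy, num_selected, rng)
    # If a cluster ends up empty, fewer than num_selected users are returned.
    return np.array([rng.choice(np.flatnonzero(labels == j))
                     for j in np.unique(labels)])

rng = np.random.default_rng(0)
user_xy = rng.uniform(0.0, 10.0, size=(30, 2))  # synthetic user locations
chosen = select_users(user_xy, num_selected=5, rng=rng)
```

Choosing users from different location clusters is one simple way to avoid picking neighbours whose datasets are likely to be correlated; the actual mechanism in the paper may differ in both the clustering method and the per-cluster choice.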

📝 Abstract
Federated learning (FL) enables collaborative model training without sharing raw user data, but conventional simulations often rely on unrealistic data partitioning, and current user selection methods ignore data correlation among users. To address these challenges, this paper proposes a metadata-driven FL framework. We first introduce a novel data partition model based on a homogeneous Poisson point process (HPPP), capturing both heterogeneity in data quantity and natural overlap among user datasets. Building on this model, we develop a clustering-based user selection strategy that leverages metadata, such as user location, to reduce data correlation and enhance label diversity across training rounds. Extensive experiments on FMNIST and CIFAR-10 demonstrate that the proposed framework improves model performance, stability, and convergence in non-IID scenarios, while maintaining comparable performance under IID settings. Furthermore, the method shows pronounced advantages when the number of selected users per round is small. These findings highlight the framework's potential for enhancing FL performance in realistic deployments and guiding future standardization.
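
As a rough illustration of how an HPPP can produce both unequal dataset sizes and overlapping datasets, the sketch below scatters data samples over a rectangle as a homogeneous Poisson point process and gives each user every sample inside a fixed coverage radius. The coverage-disc rule, the uniform user placement, and the names `hppp_points` and `partition_by_coverage` are assumptions made for this example; the abstract does not specify the paper's exact construction.

```python
import numpy as np

def hppp_points(intensity, width, height, rng):
    """Sample a homogeneous Poisson point process on [0, width] x [0, height]."""
    # The number of points is Poisson with mean intensity * area.
    n = rng.poisson(intensity * width * height)
    # Given the count, positions are independent and uniform over the region.
    return rng.uniform([0.0, 0.0], [width, height], size=(n, 2))

def partition_by_coverage(sample_xy, user_xy, radius):
    """Assumed rule: a user holds all samples within `radius` of its location."""
    # Pairwise distances, shape (num_users, num_samples).
    dists = np.linalg.norm(user_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    return [np.flatnonzero(row <= radius) for row in dists]

rng = np.random.default_rng(0)
samples = hppp_points(intensity=50.0, width=10.0, height=10.0, rng=rng)
users = rng.uniform(0.0, 10.0, size=(20, 2))  # synthetic user locations
parts = partition_by_coverage(samples, users, radius=1.5)

sizes = [len(p) for p in parts]  # dataset sizes vary from user to user
# Samples shared by the first two users (non-empty only if their discs overlap).
shared = np.intersect1d(parts[0], parts[1])
```

Under these assumptions, the number of samples in each disc is random, which gives heterogeneous data quantities, and users whose discs intersect hold some of the same samples, which gives overlap that grows as users get closer.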

Problem

Research questions and friction points this paper is trying to address.

Federated Learning
User Selection
Data Correlation
Non-IID
Metadata

Innovation

Methods, ideas, or system contributions that make the work stand out.

clustering-based user selection
metadata-driven federated learning
homogeneous Poisson point process
non-IID data
label diversity
Ce Zheng
Department of Broadband Communication, Pengcheng Laboratory, Shenzhen, China
Shiyao Ma
College of Computer and Information Science, Southwest University, Chongqing, China
Ke Zhang
Intelligent Software Laboratory, Waseda University, Tokyo, Japan
Chen Sun
Sony
knowledge distillation, federated learning, wireless for AI, dynamic spectrum, V2X
Wenqi Zhang
Zhejiang University
Language Model, Multimodal Learning, Embodied Agents