Robust Clustered Federated Learning for Heterogeneous High-dimensional Data

📅 2025-10-12
🤖 AI Summary
This paper addresses two coupled challenges in federated learning: the coexistence of subgroup structure with intra-group heterogeneity, and high-dimensional heavy-tailed data. Methodologically, the authors propose an adaptive clustered federated learning framework that jointly performs subgroup partitioning and sparse parameter estimation, integrating the Huber loss with iterative hard thresholding (IHT) compression inside a grouped federated architecture so as to capture between-group discrepancies while enabling within-group knowledge sharing. Theoretically, they establish the first non-asymptotic error bound and provide recovery guarantees for the underlying clustering structure. Empirically, the method improves convergence speed, parameter estimation accuracy, and clustering fidelity on both synthetic and real-world datasets, while remaining robust to heavy-tailed noise and computationally efficient.

📝 Abstract
Federated learning has attracted significant attention as a privacy-preserving framework for training personalised models on multi-source heterogeneous data. However, most existing approaches are unable to handle scenarios where subgroup structures coexist alongside within-group heterogeneity. In this paper, we propose a federated learning algorithm that addresses general heterogeneity through adaptive clustering. Specifically, our method partitions tasks into subgroups to address substantial between-group differences while enabling efficient information sharing among similar tasks within each group. Furthermore, we integrate the Huber loss and Iterative Hard Thresholding (IHT) to tackle the challenges of high dimensionality and heavy-tailed distributions. Theoretically, we establish convergence guarantees, derive non-asymptotic error bounds, and provide recovery guarantees for the latent cluster structure. Extensive simulation studies and real-data applications further demonstrate the effectiveness and adaptability of our approach.
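As a rough illustration of the adaptive-clustering idea in the abstract, the sketch below implements one IFCA-style communication round for a linear model with squared loss: each client joins the cluster whose model best fits its local data, takes a local gradient step, and the server averages the updates within each cluster. All names and hyperparameters here are illustrative assumptions; the authors' actual update rule additionally uses the Huber loss and IHT.

```python
import numpy as np

def cluster_assign(client_X, client_y, centers):
    # Assign the client to the cluster model with the smallest local
    # empirical loss (squared error, for simplicity).
    losses = [np.mean((client_y - client_X @ c) ** 2) for c in centers]
    return int(np.argmin(losses))

def clustered_fed_round(clients, centers, lr=0.1):
    # One communication round: every client picks its best-fitting
    # cluster model, takes a local gradient step on it, and the server
    # averages the resulting updates within each cluster.
    K = len(centers)
    sums = [np.zeros_like(c) for c in centers]
    counts = [0] * K
    for X, y in clients:
        k = cluster_assign(X, y, centers)
        grad = -X.T @ (y - X @ centers[k]) / len(y)
        sums[k] += centers[k] - lr * grad
        counts[k] += 1
    # Clusters with no members keep their previous model.
    return [sums[k] / counts[k] if counts[k] else centers[k]
            for k in range(K)]
```

Repeating such rounds lets clients with similar underlying parameters gravitate to the same cluster model, which is the "information sharing among similar tasks" the abstract describes.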
Problem

Research questions and friction points this paper is trying to address.

Subgroup structures coexist with within-group heterogeneity
Data are high-dimensional and heavy-tailed
Similar tasks need efficient information sharing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive clustering partitions tasks into subgroups
Huber loss handles heavy-tailed noise; IHT enforces sparsity in high dimensions
Within-group aggregation enables information sharing among similar tasks
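The two robustness ingredients above can be sketched for a sparse linear model: the Huber gradient caps the influence of heavy-tailed residuals, and a hard-thresholding step after each gradient update keeps only the s largest coefficients. This is a minimal single-machine sketch, not the paper's full federated algorithm; delta, lr, and n_iter are illustrative choices.

```python
import numpy as np

def huber_grad(residual, delta=1.345):
    # Derivative of the Huber loss w.r.t. the residual: linear inside
    # [-delta, delta], constant outside, so extreme residuals from
    # heavy-tailed noise exert only bounded influence.
    return np.clip(residual, -delta, delta)

def hard_threshold(beta, s):
    # IHT compression: keep only the s largest-magnitude coordinates.
    out = np.zeros_like(beta)
    idx = np.argsort(np.abs(beta))[-s:]
    out[idx] = beta[idx]
    return out

def huber_iht(X, y, s, delta=1.345, lr=0.1, n_iter=200):
    # Gradient descent on the Huber loss, hard-thresholded after every
    # step, yielding an s-sparse robust estimate.
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        grad = -X.T @ huber_grad(r, delta) / n
        beta = hard_threshold(beta - lr * grad, s)
    return beta
```

Combining the clipped gradient with hard thresholding is what lets the estimator tolerate heavy-tailed errors while still exploiting sparsity in high dimensions.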
Changxin Yang, Department of Statistics and Data Science, Fudan University
Zhongyi Zhu, Department of Statistics and Data Science, Fudan University
Heng Lian, Department of Mathematics, City University of Hong Kong
Learning theory, high-dimensional statistics, functional data analysis, non-/semi-parametric statistics