🤖 AI Summary
To address model bias and slow convergence caused by non-IID data in federated learning, this paper proposes a hierarchical knowledge structuring framework. It introduces a novel unsupervised bottom-up clustering method that constructs multi-granularity codebooks in the logit space, effectively decoupling sample-level personalized representations from class-level global knowledge. Leveraging these codebooks, the framework performs multi-granularity knowledge distillation jointly optimized with supervision and generalization constraints, thereby synergistically enhancing both personalization and generalization. Extensive experiments across multiple benchmark datasets and model architectures demonstrate an average 5.2% improvement in local accuracy, a 31% acceleration in global convergence, and up to a 37% increase in knowledge-sharing efficiency. Moreover, the method exhibits significantly superior robustness and generalization performance compared to mainstream approaches including FedAvg and FedProx.
📝 Abstract
Federated learning enables collaborative model training across distributed entities while maintaining individual data privacy. A key challenge in federated learning is balancing the personalization of models for local clients with generalization for the global model. Recent efforts leverage logit-based knowledge aggregation and distillation to overcome these issues. However, due to the non-IID nature of data across diverse clients and the imbalance in the client's data distribution, directly aggregating the logits often produces biased knowledge that fails to apply to individual clients and obstructs the convergence of local training. To solve this issue, we propose a Hierarchical Knowledge Structuring (HKS) framework that formulates sample logits into a multi-granularity codebook to represent logits from personalized per-sample insights to globalized per-class knowledge. The unsupervised bottom-up clustering method is leveraged to enable the global server to provide multi-granularity responses to local clients. These responses allow local training to integrate supervised learning objectives with global generalization constraints, which results in more robust representations and improved knowledge sharing in subsequent training rounds. The proposed framework's effectiveness is validated across various benchmarks and model architectures.