You Are Your Own Best Teacher: Achieving Centralized-level Performance in Federated Learning under Heterogeneous and Long-tailed Data

📅 2025-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
In federated learning, the compounded effect of local non-IID data and global long-tailed class distributions severely degrades model performance, falling short of centralized training. To address this, we propose a self-guided collaborative optimization framework featuring two novel components: Augmented Self-Distillation (ASD) and Distribution-Aware Logit Adjustment (DLA). Without requiring auxiliary data or additional models, our approach approximates the neural collapse-optimal representation. It integrates contrastive self-distillation–driven self-supervised representation learning, neural collapse–guided prototype optimization, and joint long-tail modeling with logit calibration. Evaluated across multiple benchmarks under global long-tailed settings, our method achieves state-of-the-art performance—improving over centralized logit adjustment by 5.4% in top-1 accuracy. Learned feature prototypes exhibit significantly higher alignment with neural collapse optima, while model drift is markedly suppressed and convergence accelerated.

📝 Abstract
Data heterogeneity, stemming from local non-IID data and global long-tailed distributions, is a major challenge in federated learning (FL), leading to significant performance gaps compared to centralized learning. Previous research identified poor representations and biased classifiers as the main problems and proposed neural-collapse-inspired synthetic simplex ETFs to pull representations closer to the neural collapse optima. However, we find that these neural-collapse-inspired methods are not strong enough to reach neural collapse and still leave a large gap to centralized training. In this paper, we rethink this issue from a self-bootstrap perspective and propose FedYoYo (You Are Your Own Best Teacher), introducing Augmented Self-bootstrap Distillation (ASD) to improve representation learning by distilling knowledge between weakly and strongly augmented local samples, without needing extra datasets or models. We further introduce Distribution-aware Logit Adjustment (DLA) to balance the self-bootstrap process and correct biased feature representations. FedYoYo nearly eliminates the performance gap, achieving centralized-level performance even under mixed heterogeneity. It enhances local representation learning, reducing model drift and improving convergence, with feature prototypes closer to neural collapse optimality. Extensive experiments show that FedYoYo achieves state-of-the-art results, even surpassing centralized logit adjustment methods by 5.4% under global long-tailed settings.
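The abstract describes ASD as distilling knowledge between weakly and strongly augmented views of the same local sample, with no auxiliary model. A minimal sketch of that idea follows; the function name, temperature, and the KL direction (weak view as teacher, strong view as student) are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def asd_loss(logits_weak, logits_strong, temperature=2.0):
    """Self-distillation loss between two augmented views of the same batch.

    The weakly augmented view provides soft targets (the "teacher");
    the strongly augmented view is the "student". In training, gradients
    would flow only through the student logits.
    """
    teacher = softmax(logits_weak / temperature)
    student = softmax(logits_strong / temperature)
    eps = 1e-12  # avoid log(0)
    kl = np.sum(teacher * (np.log(teacher + eps) - np.log(student + eps)), axis=-1)
    return float(kl.mean())
```

The loss is zero when the two views agree and positive otherwise, so minimizing it makes local representations invariant to augmentation strength, which is the self-bootstrap signal the paper leverages in place of an extra dataset or model.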
Problem

Research questions and friction points this paper is trying to address.

Addresses performance gaps in federated learning due to data heterogeneity.
Proposes FedYoYo to improve representation learning without extra datasets.
Achieves centralized-level performance under heterogeneous and long-tailed data.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Augmented Self-bootstrap Distillation for representation learning
Distribution-aware Logit Adjustment for bias correction
FedYoYo achieves centralized-level performance in FL
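The DLA component corrects for the long-tailed class distribution when producing logits. The paper's exact formulation is not reproduced here; the sketch below uses the standard logit-adjustment form (subtracting a scaled log class prior), with the function name and the scaling factor `tau` as illustrative assumptions:

```python
import numpy as np

def distribution_aware_adjust(logits, class_counts, tau=1.0):
    """Adjust logits by the (estimated) class prior.

    Subtracting tau * log(prior) penalizes head classes and boosts tail
    classes, so the prediction is no longer dominated by class frequency.
    """
    prior = np.asarray(class_counts, dtype=float)
    prior = prior / prior.sum()          # empirical class distribution
    return logits - tau * np.log(prior)  # tail classes get the largest boost
```

For example, under a long-tailed count vector like `[100, 10, 1]`, a tie in the raw logits resolves to the tail class after adjustment, which is the bias correction DLA aims for during the self-bootstrap process.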
Authors
Shanshan Yan (Xiamen University)
Zexi Li (Alibaba Group)
Chao Wu (Zhejiang University)
Meng Pang (Nanchang University)
Yang Lu (Xiamen University)
Yan Yan (Xiamen University)
Hanzi Wang (Xiamen University)