Collaborative Adaptive Curriculum for Progressive Knowledge Distillation

📅 2026-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes Federated Adaptive Progressive Distillation (FAPD), a novel framework addressing the mismatch between high-dimensional teacher knowledge and the heterogeneous learning capacities of edge clients in federated learning. FAPD introduces adaptive curriculum learning into federated knowledge distillation for the first time, constructing a knowledge hierarchy by decomposing teacher features via PCA and dynamically regulating the complexity and pacing of knowledge transfer through a temporal consensus window. By leveraging dimension-adaptive projection matrices and a global consensus mechanism, FAPD accommodates device heterogeneity while improving convergence efficiency. Experiments show that FAPD improves accuracy by 3.64% over FedAvg on CIFAR-10 and converges 2× faster; under extreme non-IID settings (α=0.1), it outperforms FedAvg by over 4.5%.
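
The PCA-based knowledge hierarchy and dimension-adaptive projection described above can be sketched with plain NumPy. This is an illustrative reconstruction from the summary, not the authors' released code; the function names and interfaces are assumptions.

```python
# Hypothetical sketch of the PCA-based knowledge hierarchy; not the
# authors' implementation. Teacher features are decomposed into principal
# components ordered by variance, and a client receives only the top-k
# dimensions matching its current curriculum level.
import numpy as np

def build_knowledge_hierarchy(teacher_features: np.ndarray):
    """Decompose teacher features (n_samples, d) into a principal-component
    basis ordered by explained variance (the curriculum's knowledge levels)."""
    mean = teacher_features.mean(axis=0)
    centered = teacher_features - mean
    # SVD of the centered features yields the PCA axes; rows of vt are
    # principal directions sorted by singular value (variance contribution).
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained_variance = singular_values ** 2 / (len(centered) - 1)
    return mean, vt.T, explained_variance  # basis has shape (d, r)

def project_for_client(features: np.ndarray, mean: np.ndarray,
                       basis: np.ndarray, k: int) -> np.ndarray:
    """Dimension-adaptive projection: keep only the first k curriculum
    levels (top-k principal components) for a given client."""
    return (features - mean) @ basis[:, :k]  # (n_samples, k)
```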

📝 Abstract
Recent advances in collaborative knowledge distillation have demonstrated cutting-edge performance in resource-constrained distributed multimedia learning scenarios. However, achieving such competitiveness requires addressing a fundamental mismatch between high-dimensional teacher knowledge complexity and heterogeneous client learning capacities, which currently prohibits deployment in edge-based visual analytics systems. Drawing inspiration from curriculum learning principles, we introduce Federated Adaptive Progressive Distillation (FAPD), a consensus-driven framework that orchestrates adaptive knowledge transfer. FAPD hierarchically decomposes teacher features via PCA-based structuring, extracting principal components ordered by variance contribution to establish a natural visual knowledge hierarchy. Clients progressively receive knowledge of increasing complexity through dimension-adaptive projection matrices. Meanwhile, the server monitors network-wide learning stability by tracking global accuracy fluctuations across a temporal consensus window, advancing curriculum dimensionality only when collective consensus emerges. Consequently, FAPD provably adapts the pace of knowledge transfer while achieving superior convergence over fixed-complexity approaches. Extensive experiments on three datasets validate FAPD's effectiveness: it attains a 3.64% accuracy improvement over FedAvg on CIFAR-10, demonstrates 2× faster convergence, and maintains robust performance under extreme data heterogeneity (α=0.1), outperforming baselines by over 4.5%.
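
To make the pacing mechanism concrete, below is a minimal sketch of a temporal consensus window, assuming the server advances curriculum dimensionality when global accuracy fluctuation within the window falls below a threshold. The window size, threshold, and step size are illustrative assumptions; the abstract does not specify them.

```python
# Illustrative sketch of the temporal consensus window; the window length,
# fluctuation threshold, and dimensionality step are assumptions, not
# values from the paper.
from collections import deque

class ConsensusCurriculum:
    def __init__(self, start_dim: int, max_dim: int,
                 window: int = 5, eps: float = 0.005, step: int = 16):
        self.dim = start_dim              # current curriculum dimensionality k
        self.max_dim = max_dim            # full teacher feature dimension
        self.history = deque(maxlen=window)
        self.eps = eps                    # tolerated accuracy fluctuation
        self.step = step                  # dimensions added per advancement

    def update(self, global_accuracy: float) -> int:
        """Record one round's global accuracy; advance the curriculum only
        when the window is full and accuracy has stabilized (consensus)."""
        self.history.append(global_accuracy)
        if (len(self.history) == self.history.maxlen
                and max(self.history) - min(self.history) < self.eps):
            self.dim = min(self.dim + self.step, self.max_dim)
            self.history.clear()          # re-establish consensus at new level
        return self.dim
```

In a federated loop, the server would call `update(acc)` once per aggregation round and broadcast the returned dimensionality, which clients then use as `k` in the projection sketched earlier.
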
Problem

Research questions and friction points this paper is trying to address.

collaborative knowledge distillation
heterogeneous client capacities
edge visual analytics
knowledge complexity mismatch
distributed learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning
Knowledge Distillation
Curriculum Learning
PCA-based Structuring
Adaptive Projection
Jing Liu
University of British Columbia / Fudan University
anomaly detection, edge-cloud collaboration, domain generalization, activity recognition
Zhenchao Ma
The University of British Columbia
Han Yu
Fudan University
Bobo Ju
Shanghai Shentong Metro Co., Ltd.
Wenliang Yang
Fudan University
Chengfang Li
Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences
Bo Hu
Fudan University
signal processing, communication, image processing
Liang Song
Fudan University, University of Toronto (Adjunct)
AI, Communications, Networking, System Control, Signal Processing