🤖 AI Summary
In resource-constrained heterogeneous federated learning (FL), clients employing mixed quantization levels suffer from model divergence, aggregation bias, and client drift. To address these issues, this paper proposes a weight-shift-based statistical alignment mechanism. The method introduces a lightweight, retraining-free weight shifting strategy with zero additional communication overhead: the server aligns the weight distributions of locally trained models, quantized at varying bit-widths, by matching their first- and second-order statistics. Crucially, the approach is compatible with mainstream FL optimizers (e.g., FedAvg, FedOpt) without architectural or protocol modifications. Extensive experiments on benchmark datasets (CIFAR-10/100, Tiny-ImageNet, FEMNIST) demonstrate significantly improved convergence speed and final accuracy, effectively eliminating performance degradation induced by mixed quantization. The solution offers a scalable, low-overhead, and practical framework for efficient FL on heterogeneous edge devices.
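To make the mechanism concrete, here is a minimal sketch of statistics-matched aggregation. It assumes "weight shifting" means renormalizing each low-bit client's weight vector so its mean and standard deviation match a full-precision reference before the usual FedAvg weighted average; the paper's exact shift rule, reference choice, and per-layer granularity may differ, and the function names (`shift_weights`, `fedavg_with_shift`) are illustrative, not from the paper.

```python
import numpy as np

def shift_weights(w_q: np.ndarray, w_ref: np.ndarray) -> np.ndarray:
    """Shift/rescale a quantized weight vector so its first- and
    second-order statistics (mean, std) match a reference vector.
    Hypothetical sketch of the statistical-matching step."""
    mu_q, sigma_q = w_q.mean(), w_q.std()
    mu_r, sigma_r = w_ref.mean(), w_ref.std()
    return (w_q - mu_q) / (sigma_q + 1e-12) * sigma_r + mu_r

def fedavg_with_shift(client_weights, client_bits, client_sizes):
    """FedAvg over mixed-precision clients, aligning low-bit models
    to the statistics of the highest-precision clients first."""
    max_bits = max(client_bits)
    # Reference statistics come from the highest-precision clients
    # (an assumption for this sketch).
    ref = np.mean(
        [w for w, b in zip(client_weights, client_bits) if b == max_bits],
        axis=0,
    )
    aligned = [
        w if b == max_bits else shift_weights(w, ref)
        for w, b in zip(client_weights, client_bits)
    ]
    total = sum(client_sizes)
    return sum((n / total) * w for w, n in zip(aligned, client_sizes))
```

Because the shift is an affine correction computed server-side from model statistics alone, it adds no retraining and no extra communication, which is why it composes with any FedAvg-style optimizer.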
📝 Abstract
Federated Learning (FL) commonly relies on a central server to coordinate training across distributed clients. While effective, this paradigm suffers from significant communication overhead, impacting overall training efficiency. To mitigate this, prior work has explored compression techniques such as quantization. However, in heterogeneous FL settings, clients may employ different quantization levels based on their hardware or network constraints, necessitating a mixed-precision aggregation process at the server. This introduces additional challenges, exacerbating client drift and leading to performance degradation. In this work, we propose FedShift, a novel aggregation methodology designed to mitigate performance degradation in FL scenarios with mixed quantization levels. FedShift employs a statistical matching mechanism based on weight shifting to align mixed-precision models, thereby reducing model divergence and addressing quantization-induced bias. Our approach functions as an add-on to existing FL optimization algorithms, enhancing their robustness and improving convergence. Empirical results demonstrate that FedShift effectively mitigates the negative impact of mixed-precision aggregation, yielding superior performance across various FL benchmarks.