🤖 AI Summary
Federated learning faces the dual challenges of high communication overhead and accuracy degradation from quantization error. To address both, the authors propose FedBiF, a novel framework that integrates multi-bit parameter quantization directly *into* local training. In FedBiF, the server distributes quantized model parameters, and each client updates only *a single bit* of each parameter's multi-bit representation per round while freezing the remaining bits. This in-training, bit-by-bit strategy avoids the quantization error that post-training quantization introduces into trained parameters, promotes sparsity in the resulting models, and enables very low-bandwidth communication (1 bpp uplink, 3 bpp downlink). Extensive experiments on five benchmark datasets under both IID and Non-IID settings show that FedBiF matches FedAvg's accuracy while sharply reducing communication volume.
📝 Abstract
Federated learning (FL) is an emerging distributed machine learning paradigm that enables collaborative model training without sharing local data. Despite its advantages, FL suffers from substantial communication overhead, which can affect training efficiency. Recent efforts have mitigated this issue by quantizing model updates to reduce communication costs. However, most existing methods apply quantization only after local training, introducing quantization errors into the trained parameters and potentially degrading model accuracy. In this paper, we propose Federated Bit Freezing (FedBiF), a novel FL framework that directly learns quantized model parameters during local training. In each communication round, the server first quantizes the model parameters and transmits them to the clients. FedBiF then allows each client to update only a single bit of the multi-bit parameter representation, freezing the remaining bits. This bit-by-bit update strategy reduces each parameter update to one bit while maintaining high precision in parameter representation. Extensive experiments are conducted on five widely used datasets under both IID and Non-IID settings. The results demonstrate that FedBiF not only achieves superior communication compression but also promotes sparsity in the resulting models. Notably, FedBiF attains accuracy comparable to FedAvg, even when using only 1 bit-per-parameter (bpp) for uplink and 3 bpp for downlink communication. The code is available at https://github.com/Leopold1423/fedbif-tpds25.
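To make the bit-freezing idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of the core mechanism: each parameter is held as a multi-bit integer code, and in a given round only one designated bit position may change, so a client's update for that parameter is exactly one bit. The quantization range, bit width, and greedy per-bit update rule below are illustrative assumptions.

```python
# Illustrative sketch of FedBiF-style bit freezing (hypothetical, simplified).
# Each parameter is a BITS-bit unsigned code; per round, only one bit position
# is trainable, so the uplink cost is 1 bit per parameter.

BITS = 3
LO, HI = -1.0, 1.0           # assumed uniform quantization range
LEVELS = 2**BITS - 1

def quantize(w):
    """Map a float in [LO, HI] to a BITS-bit integer code (uniform, assumed)."""
    w = min(max(w, LO), HI)
    return round((w - LO) / (HI - LO) * LEVELS)

def dequantize(code):
    """Map a BITS-bit integer code back to a float in [LO, HI]."""
    return LO + code / LEVELS * (HI - LO)

def update_one_bit(code, target, bit_pos):
    """Flip only `bit_pos` if that brings the dequantized value closer to
    `target`; all other bits stay frozen. Returns the new code and the
    single uplink bit (1 = flipped, 0 = unchanged)."""
    flipped = code ^ (1 << bit_pos)
    if abs(dequantize(flipped) - target) < abs(dequantize(code) - target):
        return flipped, 1
    return code, 0

# One simulated round per bit position: the error to `target` never increases,
# since a flip is kept only when it helps.
target = 0.8
code = quantize(0.0)                      # server sends the quantized model
for r in range(BITS):
    code, sent_bit = update_one_bit(code, target, bit_pos=r % BITS)
print(dequantize(code))                   # prints 1.0
```

This toy version replaces gradient-based training with a greedy per-parameter rule purely for readability; the paper's actual method learns the bits during local SGD. Note the greedy rule need not reach the closest representable value, which illustrates why the choice of which bit to unfreeze each round matters.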