🤖 AI Summary
To address the high computational and communication overhead of federated learning on edge devices, this paper proposes the first hybrid-precision federated learning framework that combines FP8 client-side training with a global FP32 server model. Methodologically, we design an edge-device-oriented 8-bit floating-point quantized training scheme, integrated with gradient/weight compression, that ensures compatibility between low-precision local updates and high-precision global aggregation. We also provide a rigorous convergence analysis proving that the framework converges under non-IID data distributions. Experiments across multiple models (ResNet, ViT) and benchmarks (CIFAR-10/100, FEMNIST) demonstrate that our approach reduces communication volume by at least 2.9× compared to an FP32 baseline without sacrificing model accuracy, thereby significantly improving the efficiency of federated learning on resource-constrained edge devices.
📝 Abstract
Recent work has shown that 8-bit floating point (FP8) can be used to train neural networks efficiently, with reduced computational overhead compared to training in FP32/FP16. In this work, we investigate the use of FP8 training in a federated learning context. This brings not only the usual benefits of FP8, which are desirable for on-device training at the edge, but also reduces client-server communication costs through significant weight compression. We present a novel method for combining FP8 client training with a global FP32 server model and provide a convergence analysis. Experiments with various machine learning models and datasets show that our method consistently yields communication reductions of at least 2.9x across a variety of tasks and models compared to an FP32 baseline.
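The hybrid-precision loop described above (clients quantize their weights to FP8 for upload, the server dequantizes and averages them in full precision) can be sketched in plain Python. This is a minimal illustration under assumptions not taken from the paper: it simulates FP8 E4M3 rounding numerically (saturating at the E4M3 maximum of 448, with no NaN/Inf handling) and uses simple FedAvg-style averaging; the paper's actual quantization and compression scheme may differ.

```python
import math

def quantize_fp8_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3-representable value.
    Simplified illustration: saturates at the E4M3 max normal (+/-448),
    clamps the exponent to cover subnormals, ignores NaN/Inf."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)                  # saturate at E4M3 max normal
    e = max(math.floor(math.log2(mag)), -6)   # clamp to smallest normal exponent
    step = 2.0 ** (e - 3)                     # 3 mantissa bits -> grid spacing 2^(e-3)
    return sign * round(mag / step) * step

def client_upload_fp8(weights):
    """Client side: after local low-precision training, upload FP8-quantized
    weights (1 byte per parameter instead of 4 for FP32)."""
    return [quantize_fp8_e4m3(w) for w in weights]

def server_aggregate_fp32(client_uploads):
    """Server side: dequantize the FP8 uploads and average them in full
    FP32 precision to update the global master model."""
    n = len(client_uploads)
    return [sum(col) / n for col in zip(*client_uploads)]
```

Sending one byte per parameter instead of four gives an idealized 4x communication reduction; the reported figure of at least 2.9x presumably reflects practical overheads (e.g. quantization metadata or tensors kept in higher precision), though the abstract does not break this down.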