🤖 AI Summary
To address the high computational and communication overhead of federated learning on edge devices, this paper proposes the first hybrid-precision federated learning framework that combines FP8 client-side training with a global FP32 server model. Methodologically, we design an edge-device-oriented 8-bit floating-point quantized training scheme, integrated with gradient/weight compression, that ensures compatibility between low-precision local updates and high-precision global aggregation. We also provide a rigorous convergence analysis proving that the framework converges under non-IID data distributions. Experiments across multiple models (ResNet, ViT) and benchmarks (CIFAR-10/100, FEMNIST) demonstrate that our approach reduces communication volume by at least 2.9× compared to an FP32 baseline without sacrificing model accuracy, thereby significantly improving the efficiency of federated learning on resource-constrained edge devices.
📝 Abstract
Recent work has shown that 8-bit floating point (FP8) can be used to train neural networks efficiently, with reduced computational overhead compared to training in FP32/FP16. In this work, we investigate the use of FP8 training in a federated learning context. This brings not only the usual benefits of FP8, which are desirable for on-device training at the edge, but also reduces client-server communication costs through significant weight compression. We present a novel method for combining FP8 client training with a global FP32 server model and provide a convergence analysis. Experiments with various machine learning models and datasets show that our method consistently yields communication reductions of at least 2.9x across a variety of tasks and models compared to an FP32 baseline.
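The hybrid-precision loop described above (clients quantize their weights to FP8 for upload, the server dequantizes and averages them in full precision) can be sketched in plain Python. This is a minimal illustration under assumptions not taken from the paper: it simulates FP8 E4M3 rounding numerically (saturating at the E4M3 maximum of 448, with no NaN/Inf handling) and uses simple FedAvg-style averaging; the paper's actual quantization and compression scheme may differ.

```python
import math

def quantize_fp8_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3-representable value.
    Simplified illustration: saturates at the E4M3 max normal (+/-448),
    clamps the exponent to cover subnormals, ignores NaN/Inf."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)                  # saturate at E4M3 max normal
    e = max(math.floor(math.log2(mag)), -6)   # clamp to smallest normal exponent
    step = 2.0 ** (e - 3)                     # 3 mantissa bits -> grid spacing 2^(e-3)
    return sign * round(mag / step) * step

def client_upload_fp8(weights):
    """Client side: after local low-precision training, upload FP8-quantized
    weights (1 byte per parameter instead of 4 for FP32)."""
    return [quantize_fp8_e4m3(w) for w in weights]

def server_aggregate_fp32(client_uploads):
    """Server side: dequantize the FP8 uploads and average them in full
    FP32 precision to update the global master model."""
    n = len(client_uploads)
    return [sum(col) / n for col in zip(*client_uploads)]
```

Sending one byte per parameter instead of four gives an idealized 4x communication reduction; the reported figure of at least 2.9x presumably reflects practical overheads (e.g. quantization metadata or tensors kept in higher precision), though the abstract does not break this down.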