🤖 AI Summary
To address the challenge of balancing model utility against communication and computational overhead for resource-constrained IoT devices in federated learning (FL), this paper proposes an adaptive model decomposition and quantization FL framework. The method introduces a device-aware joint decomposition–quantization co-design mechanism that dynamically allocates sub-network size and quantization bit-width according to each client's on-device computational capacity. It employs server-side unified quantization to relieve clients of quantization computation and incorporates public-data regularization to enhance cross-client knowledge sharing. Theoretical analysis establishes global convergence guarantees. Experiments demonstrate that, compared to baseline FL systems, the framework achieves an 8.43× speedup in quantization latency, a 1.5× reduction in on-device computation time, and a 1.36× reduction in end-to-end training time. These improvements significantly enhance practicality and scalability in heterogeneous edge environments.
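The core trade-off above (smaller sub-network for weaker devices, compensated by a higher bit-width) can be sketched as a simple allocation rule. This is an illustrative sketch, not the paper's actual policy: the function name `allocate`, the normalized capacity score, and the linear interpolation ranges are all assumptions for demonstration.

```python
def allocate(capacity: float,
             min_frac: float = 0.25, max_frac: float = 1.0,
             min_bits: int = 4, max_bits: int = 16) -> tuple[float, int]:
    """Map a normalized device capacity in [0, 1] to a (sub-network
    fraction, quantization bit-width) pair.

    Hypothetical rule: a weaker device receives a smaller sub-network
    (lower compute/communication overhead) but a larger bit-width
    (higher per-parameter fidelity), and vice versa.
    """
    capacity = min(max(capacity, 0.0), 1.0)
    # Sub-network fraction grows linearly with capacity.
    frac = min_frac + capacity * (max_frac - min_frac)
    # Bit-width shrinks linearly as capacity grows.
    bits = round(max_bits - capacity * (max_bits - min_bits))
    return frac, bits
```

For example, a device with `capacity=0.0` would receive a quarter-size sub-network quantized at 16 bits, while a device with `capacity=1.0` would receive the full network at 4 bits.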
📝 Abstract
Federated Learning (FL) allows collaborative training among multiple devices without data sharing, thus enabling privacy-sensitive applications on mobile or Internet of Things (IoT) devices, such as mobile health and asset tracking. However, designing an FL system with good model utility that works with low computation/communication overhead on heterogeneous, resource-constrained mobile/IoT devices is challenging. To address this problem, this paper proposes FedX, a novel adaptive model decomposition and quantization FL system for IoT. To balance utility with resource constraints on IoT devices, FedX decomposes a global FL model into different sub-networks with adaptive numbers of quantization bits for different devices. The key idea is that a device with fewer resources receives a smaller sub-network for lower overhead but a larger number of quantization bits for higher model utility, and vice versa. The quantization operations in FedX are performed at the server to reduce the computational load on devices. FedX iteratively minimizes the losses on the devices' local data and on the server's public data using quantized sub-networks under a regularization term, thus maximizing the benefit of combining FL with model quantization through knowledge sharing between the server and devices in a cost-effective training process. Extensive experiments show that FedX significantly reduces quantization time (by up to 8.43X), on-device computation time (by 1.5X), and total end-to-end training time (by 1.36X), compared with baseline FL systems. We theoretically guarantee global model convergence and empirically validate local model convergence, highlighting FedX's optimization efficiency.
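The server-side quantization step mentioned in the abstract can be illustrated with a minimal uniform (symmetric) quantizer. This is a generic sketch, assuming standard round-to-nearest uniform quantization; the paper's exact quantization scheme and the function name `quantize_uniform` are assumptions for illustration.

```python
def quantize_uniform(weights: list[float], bits: int) -> list[float]:
    """Uniformly quantize a list of weights to a signed `bits`-bit grid,
    then dequantize back to floats.

    In a FedX-style setup this would run at the server, so devices
    receive already-quantized sub-network weights and avoid the
    quantization computation themselves.
    """
    levels = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    w_max = max(abs(w) for w in weights) or 1.0  # avoid divide-by-zero
    scale = w_max / levels                  # width of one quantization step
    # Round each weight to the nearest grid point, then map back to floats.
    return [round(w / scale) * scale for w in weights]
```

With 8 bits the worst-case per-weight error is half a quantization step, i.e. at most `max(|w|) / 254`, which is why fewer bits (used for the larger sub-networks sent to stronger devices) trade fidelity for lower communication cost.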