🤖 AI Summary
This work investigates how 5G, WiFi, and Ethernet interfaces affect federated learning (FL) performance and identifies 5G uplink model transfer time as a significant factor in FL convergence time. The authors build a 5G-NR standalone (SA) testbed from open-source and commercial off-the-shelf components (Raspberry Pi clients and a central server equipped with a software-defined radio and running O-RAN software) and pair the Flower FL framework with a custom instrumentation tool. Their measurements show that, on 5G, uplink model transmission accounts for roughly 23% of the duration of an average communication round when all edge devices participate, and that the 5G uplink time is 33.3× higher than on Ethernet and 17.8× higher than on WiFi. The results also suggest that 5G exacerbates the straggler effect. Key contributions include: (1) an end-to-end FL deployment on a 5G-NR SA testbed; (2) empirical evidence that uplink model transfer time is a significant contributor to FL convergence time on 5G; and (3) open-sourcing of the FL application, instrumentation tools, and testbed configuration for reproducibility.
📝 Abstract
Federated Learning (FL) on IoT devices is poised to benefit significantly from advances in NextG wireless. In this paper, we deploy an FL application on a 5G-NR Standalone (SA) testbed built from open-source and Commercial Off-the-Shelf (COTS) components. The testbed consists of a network of resource-constrained edge devices, namely Raspberry Pis, and a central server equipped with a Software Defined Radio (SDR) and running O-RAN software. The edge devices can also communicate with the server over WiFi or Ethernet instead of 5G. FL is deployed using the Flower FL framework, for which we developed a comprehensive instrumentation tool that collects and analyzes communication and machine learning performance metrics, including model aggregation time, downlink transmission time, training time, and uplink transmission time. Leveraging these measurements, we compare the FL application across three network interfaces: 5G, WiFi, and Ethernet. Our experimental results suggest that, on 5G, the uplink model transfer time is a significant factor in the convergence time of FL. In particular, the 5G uplink accounts for roughly 23% of the duration of an average communication round when all edge devices in our testbed participate. The 5G uplink time is 33.3x higher than that of Ethernet and 17.8x higher than that of WiFi. Our results also suggest that 5G exacerbates the well-known straggler effect. For reproducibility, we have open-sourced our FL application, instrumentation tools, and testbed configuration.
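The two headline figures in the abstract, the uplink share of an average round and the uplink ratio between interfaces, can be derived from the four per-round timings the instrumentation collects. The following is a minimal sketch of that arithmetic and not the authors' tool: the `RoundTiming` record, the function names, and all numeric values are hypothetical placeholders, and it assumes the four phases run back to back so that their sum is the round duration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RoundTiming:
    """Seconds spent in each phase of one FL round (hypothetical record layout)."""
    downlink_s: float      # server -> clients model transfer
    training_s: float      # local training on the edge devices
    uplink_s: float        # clients -> server model transfer
    aggregation_s: float   # server-side model aggregation

    @property
    def total_s(self) -> float:
        # Assumption: the phases are sequential, so the round lasts their sum.
        return self.downlink_s + self.training_s + self.uplink_s + self.aggregation_s


def mean_uplink_share(rounds):
    """Fraction of an average round spent on the uplink transfer."""
    return mean(r.uplink_s for r in rounds) / mean(r.total_s for r in rounds)


def uplink_ratio(rounds_a, rounds_b):
    """How many times longer the mean uplink is on interface A than on B."""
    return mean(r.uplink_s for r in rounds_a) / mean(r.uplink_s for r in rounds_b)


# Placeholder values for illustration only; these are NOT measurements from the paper.
slow_link = [RoundTiming(1.0, 6.0, 2.0, 1.0), RoundTiming(1.0, 5.0, 3.0, 1.0)]
fast_link = [RoundTiming(0.5, 6.0, 0.5, 1.0), RoundTiming(0.5, 5.0, 0.5, 1.0)]

print(f"uplink share on slow link: {mean_uplink_share(slow_link):.1%}")
print(f"uplink ratio slow/fast: {uplink_ratio(slow_link, fast_link):.1f}x")
```

Averaging the phase times before dividing weights long rounds more heavily than averaging per-round shares would; which definition the paper uses is not stated in the abstract, so treat this choice as an assumption.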