🤖 AI Summary
Real-time deep learning inference on resource-constrained edge devices faces severe energy and memory bottlenecks. To address this, we propose WISE, a wireless-broadcast-driven architecture that performs complex matrix-vector multiplication natively in the RF domain. It is the first design to jointly enable over-the-air distribution of model weights and direct execution of complex linear operations at the RF physical layer, thereby decoupling computation and storage across devices. Implemented on a software-defined radio platform, the design integrates over-the-air weight broadcasting, analog-domain complex multiply-accumulate (MAC) operations, and dedicated signal-processing circuitry. Evaluated on image classification, it achieves 95.7% accuracy at only 6.0 fJ/MAC per client and an energy efficiency of 165.8 TOPS/W, exceeding conventional digital implementations by more than two orders of magnitude and significantly advancing the energy-efficiency frontier for edge AI.
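As a quick consistency check on the headline numbers (assuming, as this accounting appears to, that each MAC is counted as a single operation; counting two operations per MAC would double the figure):

$$ \frac{1\ \text{MAC}}{6.0\ \text{fJ}} \approx 1.67\times10^{14}\ \text{ops/J} \approx 167\ \text{TOPS/W}, $$

which matches the reported 165.8 TOPS/W (equivalent to roughly 6.03 fJ per operation).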
📝 Abstract
Modern edge devices, such as cameras, drones, and Internet-of-Things nodes, rely on deep learning to enable a wide range of intelligent applications, including object recognition, environment perception, and autonomous navigation. However, deploying deep learning models directly on these often resource-constrained devices demands a significant memory footprint and substantial computational power for real-time inference under traditional digital computing architectures. In this paper, we present WISE, a novel computing architecture for wireless edge networks designed to overcome energy constraints in deep learning inference. WISE achieves this goal through two key innovations: disaggregated model access via wireless broadcasting, and in-physics computation of general complex-valued matrix-vector multiplications directly at radio frequency. Using a software-defined radio platform with model weights broadcast over the air, we demonstrate that WISE achieves 95.7% image classification accuracy with an ultra-low operation energy of 6.0 fJ/MAC per client, corresponding to a computation efficiency of 165.8 TOPS/W. This approach enables energy-efficient deep learning inference on wirelessly connected edge devices, improving efficiency by more than two orders of magnitude over traditional digital computing.
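To make the core operation concrete, below is a minimal, purely illustrative NumPy sketch of the complex-valued matrix-vector multiplication that WISE carries out in the analog RF domain; the variable names, shapes, and energy accounting are assumptions for illustration, not the paper's implementation or released code.

```python
import numpy as np

# Purely illustrative, digital-domain emulation of the core operation WISE
# performs in analog at RF: a complex-valued matrix-vector multiplication
# y = W @ x, where the rows of W correspond to wirelessly broadcast weight
# waveforms and x is the client's locally encoded input vector.
# All names, shapes, and the energy accounting here are assumptions.

rng = np.random.default_rng(0)

n_in, n_out = 784, 10                # e.g. a flattened image -> class scores
W = rng.normal(size=(n_out, n_in)) + 1j * rng.normal(size=(n_out, n_in))
x = rng.normal(size=n_in) + 1j * rng.normal(size=n_in)

# Each output element is a length-n_in complex multiply-accumulate (MAC);
# in WISE this accumulation happens in the analog/RF domain rather than
# in digital logic.
y = W @ x                            # shape (n_out,), n_out * n_in complex MACs

# Rough energy estimate at the reported 6.0 fJ/MAC figure.
E_PER_MAC_FJ = 6.0
n_macs = n_out * n_in
print(f"{n_macs} complex MACs -> ~{n_macs * E_PER_MAC_FJ / 1e3:.1f} pJ at 6.0 fJ/MAC")
```

In this toy example, the single layer amounts to 7,840 complex MACs, or roughly 47 pJ at the reported per-MAC energy; the point of the sketch is only to show which linear operation is being offloaded to the RF front end, not how the analog hardware realizes it.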