AI Summary
This work addresses the limited computational efficiency and spatial deployment flexibility of existing semantic communication systems, which hinder effective transmission of task-relevant semantics. To overcome these challenges, the authors propose a UAV-enabled distributed electromagnetic neural network architecture, in which multiple unmanned aerial vehicles equipped with stacked intelligent metasurfaces (SIMs) collaborate with a ground receiver to encode image semantics directly in the wave domain for downstream recognition tasks. A temperature-adaptive gradient optimization algorithm is used to train the network, mitigating gradient vanishing and improving training stability. Numerical simulation results show that the proposed approach achieves an average 8% improvement in image recognition accuracy over single-SIM baselines across multiple datasets, supporting its efficacy in task-oriented semantic communication.
Abstract
Semantic communication (SemCom) is a promising paradigm that prioritizes the transmission of task-relevant information, thereby enabling higher communication efficiency than traditional bit-centric systems. However, most existing SemCom systems face critical limitations in computational efficiency and spatial flexibility. To overcome these limitations, we propose a novel unmanned aerial vehicle (UAV)-enabled distributed electromagnetic neural network (EMNN) for a task-oriented SemCom system. Specifically, the proposed distributed EMNN consists of multiple UAV-mounted stacked intelligent metasurfaces (SIMs) and a ground receiving station (GRS): the SIMs collaboratively encode image semantics in the wave domain, and the GRS decodes them from the received power distribution. Moreover, we train the distributed EMNN with a temperature-adaptive gradient optimization algorithm, which mitigates gradient vanishing and enhances learning stability. Finally, numerical simulation results demonstrate the effectiveness of the distributed EMNN in SemCom for image recognition tasks, achieving an average $8\%$ accuracy improvement over the single-SIM baseline across multiple datasets.
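The abstract does not spell out how power-based decoding or the temperature adaptation works, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes the GRS has one detection region per class, that the predicted class is the region with the highest received power, and that training uses a softmax over those powers. The rule in `adaptive_temperature` (temperature equal to the standard deviation of the powers) and all function names are hypothetical and chosen purely for illustration.

```python
import math


def received_power(amplitudes):
    """Power at each detection region: squared magnitude of the complex field."""
    return [abs(a) ** 2 for a in amplitudes]


def decode(amplitudes):
    """Assumed decoding rule: pick the region (class index) with the highest power."""
    powers = received_power(amplitudes)
    return max(range(len(powers)), key=powers.__getitem__)


def softmax_with_temperature(values, temperature):
    """Numerically stable softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_temperature(values, floor=1e-12):
    """Hypothetical adaptation rule: match the temperature to the spread
    (population standard deviation) of the received powers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return max(math.sqrt(variance), floor)


# Illustrative received fields at three detection regions (made-up values,
# small in magnitude to mimic a strongly attenuated link).
amplitudes = [1e-3 + 0j, 2e-3j, 0.5e-3 - 0.5e-3j]
powers = received_power(amplitudes)

predicted_class = decode(amplitudes)
fixed = softmax_with_temperature(powers, 1.0)
adaptive = softmax_with_temperature(powers, adaptive_temperature(powers))
```

With a fixed temperature of 1, the tiny received powers produce an almost uniform softmax, so a cross-entropy loss carries very little signal back to the metasurface phases. Scaling the temperature to the spread of the powers yields a clearly peaked distribution for the same inputs, which is the kind of effect a temperature-adaptive scheme would aim for under the assumptions above.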