🤖 AI Summary
This work addresses the joint optimization of sensor-actuator pairing and communication-computing resource allocation across multiple SC3 (sensing-communication-computing-control) closed loops, which carry out autonomous tasks in hazardous environments envisioned for 6G networks. The joint problem is a mixed-integer nonlinear program. To solve it, the authors propose LOAC, a learning-optimization-integrated actor-critic framework: a deep neural network-based actor generates candidate pairings, an optimization-based critic allocates communication and computing resources for each candidate, and the critic's feedback iteratively refines the actor. Simulation results show that LOAC achieves near-optimal solutions with low computational complexity and significantly reduces the control cost of the SC3 closed loops.
📝 Abstract
In hazardous environments, sensors and actuators can be deployed to see and operate on behalf of humans, enabling safe and efficient task execution. Functioning as a neural center, the edge information hub (EIH), which integrates communication and computing capabilities, coordinates these sensors and actuators into sensing-communication-computing-control (SC3) closed loops to enable autonomous operations. From a system-level optimization perspective, this paper addresses the problem of joint sensor-actuator pairing and resource allocation across multiple SC3 closed loops. To tackle the resulting mixed-integer nonlinear programming problem, we develop a learning-optimization-integrated actor-critic (LOAC) framework. In this framework, a deep neural network-based actor generates pairing candidates, while an optimization-based critic subsequently allocates communication and computing resources. The actor is then iteratively refined through feedback from the critic. Simulation results demonstrate that the LOAC framework achieves near-optimal solutions with low computational complexity, offering significant performance gains in reducing control cost.
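The actor-critic loop described above can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the per-loop cost model `w[s][a] / share`, the single shared resource budget of 1, the table of logits standing in for the deep neural network, and all function names. The critic here has a closed-form solution (shares proportional to the square root of each weight), which keeps the example short; a real critic would solve the communication and computing allocation numerically.

```python
import itertools
import math
import random


def critic_allocate(weights):
    """Toy critic: minimise sum_k weights[k] / r[k] subject to sum_k r[k] = 1.

    By the Cauchy-Schwarz inequality the optimum is r[k] proportional to
    sqrt(weights[k]), with optimal cost (sum_k sqrt(weights[k])) ** 2.
    """
    roots = [math.sqrt(x) for x in weights]
    total = sum(roots)
    shares = [r / total for r in roots]
    cost = sum(x / s for x, s in zip(weights, shares))
    return shares, cost


def actor_sample(logits, rng):
    """Toy actor: sample a sensor -> actuator pairing, one sensor at a time."""
    free = list(range(len(logits)))
    pairing = []
    for row in logits:
        top = max(row[a] for a in free)  # subtract the max for numerical stability
        probs = [math.exp(row[a] - top) for a in free]
        a = rng.choices(free, weights=probs)[0]
        pairing.append(a)
        free.remove(a)  # each actuator serves exactly one sensor
    return pairing


def loac_sketch(w, iters=200, candidates=8, lr=0.5, seed=0):
    """Alternate between actor sampling and critic evaluation."""
    rng = random.Random(seed)
    n = len(w)
    logits = [[0.0] * n for _ in range(n)]
    best_cost, best_pairing = math.inf, None
    for _ in range(iters):
        scored = []
        for _ in range(candidates):
            pairing = actor_sample(logits, rng)
            _, cost = critic_allocate([w[s][a] for s, a in enumerate(pairing)])
            scored.append((cost, pairing))
        cost, pairing = min(scored)  # critic feedback: cheapest candidate wins
        if cost < best_cost:
            best_cost, best_pairing = cost, pairing
        for s, a in enumerate(pairing):
            logits[s][a] += lr  # make the winning pairing more likely next time
    return best_pairing, best_cost


def exhaustive_baseline(w):
    """Brute-force reference over all pairings; only feasible for tiny n."""
    n = len(w)
    cost, pairing = min(
        (critic_allocate([w[s][a] for s, a in enumerate(p)])[1], list(p))
        for p in itertools.permutations(range(n))
    )
    return pairing, cost


if __name__ == "__main__":
    w = [[1.0, 2.0, 0.5], [0.8, 1.5, 2.0], [2.0, 0.6, 1.2]]
    print("learned   :", loac_sketch(w))
    print("exhaustive:", exhaustive_baseline(w))
```

The split mirrors the division of labour in the abstract: the discrete pairing decision is left to a learned sampler, while the continuous resource allocation for a fixed pairing is handed to an optimizer. The exhaustive baseline gives a reference optimum for very small instances, since its cost grows factorially with the number of loops.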