🤖 AI Summary
In cellular vehicle-to-network (C-V2N) systems, service task placement and edge resource scaling are tightly coupled, and optimizing them jointly is computationally intractable under stringent latency requirements. Method: This paper proposes Deep Hybrid Policy Gradient (DHPG), a new deep reinforcement learning (DRL) algorithm that operates in a hybrid action space, handling discrete task placement decisions and continuous resource scaling control within a single policy. The approach is evaluated in simulations driven by a real-world C-V2N traffic dataset. Contribution/Results: DHPG meets the 99th-percentile service delay target while using computing resources more efficiently than state-of-the-art (SoA) baselines, and a time complexity analysis indicates that it can support real-time C-V2N services.
📝 Abstract
Cellular Vehicle-to-Everything (C-V2X) is at the forefront of the digital transformation of our society. By enabling vehicles to communicate with each other and with the traffic environment over cellular networks, it redefines transportation: improving road safety and transportation services, increasing the efficiency of vehicular traffic flows, and reducing environmental impact. To facilitate the provisioning of Cellular Vehicle-to-Network (C-V2N) services, we tackle the interdependent problems of service task placement and scaling of edge resources. Specifically, we formulate the joint problem and prove that it is computationally intractable. To address this complexity, we propose Deep Hybrid Policy Gradient (DHPG), a new Deep Reinforcement Learning (DRL) approach that operates in hybrid action spaces, enabling holistic decision-making and enhancing overall performance. We evaluate DHPG in simulations with a real-world C-V2N traffic dataset, comparing it to several state-of-the-art (SoA) solutions. DHPG outperforms these solutions, guaranteeing the $99^{\text{th}}$-percentile C-V2N service delay target while simultaneously optimizing the utilization of computing resources. Finally, a time complexity analysis verifies that the proposed approach can support real-time C-V2N services.
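To make the notion of a hybrid action space concrete, the following is a minimal sketch of how one policy step could sample a discrete placement decision and a continuous scaling decision together. It is not the DHPG algorithm itself, which the abstract does not specify. The function names, the Gaussian parameterization of the scaling action, and the clipping to [0, 1] are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(placement_logits, scale_mean, scale_std, rng):
    """Sample one hybrid action (hypothetical helper, not from the paper).

    Discrete part: index of the edge site that receives the task,
    drawn from a categorical distribution over placement_logits.
    Continuous part: a resource scaling fraction, drawn from a Gaussian
    and clipped to [0, 1] (an assumed parameterization).
    Returns (site, scaling, log_prob), where log_prob is the joint
    log-probability of the discrete choice and the unclipped Gaussian sample.
    """
    probs = softmax(placement_logits)
    site = rng.choices(range(len(probs)), weights=probs, k=1)[0]

    raw = rng.gauss(scale_mean, scale_std)
    log_p_continuous = (
        -0.5 * ((raw - scale_mean) / scale_std) ** 2
        - math.log(scale_std * math.sqrt(2.0 * math.pi))
    )

    # The two parts are sampled independently here, so their log-probs add.
    log_prob = math.log(probs[site]) + log_p_continuous
    scaling = min(1.0, max(0.0, raw))
    return site, scaling, log_prob


if __name__ == "__main__":
    rng = random.Random(0)
    site, scaling, log_prob = sample_hybrid_action([0.1, 2.0, -1.0], 0.5, 0.1, rng)
    print(f"place task on site {site}, scale resources to {scaling:.2f}")
```

In a policy-gradient method, the joint log-probability returned here is the quantity that would be weighted by a return or advantage estimate, so both the placement and the scaling decision are updated from the same learning signal.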