Goal-oriented Transmission Scheduling: Structure-guided DRL with a Unified Dual On-policy and Off-policy Approach

📅 2025-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the transmission scheduling problem in goal-oriented wireless communication under multi-device, multi-channel settings, where high-dimensional state and action spaces necessitate joint optimization of Age of Information (AoI) and channel quality to maximize long-term system performance. The authors first establish the asymptotic convexity and monotonicity of the value function over the joint AoI–channel state, a theoretical contribution that enables structural exploitation in policy design. Building on this, they propose SUDO-DRL, a structure-guided unified dual-mode deep reinforcement learning framework that integrates hybrid on-policy/off-policy training with structured policy evaluation. Experiments demonstrate up to a 45% improvement in system performance and 40% faster convergence compared to baselines. Moreover, SUDO-DRL significantly outperforms pure on-policy or off-policy approaches in large-scale deployments, validating its scalability and practical applicability to real-world goal-oriented networks.

📝 Abstract
Goal-oriented communications prioritize application-driven objectives over data accuracy, enabling intelligent next-generation wireless systems. Efficient scheduling in multi-device, multi-channel systems poses significant challenges due to high-dimensional state and action spaces. We address these challenges by deriving key structural properties of the optimal solution to the goal-oriented scheduling problem, incorporating Age of Information (AoI) and channel states. Specifically, we establish the monotonicity of the optimal state value function (a measure of long-term system performance) w.r.t. channel states and prove its asymptotic convexity w.r.t. AoI states. Additionally, we derive the monotonicity of the optimal policy w.r.t. channel states, advancing the theoretical framework for optimal scheduling. Leveraging these insights, we propose the structure-guided unified dual on-off policy DRL (SUDO-DRL), a hybrid algorithm that combines the stability of on-policy training with the sample efficiency of off-policy methods. Through a novel structural property evaluation framework, SUDO-DRL enables effective and scalable training, addressing the complexities of large-scale systems. Numerical results show SUDO-DRL improves system performance by up to 45% and reduces convergence time by 40% compared to state-of-the-art methods. It also effectively handles scheduling in much larger systems, where off-policy DRL fails and on-policy benchmarks exhibit significant performance loss, demonstrating its scalability and efficacy in goal-oriented communications.
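The structural result central to the abstract is that the optimal state value function is monotone with respect to channel states, which lets a learner penalize or filter value estimates that violate this shape. The following is a minimal, hypothetical sketch of such a monotonicity check; the function name and the non-decreasing direction are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: score how badly a learned value estimate violates
# monotonicity w.r.t. channel state (a structural property the paper proves
# for the optimal value function). A structure-guided trainer could use this
# score as an auxiliary penalty or as a filter on off-policy samples.
# Direction (non-decreasing in channel quality) is an assumption here.

def monotonicity_violation(values_by_channel):
    """Sum of positive drops across increasingly good channel states.

    `values_by_channel` holds estimated state values for a fixed AoI,
    ordered from worst to best channel state. A monotone (non-decreasing)
    estimate scores 0.0; any decrease contributes its magnitude.
    """
    violation = 0.0
    for worse, better in zip(values_by_channel, values_by_channel[1:]):
        violation += max(0.0, worse - better)
    return violation

# A monotone estimate passes; a non-monotone one is penalized.
print(monotonicity_violation([1.0, 2.0, 3.5]))  # 0.0
print(monotonicity_violation([1.0, 3.0, 2.5]))  # 0.5
```

In a hybrid on/off-policy scheme like the one described, such a structural score could weight which replayed transitions are trusted, which is one plausible reading of the "structural property evaluation framework" the abstract mentions.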
Problem

Research questions and friction points this paper is trying to address.

Complex Wireless Systems
Communication Scheduling
System Performance Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

SUDO-DRL
Communication Scheduling
Performance Enhancement
Jiazheng Chen
School of Electrical and Information Engineering, The University of Sydney, Sydney, Australia
Wanchun Liu
Senior Lecturer | ARC DECRA Fellow, The University of Sydney
wireless communications, intelligent sensing and control, cyber-physical-human systems