🤖 AI Summary
This work addresses adaptive 360° video streaming for mobile VR users in UAV-assisted 5G networks. We formulate a constrained Markov decision process (CMDP) that jointly optimizes video quality, buffer stability, and the smoothness of quality switches, and we add a dynamic adjustment mechanism for the cost components. The approach combines deep reinforcement learning based on proximal policy optimization (PPO), millimeter-wave (mmWave) communication, layered tile-based video encoding, per-layer buffer management, and transmission from both a macro base station and a UAV-mounted base station. Compared to baseline schemes, our solution achieves an average video quality gain of approximately 2 dB, reduces average rebuffering time by 80%, and decreases video quality variation by 57%, improving user-perceived quality of experience (QoE). The key contributions are: (i) a CMDP formulation tailored to mobile VR streaming, and (ii) a dynamic cost adjustment mechanism that rebalances video quality, buffer occupancy, and quality change according to real-time network and streaming session conditions.
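For orientation, a CMDP in its generic textbook form can be written as follows. This is only the standard template, not the paper's exact formulation: the symbols $r$, $c_i$, $d_i$, $\gamma$, and $\lambda_i$ are generic notation assumed here, and the mapping of video quality, buffer occupancy, and quality change onto reward and cost terms is not specified in this summary.

```latex
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le d_i,
\qquad i = 1, \dots, m

% A common way to handle the constraints is a Lagrangian relaxation
% with multipliers \lambda_i \ge 0:
\mathcal{L}(\pi, \lambda) =
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}
\Big( r(s_t, a_t) - \sum_{i=1}^{m} \lambda_i\, c_i(s_t, a_t) \Big)\right]
+ \sum_{i=1}^{m} \lambda_i\, d_i
```

Here $\pi$ is the scheduling policy, $s_t$ and $a_t$ are the state and action at step $t$, and each $d_i$ bounds the expected discounted cost $c_i$.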
📝 Abstract
We propose ASL360, an adaptive deep reinforcement learning-based scheduler for on-demand 360° video streaming to mobile VR users in next-generation wireless networks. We aim to maximize the overall Quality of Experience (QoE) of the users served over a UAV-assisted 5G wireless network. Our system model comprises a macro base station (MBS) and a UAV-mounted base station, both of which use mmWave transmission to reach the users. The 360° video is encoded into dependent layers and segmented into tiles, allowing a user to schedule the download of each layer's segments. Furthermore, each user maintains multiple buffers, one per video layer, to store that layer's segments. We model the scheduling decision as a Constrained Markov Decision Process (CMDP) in which the agent selects base-layer or enhancement-layer segments to maximize the QoE, and we use a policy gradient method, proximal policy optimization (PPO), to learn the scheduling policy. Additionally, we implement a dynamic adjustment mechanism for the cost components, allowing the system to adaptively balance and prioritize video quality, buffer occupancy, and quality change based on real-time network and streaming session conditions. We demonstrate that ASL360 significantly improves the QoE, achieving approximately 2 dB higher average video quality, 80% lower average rebuffering time, and 57% lower video quality variation relative to competitive baseline methods. Our results show the effectiveness of our layered and adaptive approach in enhancing the QoE in immersive video streaming applications, particularly in dynamic and challenging network environments.
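The dynamic adjustment of cost components described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `SessionState`, `adaptive_weights`, and `step_reward`, the 4-second buffer target, the 50 Mbps nominal rate, and the linear weighting rules are hypothetical and are not taken from ASL360, whose actual mechanism is not detailed in the abstract.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of a streaming session (not the paper's state space)."""
    base_buffer_s: float          # seconds of base-layer video currently buffered
    throughput_mbps: float        # recently measured downlink throughput
    target_buffer_s: float = 4.0  # assumed buffer target, chosen for illustration


def adaptive_weights(state: SessionState, nominal_rate_mbps: float = 50.0):
    """Return (w_quality, w_buffer, w_switch), normalized to sum to 1.

    Assumed rule: a draining buffer raises the buffer weight, and a weak
    link raises the switching weight; both lower the quality weight.
    """
    # 0 when the buffer is at or above target, 1 when it is empty.
    urgency = max(0.0, 1.0 - state.base_buffer_s / state.target_buffer_s)
    # 0 when throughput meets the nominal rate, 1 when it drops to zero.
    stress = max(0.0, 1.0 - state.throughput_mbps / nominal_rate_mbps)

    w_quality = 1.0 - 0.5 * max(urgency, stress)
    w_buffer = 0.5 + urgency
    w_switch = 0.25 + 0.5 * stress

    total = w_quality + w_buffer + w_switch
    return (w_quality / total, w_buffer / total, w_switch / total)


def step_reward(psnr_db, prev_psnr_db, rebuffer_s, weights):
    """Weighted per-step QoE: quality minus rebuffering and quality-change costs."""
    w_q, w_b, w_s = weights
    return w_q * psnr_db - w_b * rebuffer_s - w_s * abs(psnr_db - prev_psnr_db)


healthy = adaptive_weights(SessionState(base_buffer_s=4.0, throughput_mbps=50.0))
starved = adaptive_weights(SessionState(base_buffer_s=0.0, throughput_mbps=50.0))
```

In this sketch, an empty buffer shifts weight from video quality toward buffer occupancy, so an agent trained on `step_reward` would be pushed toward base-layer downloads until the buffer recovers. A learned scheduler such as PPO would consume this reward signal; the training loop itself is omitted here.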