AI Summary
To jointly optimize communication and sensing for autonomous vehicles in dynamic millimeter-wave (mmWave) wireless environments, this paper proposes an adaptive OFDM reinforcement learning framework driven by queue state information (QSI) and channel state information (CSI). The method incorporates the age of updates (AoU) into the reward function to jointly model communication latency, packet loss rate, and velocity sensing resolution; builds on an integrated sensing and communication (ISAC) architecture; and employs advantage actor-critic (A2C) and proximal policy optimization (PPO) to adapt modulation order, subcarrier allocation, and frame structure in real time. Simulation results show that, compared with static configurations, the proposed framework reduces the packet loss rate by 37%, improves the effective data rate by 2.1×, and achieves a velocity resolution of 0.08 m/s, improving the real-time performance and robustness of ISAC systems.
Abstract
Millimeter-wave (mmWave) orthogonal frequency-division multiplexing (OFDM) stands out as a suitable candidate for high-resolution sensing and high-speed data transmission. To meet communication and sensing requirements, many works propose a static configuration in which the waveform's hyperparameters, such as the number of symbols in a frame and the number of frames in a communication slot, are predefined. However, two facts oblige us to redefine the problem: (1) the environment is often dynamic and uncertain, and (2) mmWave links are severely affected by the wireless environment. A striking example of this challenge is the autonomous vehicle (AV), which leverages integrated sensing and communication (ISAC) over mmWave to manage data transmission and cope with the dynamism of the environment. In this work, we consider an autonomous vehicle network in which an AV uses its queue state information (QSI) and channel state information (CSI), together with reinforcement learning techniques, to manage communication and sensing. This enables the AV to achieve two primary objectives: establishing a stable communication link with other AVs and estimating the velocities of surrounding objects with high resolution. Communication performance is evaluated based on the queue state, the effective data rate, and the discarded-packet rate, whereas sensing effectiveness is assessed using the velocity resolution. In addition, we exploit adaptive OFDM techniques for dynamic modulation, and we propose a reward function that leverages the age of updates to manage the communication buffer and improve sensing. The system is validated using advantage actor-critic (A2C) and proximal policy optimization (PPO). Furthermore, we compare our solution with the existing design and demonstrate its superior performance through computer simulations.
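To make the link between frame hyperparameters and sensing concrete, the following is a minimal sketch based on the textbook OFDM radar relation, in which velocity resolution equals the wavelength divided by twice the coherent observation time. It is not taken from the paper: the function name `velocity_resolution`, the 60 GHz carrier, and the 10 µs symbol duration are all assumptions chosen for illustration.

```python
# Illustrative sketch only: the carrier frequency, symbol duration, and
# function name are assumptions, not values or code from the paper.
C = 299_792_458.0  # speed of light in m/s


def velocity_resolution(carrier_hz: float, symbol_duration_s: float,
                        num_symbols: int) -> float:
    """Return the velocity resolution (m/s) of one OFDM sensing frame.

    Uses the standard relation dv = wavelength / (2 * N * T), where N OFDM
    symbols of duration T (cyclic prefix included) are processed coherently.
    """
    wavelength = C / carrier_hz
    observation_time = num_symbols * symbol_duration_s
    return wavelength / (2.0 * observation_time)


if __name__ == "__main__":
    carrier_hz = 60e9          # assumed mmWave carrier frequency
    symbol_duration_s = 10e-6  # assumed symbol duration including cyclic prefix
    for num_symbols in (64, 256, 1024):
        dv = velocity_resolution(carrier_hz, symbol_duration_s, num_symbols)
        print(f"{num_symbols:5d} symbols per frame -> {dv:.3f} m/s")
```

Under these assumptions, doubling the number of symbols in a frame halves the velocity resolution value (finer sensing), but it also lengthens the frame. That is one reason a fixed frame length may suit a dynamic environment poorly.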