🤖 AI Summary
To address low throughput in device-to-device (D2D) communications under spectrum scarcity, as well as the slow convergence and curse of dimensionality inherent in conventional dynamic spectrum access (DSA) strategies, this paper proposes a novel D2D spectrum access framework integrating ambient backscatter communication (AmBC) and quantum reinforcement learning (QRL). The method introduces parameterized quantum circuits into the D2D decision-making process for the first time, leveraging quantum superposition and entanglement to enhance policy representation capability. This enables joint optimization over three distinct actions: staying idle, active transmission, and backscattering ambient RF signals. Under high-load shared-spectrum conditions, the proposed scheme significantly improves average D2D throughput, accelerates convergence by 3.2×, reduces model parameters by 87%, and substantially lowers training complexity. This work delivers a scalable, quantum-intelligent solution for low-power, high-efficiency D2D communications.
📝 Abstract
Spectrum access is an essential problem in device-to-device (D2D) communications. However, with the recent growth in the number of mobile devices, the wireless spectrum is becoming scarce, resulting in low spectral efficiency for D2D communications. To address this problem, this paper integrates ambient backscatter communication technology into D2D devices, allowing them to transmit their data by backscattering ambient RF signals when the shared spectrum is occupied by mobile users. Deep reinforcement learning (DRL) can be adopted to obtain the optimal spectrum access policy, i.e., whether to stay idle, access the shared spectrum for active transmission, or backscatter ambient RF signals, so as to maximize the average throughput of D2D users. However, DRL-based solutions may require long training times because of the curse of dimensionality and complex deep neural network architectures. We therefore develop a novel quantum reinforcement learning (RL) algorithm that, thanks to the principles of quantum superposition and quantum entanglement, can achieve a faster convergence rate with fewer training parameters than DRL. Specifically, instead of conventional deep neural networks, the proposed quantum RL algorithm uses a parametrized quantum circuit to approximate an optimal policy. Extensive simulations demonstrate that the proposed solution not only can significantly improve the average throughput of D2D devices when the shared spectrum is busy but also can achieve much better convergence rate and learning complexity than existing DRL-based methods.
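To make the idea of a parametrized quantum circuit acting as a policy more concrete, the following is a minimal sketch, not the authors' circuit. It simulates a two-qubit circuit in plain NumPy and makes several assumptions that do not come from the paper: the observation is encoded as two rotation angles, each layer applies one RY rotation per qubit followed by a CNOT, the four measurement outcomes are folded onto the three actions named in the abstract (the `ACTIONS` labels and the folding rule are hypothetical), and gradients use the standard parameter-shift rule.

```python
import numpy as np

# Hypothetical action labels, following the three choices named in the abstract.
ACTIONS = ("idle", "active_transmit", "backscatter")

def ry(theta):
    """Single-qubit RY rotation (real-valued matrix)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# CNOT with qubit 0 as control and qubit 1 as target; basis index = 2*q0 + q1.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def policy_probs(features, params):
    """Action probabilities from a toy 2-qubit circuit.

    features: two angles encoding the observation (assumed encoding).
    params:   array of shape (layers, 2), one trainable RY angle per qubit and layer.
    """
    psi = np.zeros(4)
    psi[0] = 1.0                                         # start in |00>
    psi = np.kron(ry(features[0]), ry(features[1])) @ psi  # angle encoding
    for layer in params:
        psi = np.kron(ry(layer[0]), ry(layer[1])) @ psi  # trainable rotations
        psi = CNOT @ psi                                 # entangling gate
    p = psi ** 2                                         # amplitudes are real here
    # Assumed folding of 4 outcomes onto 3 actions: |10> and |11> both map to backscatter.
    return np.array([p[0], p[1], p[2] + p[3]])

def param_shift_grad(features, params, action):
    """Gradient of one action's probability via the parameter-shift rule."""
    grad = np.zeros_like(params)
    for idx in np.ndindex(params.shape):
        shifted = params.copy()
        shifted[idx] += np.pi / 2
        plus = policy_probs(features, shifted)[action]
        shifted[idx] -= np.pi
        minus = policy_probs(features, shifted)[action]
        grad[idx] = 0.5 * (plus - minus)
    return grad

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = rng.uniform(-np.pi, np.pi, size=(2, 2))  # 2 layers, 4 parameters in total
    features = np.array([0.3, 1.1])                   # made-up observation encoding
    probs = policy_probs(features, params)
    action = rng.choice(len(ACTIONS), p=probs)
    print(dict(zip(ACTIONS, probs.round(3))), "->", ACTIONS[action])
```

In this toy setup the policy has only four trainable angles, which illustrates why such circuits can need far fewer parameters than a deep neural network. A policy-gradient learner could combine `param_shift_grad` with observed rewards, but the reward model, state encoding, and training loop used in the paper are not specified in the abstract and are left out here.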