AI Summary
This work targets reliability, efficiency, and motion smoothness in robot navigation through dynamic environments. It proposes Q-SpiRL, a reinforcement learning framework whose central agent, a quantum-enhanced spiking neural network (QSNN), combines spike-based temporal processing with variational quantum feature maps. The QSNN is trained and evaluated in the same pipeline as four baselines: tabular Q-learning, a classical MLP, a classical SNN, and a quantum-enhanced MLP (QMLP). Across 20×20, 30×30, and 40×40 grid worlds with static and dynamic obstacles, the QSNN gives the best overall trade-off among success rate, path efficiency, and motion smoothness, reaching up to a 99% success rate while keeping path efficiency high in the most challenging setting. Runs on IBM quantum hardware indicate that the hybrid quantum-spiking policy can be deployed under real-device conditions.
Abstract
Adaptive robot navigation in dynamic environments requires policies that can reach the target reliably while producing efficient and stable trajectories. This paper presents Q-SpiRL, a quantum spiking reinforcement learning framework for obstacle-aware robot navigation. The framework develops and evaluates five agent families: tabular Q-learning, classical MLP, classical SNN, quantum-enhanced MLP (QMLP), and quantum-enhanced spiking neural network (QSNN). While all models are implemented under a unified training and evaluation pipeline, the QSNN is the central architecture of interest, as it combines spike-based temporal processing with variational quantum feature transformation. Experiments are conducted across three grid-world environments of increasing size, namely 20x20, 30x30, and 40x40, with both static and dynamic obstacles. Performance is assessed using success rate, success-weighted path length, path length, and turn rate under deterministic inference. Results show that QSNN achieves the strongest overall trade-off between task completion, trajectory efficiency, and motion smoothness, reaching up to 99% success rate while maintaining high path efficiency in the most challenging setting. Execution on IBM quantum hardware further demonstrates the feasibility of deploying the proposed hybrid policy under real-device conditions.
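The abstract names four evaluation metrics but does not define them. The sketch below shows one plausible way to compute them from grid-world trajectories. It assumes the definition of success-weighted path length that is common in the embodied-navigation literature (the mean over episodes of S_i * l_i / max(p_i, l_i), with l_i the shortest path length and p_i the length actually travelled), and it assumes that turn rate is the fraction of consecutive moves whose direction changes. All function names are hypothetical, and the paper's exact definitions may differ.

```python
from typing import Sequence, Tuple

Cell = Tuple[int, int]  # (row, col) grid coordinate


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    return sum(successes) / len(successes) if successes else 0.0


def path_length(path: Sequence[Cell]) -> int:
    """Number of moves in a trajectory given as a sequence of visited cells."""
    return max(len(path) - 1, 0)


def turn_rate(path: Sequence[Cell]) -> float:
    """Fraction of consecutive move pairs whose direction differs.

    Assumed definition; the paper may normalize differently.
    """
    moves = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(path, path[1:])]
    if len(moves) < 2:
        return 0.0
    turns = sum(1 for prev, cur in zip(moves, moves[1:]) if prev != cur)
    return turns / (len(moves) - 1)


def spl(successes: Sequence[bool],
        shortest: Sequence[float],
        taken: Sequence[float]) -> float:
    """Success-weighted path length: mean of S_i * l_i / max(p_i, l_i)."""
    if not successes:
        return 0.0
    total = 0.0
    for ok, l, p in zip(successes, shortest, taken):
        denom = max(p, l)
        if ok and denom > 0:  # failed episodes contribute zero
            total += l / denom
    return total / len(successes)


if __name__ == "__main__":
    demo_path = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
    print(path_length(demo_path), turn_rate(demo_path))
    print(spl([True, True, False], [4, 4, 4], [4, 8, 6]))
```

Under these assumed definitions, a lower turn rate indicates smoother motion, and SPL rewards an agent only when it succeeds and stays close to the shortest route, which is why the two are reported alongside raw success rate.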