AI Summary
This work explores whether quantum computing can improve the performance and parameter efficiency of hierarchical reinforcement learning. Building on the option-critic framework, we systematically introduce variational quantum circuits, constructing a quantum-classical hybrid agent in which quantum circuits replace the classical feature extractor, option-value functions, termination functions, and intra-option policies. On standard benchmarks, a hybrid agent with a quantum feature extractor can outperform classical baselines while using up to 66% fewer trainable parameters. Our analysis also identifies quantum option-value estimation as an architectural bottleneck that severely degrades performance, and ablation studies show which circuit design choices affect performance, yielding design principles for parameter-efficient hybrid hierarchical agents.
Abstract
Reinforcement learning is one of the most challenging learning paradigms, where efficacy and efficiency gains are extremely valuable. Hierarchical reinforcement learning is a variant that leverages temporal abstraction to structure decision-making. While parametrized quantum computations have shown success in non-hierarchical reinforcement learning, whether these advantages carry over to hierarchical decision-making remains a critical open question. In this work, we develop a hybrid hierarchical agent based on the option-critic architecture. This hybrid agent replaces classical components with variational quantum circuits for feature extractors, option-value functions, termination functions, and intra-option policies. Evaluated on standard benchmarking environments, a hybrid agent utilizing a quantum feature extractor can outperform classical baselines while saving up to 66% of trainable parameters. We also identify an architectural bottleneck: quantum option-value estimation severely degrades performance. Further ablation studies reveal how architectural choices of the quantum circuits affect performance. Our work establishes design principles for parameter-efficient hybrid hierarchical agents.