It's-A-Me, Quantum Mario: Scalable Quantum Reinforcement Learning with Multi-Chip Ensembles

📅 2025-08-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Quantum reinforcement learning (QRL) on Noisy Intermediate-Scale Quantum (NISQ) hardware is constrained by scarce qubits and by noise that accumulates in complex environments. To address this, the paper proposes a multi-chip ensemble QRL framework. Several small quantum convolutional neural networks (QCNNs) run in parallel on independent circuits, each processing one partition of the high-dimensional observation, which reduces the information loss that dimensionality reduction would cause on a single chip. Their outputs are aggregated classically within a Double Deep Q-Network (DDQN), giving a modular architecture that remains implementable on near-term hardware. Evaluated on Super Mario Bros, the method is reported to achieve better performance and learning stability than classical baselines and single-chip quantum models.

📝 Abstract

Quantum reinforcement learning (QRL) promises compact function approximators with access to vast Hilbert spaces, but its practical progress is slowed by NISQ-era constraints such as limited qubits and noise accumulation. We introduce a multi-chip ensemble framework using multiple small Quantum Convolutional Neural Networks (QCNNs) to overcome these constraints. Our approach partitions complex, high-dimensional observations from the Super Mario Bros environment across independent quantum circuits, then classically aggregates their outputs within a Double Deep Q-Network (DDQN) framework. This modular architecture enables QRL in complex environments previously inaccessible to quantum agents, achieving superior performance and learning stability compared to classical baselines and single-chip quantum models. The multi-chip ensemble demonstrates enhanced scalability by reducing information loss from dimensionality reduction while remaining implementable on near-term quantum hardware, providing a practical pathway for applying QRL to real-world problems.
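The partition-then-aggregate idea in the abstract can be illustrated with a small, purely classical sketch. Everything here is an assumption for illustration: the 8x8 observation, the 4x4 patch size, the names `split_into_patches`, `toy_chip`, and `q_values`, and above all the "chip" itself, which is a stand-in that treats each qubit as starting in |0>, applies a single RY rotation, and returns the expectation value of Z, which is cos(theta). It is not the paper's QCNN and does not run on quantum hardware; it only shows how independent per-patch outputs could feed one classical Q-value head.

```python
import math
import random


def split_into_patches(obs, patch_h, patch_w):
    """Split a 2D observation (list of rows) into non-overlapping flat patches, row-major."""
    rows, cols = len(obs), len(obs[0])
    assert rows % patch_h == 0 and cols % patch_w == 0, "patch size must tile the observation"
    patches = []
    for r in range(0, rows, patch_h):
        for c in range(0, cols, patch_w):
            patches.append([obs[r + i][c + j] for i in range(patch_h) for j in range(patch_w)])
    return patches


def toy_chip(patch, weights):
    """Classical stand-in for one small circuit (NOT a QCNN).

    Each row of `weights` plays the role of one qubit: the rotation angle is a
    weighted sum of the patch values, and the output is <Z> = cos(angle).
    """
    return [math.cos(sum(w * x for w, x in zip(row, patch))) for row in weights]


def q_values(obs, chip_weights, head, patch_h, patch_w):
    """Send each patch to its own 'chip', concatenate the outputs, apply a linear head."""
    features = []
    for patch, weights in zip(split_into_patches(obs, patch_h, patch_w), chip_weights):
        features.extend(toy_chip(patch, weights))
    return [sum(h * f for h, f in zip(row, features)) for row in head]


# Hypothetical sizes: 4 chips, 2 'qubits' per chip, 3 actions.
rng = random.Random(0)
n_chips, n_qubits, n_actions, patch_len = 4, 2, 3, 16
obs = [[rng.random() for _ in range(8)] for _ in range(8)]
chip_weights = [[[rng.uniform(-1, 1) for _ in range(patch_len)]
                 for _ in range(n_qubits)] for _ in range(n_chips)]
head = [[rng.uniform(-1, 1) for _ in range(n_chips * n_qubits)] for _ in range(n_actions)]

q = q_values(obs, chip_weights, head, 4, 4)
print(len(q))  # one Q-value per action
```

In this sketch each patch is handled by its own small parameter set, so no single "chip" ever needs to encode the full observation; the classical head is the only component that sees all outputs together.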

Problem

Research questions and friction points this paper is trying to address.

Overcoming NISQ-era qubit limitations in quantum reinforcement learning

Partitioning high-dimensional observations across multiple quantum circuits

Enabling scalable QRL on near-term hardware with multi-chip ensembles

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-chip quantum ensemble framework

Partitioning observations across small Quantum Convolutional Neural Networks (QCNNs)

Classical aggregation within a Double Deep Q-Network (DDQN) framework
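For readers unfamiliar with the DDQN part, the standard Double DQN target can be sketched as follows. The page does not give the paper's hyperparameters or training loop, so the function name `double_dqn_target` and all numbers below are hypothetical. The rule itself is the usual one: the online network picks the next action and the target network evaluates it.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Standard Double DQN target for a single transition.

    q_online_next / q_target_next are the Q-values of the next state under the
    online and target networks (plain lists, one entry per action).
    """
    if done:
        return reward  # no bootstrapping from a terminal state
    # Action selection uses the online network...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


# Hypothetical numbers for illustration only.
y = double_dqn_target(
    reward=1.0, done=False, gamma=0.9,
    q_online_next=[0.2, 0.7, 0.1],
    q_target_next=[0.5, 0.3, 0.9],
)
print(round(y, 2))  # 1.0 + 0.9 * 0.3, i.e. about 1.27
```

A vanilla DQN target would instead take the maximum of the target network's values (0.9 here, giving about 1.81); decoupling selection from evaluation is what reduces that overestimation.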

🔎 Similar Papers

Multi-Agent Quantum Reinforcement Learning using Evolutionary Optimization