SQARL: A Size-Agnostic Reinforcement Learning approach for Circuit Allocation in Distributed Quantum Architectures

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of allocating quantum circuits across the cores of a distributed quantum computer so that inter-core communication is minimized. The authors propose a Transformer-based deep reinforcement learning approach to qubit allocation that handles arbitrary numbers of qubits and cores without retraining for a specific hardware configuration. It combines sequence modeling with policy optimization so that a single trained policy can be reused across problem sizes. According to the abstract, the trained policy outperforms the previous RL state of the art and narrows the gap to the state-of-the-art heuristic, Hungarian Qubit Allocation (HQA), with a reported 33% reduction in allocation cost relative to HQA on the Cuccaro adder and 25% on average on random circuits.

📝 Abstract

The scaling of quantum processors is currently limited by technical challenges such as decoherence and cross-talk. As the number of qubits grows, interference increases the computational noise. Distributed quantum computing addresses these limitations by interconnecting smaller, easier-to-handle quantum processors (cores), but it introduces the challenge of minimizing slow, error-prone inter-core communication. The task of distributing quantum circuits across cores while minimizing communication costs is known as the Qubit Allocation problem. This work focuses on developing a deep learning approach to this problem, emphasizing flexibility to quantum hardware topology and improving state-of-the-art performance. Heuristic and non-learning algorithms, such as the Hungarian Qubit Allocation (HQA), currently represent the state of the art. Reinforcement Learning (RL) approaches leverage learned allocation policies but often lack flexibility, requiring retraining when hardware configurations change, and they fall short of the solution quality achieved by non-learning methods. However, learning mechanisms could outperform human-crafted heuristics. To overcome these limitations, this work proposes a flexible, transformer-based architecture that can handle arbitrary numbers of qubits and cores without retraining. Results show that the trained policy consistently outperforms the previous RL state of the art and narrows the gap between RL and HQA for the most common circuits. It achieves a 33% reduction in allocation cost relative to the HQA for the Cuccaro Adder and 25% on average for random circuits. These findings show that learning-based approaches can effectively match the performance of hand-crafted heuristics, a crucial step towards their application in real-world scenarios.
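To make the Qubit Allocation objective concrete, here is a minimal Python sketch of one way to score an allocation. It is an illustration, not the paper's formulation: it assumes the circuit is split into time slices of two-qubit gates, that both qubits of a gate must sit in the same core during that slice, that every core holds at most `capacity` qubits, and that each qubit moved between cores across consecutive slices costs one unit. The names `is_valid` and `allocation_cost` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Gate = Tuple[int, int]        # a two-qubit gate acting on qubits (a, b)
Allocation = Dict[int, int]   # qubit index -> core index for one time slice


def is_valid(slices: List[List[Gate]], allocs: List[Allocation], capacity: int) -> bool:
    """Check that every gate is core-local and no core exceeds its capacity."""
    if len(slices) != len(allocs):
        return False
    for gates, alloc in zip(slices, allocs):
        # Assumed constraint: at most `capacity` qubits per core in each slice.
        if any(n > capacity for n in Counter(alloc.values()).values()):
            return False
        # Assumed constraint: both operands of a gate share a core.
        if any(alloc[a] != alloc[b] for a, b in gates):
            return False
    return True


def allocation_cost(allocs: List[Allocation]) -> int:
    """Count qubits that change core between consecutive slices (assumed unit cost per move)."""
    return sum(
        1
        for prev, curr in zip(allocs, allocs[1:])
        for q in prev
        if prev[q] != curr[q]
    )


# Toy instance: 4 qubits, 2 cores holding 2 qubits each, 2 time slices.
slices = [[(0, 1), (2, 3)], [(0, 2)]]
allocs = [
    {0: 0, 1: 0, 2: 1, 3: 1},  # slice 0: both gates are core-local
    {0: 0, 1: 1, 2: 0, 3: 1},  # slice 1: qubits 1 and 2 swap cores so gate (0, 2) is local
]

print(is_valid(slices, allocs, capacity=2))
print(allocation_cost(allocs))
```

Under these assumptions, an allocator (heuristic or learned) would search for a valid sequence of per-slice assignments with the lowest such cost. A real cost model may also weight moves by the distance between cores in the hardware topology.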

Problem

Research questions and friction points this paper is trying to address.

Qubit Allocation

Distributed Quantum Computing

Inter-core Communication

Quantum Circuit Allocation

Quantum Hardware Topology

Innovation

Methods, ideas, or system contributions that make the work stand out.

Size-Agnostic

Reinforcement Learning

Quantum Circuit Allocation