🤖 AI Summary
This work addresses the synthesis of Clifford circuits on quantum devices with all-to-all qubit connectivity, proposing a scalable reinforcement learning approach that targets low two-qubit gate counts. The agent applies elementary Clifford gates one at a time to reduce the symplectic matrix representation of a Clifford circuit to the identity, and is trained with a curriculum built from random walks starting at the identity. A novel neural network architecture, equivariant to qubit relabelings and agnostic to system size, allows a single learned policy to be applied across different qubit counts. On six-qubit instances, the agent finds optimal circuits in 99.2% of cases within seconds per instance. After continued training on ten-qubit instances, it scales to unseen targets with up to 30 qubits, including ones generated from circuits with over a thousand Clifford gates, where it achieves lower average two-qubit gate counts than Qiskit's Aaronson-Gottesman and greedy Clifford synthesizers.
📝 Abstract
We consider the problem of synthesizing Clifford quantum circuits for devices with all-to-all qubit connectivity. We approach this task as a reinforcement learning problem in which an agent learns to discover a sequence of elementary Clifford gates that reduces a given symplectic matrix representation of a Clifford circuit to the identity. This formulation permits a simple learning curriculum based on random walks from the identity. We introduce a novel neural network architecture that is equivariant to qubit relabelings of the symplectic matrix representation, and which is size-agnostic, allowing a single learned policy to be applied across different qubit counts without circuit splicing or network reparameterization. On six-qubit Clifford circuits, the largest regime for which optimal references are available, our agent finds circuits within one two-qubit gate of optimality in milliseconds per instance, and finds optimal circuits in 99.2% of instances within seconds per instance. After continued training on ten-qubit instances, the agent scales to unseen Clifford tableaus with up to thirty qubits, including targets generated from circuits with over a thousand Clifford gates, where it achieves lower average two-qubit gate counts than Qiskit's Aaronson-Gottesman and greedy Clifford synthesizers.
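The formulation in the abstract can be illustrated with a small sketch. This is not the authors' code: the gate set {H, S, CX}, the column ordering `[x_0..x_{n-1} | z_0..z_{n-1}]`, the omission of Pauli phases, and all function names are assumptions made for illustration. It generates a curriculum target by a random walk from the identity and then checks that undoing the walk reduces the symplectic matrix back to the identity, so the walk length bounds the difficulty of the instance from above.

```python
import numpy as np

GATE_NAMES = ("H", "S", "CX")  # assumed elementary gate set (illustrative)


def apply_gate(m, gate):
    """Apply one gate in place to a 2n x 2n binary matrix (phases ignored)."""
    n = m.shape[1] // 2
    name, qubits = gate
    if name == "H":        # Hadamard swaps the X and Z columns of qubit a
        (a,) = qubits
        m[:, [a, n + a]] = m[:, [n + a, a]]
    elif name == "S":      # phase gate adds the X column into the Z column
        (a,) = qubits
        m[:, n + a] ^= m[:, a]
    elif name == "CX":     # control a, target b
        a, b = qubits
        m[:, b] ^= m[:, a]
        m[:, n + a] ^= m[:, n + b]
    else:
        raise ValueError(f"unknown gate {name!r}")


def is_symplectic(m):
    """Check M * Omega * M^T = Omega over GF(2)."""
    n = m.shape[0] // 2
    eye = np.eye(n, dtype=int)
    zero = np.zeros((n, n), dtype=int)
    omega = np.block([[zero, eye], [eye, zero]])
    mi = m.astype(int)
    return np.array_equal((mi @ omega @ mi.T) % 2, omega)


def random_walk(n, steps, rng):
    """Build a training target by applying `steps` random gates to the identity."""
    m = np.eye(2 * n, dtype=np.uint8)
    gates = []
    for _ in range(steps):
        name = GATE_NAMES[rng.integers(len(GATE_NAMES))]
        if name == "CX":
            a, b = (int(q) for q in rng.choice(n, size=2, replace=False))
            gate = (name, (a, b))
        else:
            gate = (name, (int(rng.integers(n)),))
        apply_gate(m, gate)
        gates.append(gate)
    return m, gates


rng = np.random.default_rng(0)
target, walk = random_walk(n=4, steps=20, rng=rng)

# Each of these gates is its own inverse at the symplectic level, so
# replaying the walk in reverse order reduces the target to the identity.
residual = target.copy()
for gate in reversed(walk):
    apply_gate(residual, gate)

solved = np.array_equal(residual, np.eye(8, dtype=np.uint8))
two_qubit_cost = sum(name == "CX" for name, _ in walk)
```

In this sketch the reversed walk is only a trivially valid solution. A learned policy would instead choose gates from the current matrix and aim for fewer `CX` gates than the walk itself used.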