🤖 AI Summary
This work proposes training a reinforcement learning agent to choose the order of quantum circuit optimization passes, rather than relying on a fixed default sequence or one specified by the user. The agent's actions are the two-qubit gate reduction passes used in default PyTKET pass sequences, and it composes them per circuit to maximize the fraction of two-qubit gates removed. On a diverse set of test circuits, the agent removes a mean of 57.7% and a median of 56.7% of two-qubit gates, compared with 41.8% and 50.0% for the next best default pass sequence. The work is presented as the first application of reinforcement learning to the dynamic scheduling of quantum circuit optimization passes, allowing pass sequences to be tailored to individual circuits.
📝 Abstract
Many quantum software development kits provide a suite of circuit optimisation passes. These passes have been highly optimised and tested in isolation. However, the order in which they are applied is left to the user, or else defined in general-purpose default pass sequences. General-purpose sequences miss opportunities for optimisation that are particular to individual circuits, while designing pass sequences bespoke to particular circuits requires exceptional knowledge of quantum circuit design and optimisation. Here we propose and demonstrate training a reinforcement learning agent to compose optimisation-pass sequences. In particular, the agent's action space consists of the passes for two-qubit gate count reduction used in default PyTKET pass sequences. For the circuits in our diverse test set, the (mean, median) fraction of two-qubit gates removed by the agent is $(57.7\%,\ 56.7\%)$, compared to $(41.8\%,\ 50.0\%)$ for the next best default pass sequence.
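The problem setting can be illustrated with a minimal Python sketch. It uses neither PyTKET nor reinforcement learning: the circuit representation, the two toy passes, and the names `cancel_adjacent`, `fraction_removed`, and `best_sequence` are all hypothetical, and an exhaustive search over short pass sequences stands in for the trained agent. The sketch only shows that pass order matters, that the best order can differ between circuits, and how the (mean, median) fraction of two-qubit gates removed is computed.

```python
from itertools import product
from statistics import mean, median

# Toy circuit: an ordered list of (gate_name, qubits) tuples.
# This is an illustrative stand-in, not the PyTKET circuit type.


def cancel_adjacent(circuit, name):
    """Toy pass: cancel adjacent identical self-inverse gates called `name`."""
    out = []
    for gate in circuit:
        if gate[0] == name and out and out[-1] == gate:
            out.pop()  # gate followed by itself is the identity
        else:
            out.append(gate)
    return out


# Hypothetical action space: each action applies one pass to the whole circuit.
PASSES = {
    "cancel_h": lambda c: cancel_adjacent(c, "H"),
    "cancel_cx": lambda c: cancel_adjacent(c, "CX"),
}


def two_qubit_count(circuit):
    return sum(1 for _, qubits in circuit if len(qubits) == 2)


def apply_sequence(circuit, sequence):
    for pass_name in sequence:
        circuit = PASSES[pass_name](circuit)
    return circuit


def fraction_removed(circuit, sequence):
    """Fraction of two-qubit gates removed by applying `sequence`."""
    before = two_qubit_count(circuit)
    after = two_qubit_count(apply_sequence(circuit, sequence))
    return (before - after) / before


def best_sequence(circuit, max_len=3):
    """Exhaustive search over short sequences (a stand-in for a learned policy)."""
    candidates = [
        seq for n in range(1, max_len + 1) for seq in product(PASSES, repeat=n)
    ]
    return max(candidates, key=lambda seq: fraction_removed(circuit, seq))


# A fixed, general-purpose ordering applied to every circuit.
DEFAULT_SEQUENCE = ("cancel_h", "cancel_cx")

circuit_a = [("CX", (0, 1)), ("H", (2,)), ("H", (2,)), ("CX", (0, 1))]
circuit_b = [
    ("CX", (0, 1)), ("H", (2,)), ("CX", (1, 2)),
    ("CX", (1, 2)), ("H", (2,)), ("CX", (0, 1)),
]
circuits = [circuit_a, circuit_b]

fixed = [fraction_removed(c, DEFAULT_SEQUENCE) for c in circuits]
searched = [fraction_removed(c, best_sequence(c)) for c in circuits]

print("default  (mean, median):", (mean(fixed), median(fixed)))
print("searched (mean, median):", (mean(searched), median(searched)))
```

On the second toy circuit the fixed ordering removes only half of the two-qubit gates, because the inner CX pair has to be cancelled before the H pair becomes adjacent; the per-circuit search finds the ordering `cancel_cx`, `cancel_h`, `cancel_cx` and removes all of them. Exhaustive search grows exponentially with sequence length and the number of passes, which is one reason a learned policy is attractive for realistic pass sets.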