Automated Design of Structured Variational Quantum Circuits with Reinforcement Learning

📅 2025-07-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Variational quantum algorithms (VQAs) depend heavily on the circuit ansatz, which is usually designed by hand with heuristics. Method: This work models the design of structured variational quantum circuits as a sequential decision-making problem and trains an agent with Proximal Policy Optimization (PPO), a reinforcement learning (RL) algorithm. It proposes two approaches: RLVQC-Block generalizes the QAOA framework by learning a two-qubit block that is applied to every interacting qubit pair of the underlying QUBO problem; RLVQC-Global relaxes this graph-structural constraint and places gates freely. Both methods use empirical measurement outcomes as state observations. Contribution/Results: On diverse graph-based QUBO instances, RLVQC-Block consistently outperforms QAOA at comparable circuit depth and generally surpasses RLVQC-Global; RLVQC-Global finds significantly shallower circuits at some cost in solution quality. The results suggest that the most effective design strategy lies between rigid predefined architectures and fully unconstrained ones.

๐Ÿ“ Abstract
Variational Quantum Algorithms (VQAs) are among the most promising approaches for leveraging near-term quantum hardware, yet their effectiveness strongly depends on the design of the underlying circuit ansatz, which is typically constructed with heuristic methods. In this work, we represent the synthesis of variational quantum circuits as a sequential decision-making problem, where gates are added iteratively in order to optimize an objective function, and we introduce two reinforcement learning-based methods, RLVQC Global and RLVQC Block, tailored to combinatorial optimization problems. RLVQC Block creates ansatzes that generalize the Quantum Approximate Optimization Algorithm (QAOA) by discovering a two-qubit block that is applied to all the interacting qubit pairs. RLVQC Global further generalizes the ansatz by adding gates unconstrained by the structure of the interacting qubits. Both methods adopt the Proximal Policy Optimization (PPO) algorithm and use empirical measurement outcomes as state observations to guide the agent. We evaluate the proposed methods on a broad set of QUBO instances derived from classical graph-based optimization problems. Our results show that both RLVQC methods perform strongly, with RLVQC Block consistently outperforming QAOA and generally surpassing RLVQC Global. While RLVQC Block produces circuits with depth comparable to QAOA, the Global variant is instead able to find significantly shallower ones. These findings suggest that reinforcement learning methods can be an effective tool to discover new ansatz structures tailored for specific problems and that the most effective circuit design strategy lies between rigid predefined architectures and completely unconstrained ones, offering a favourable trade-off between structure and adaptability.
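To make "QUBO instances derived from classical graph-based optimization problems" concrete, here is a minimal sketch that encodes MaxCut as a QUBO and solves a tiny instance by brute force. MaxCut is chosen here only as a familiar example; the abstract does not list which graph problems the paper uses. The helper names (`maxcut_qubo`, `qubo_cost`, `brute_force_minimum`) are hypothetical and not taken from the paper.

```python
from itertools import product


def maxcut_qubo(num_nodes, edges):
    """Build Q so that x^T Q x equals minus the cut size for binary x.

    Assumption: MaxCut is used purely as an illustrative graph problem.
    Each edge (i, j) contributes 2*x_i*x_j - x_i - x_j to the cost.
    """
    q = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        q[i][i] -= 1  # linear term -x_i (x_i^2 == x_i for binary x)
        q[j][j] -= 1  # linear term -x_j
        q[i][j] += 1  # the two symmetric entries together give 2*x_i*x_j
        q[j][i] += 1
    return q


def qubo_cost(q, x):
    """Evaluate the quadratic form x^T Q x for a binary assignment x."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def brute_force_minimum(q):
    """Enumerate all 2^n assignments; only feasible for very small n."""
    n = len(q)
    return min((qubo_cost(q, x), x) for x in product((0, 1), repeat=n))


# A 4-node ring graph: alternating the two partitions cuts every edge.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Q = maxcut_qubo(4, edges)
best_cost, best_x = brute_force_minimum(Q)
```

A variational circuit, whether a QAOA ansatz or one built by an RL agent, is then scored by how low the expected value of this cost is over the bitstrings it produces when measured.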
Problem

Research questions and friction points this paper is trying to address.

Automating variational quantum circuit design using reinforcement learning
Optimizing circuit ansatz for combinatorial problems via RL methods
Balancing predefined and adaptable circuit structures for better performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning builds variational quantum circuits gate by gate
Block-based ansatz generalizes QAOA and consistently outperforms it
Global variant discovers circuits with significantly lower depth
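The block idea can be sketched structurally: one two-qubit template is repeated on every interacting qubit pair, and circuit depth is counted on the result. This is only an illustration under stated assumptions. The gate-tuple format, the function names, and the CNOT-RZ-CNOT template (the standard decomposition of a ZZ rotation, as in a QAOA cost layer) are hypothetical choices, rotation angles are omitted, and the paper's learned blocks and action encoding are not reproduced here.

```python
def expand_block(block, edges):
    """Apply the same two-qubit block template to every interacting pair.

    `block` is a list of (gate_name, roles) entries, where roles index into
    the pair: 0 is the first qubit of the edge, 1 is the second.
    This template format is a hypothetical one chosen for the sketch.
    """
    circuit = []
    for pair in edges:
        for gate, roles in block:
            circuit.append((gate, tuple(pair[r] for r in roles)))
    return circuit


def circuit_depth(circuit, num_qubits):
    """Count layers when each gate is scheduled as early as its qubits allow."""
    busy_until = [0] * num_qubits
    for _, qubits in circuit:
        layer = max(busy_until[q] for q in qubits) + 1
        for q in qubits:
            busy_until[q] = layer
    return max(busy_until, default=0)


# CNOT, RZ, CNOT: stand-in for a block an agent might select (angles omitted).
qaoa_like_block = [("cx", (0, 1)), ("rz", (1,)), ("cx", (0, 1))]

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
circuit = expand_block(qaoa_like_block, edges)
depth = circuit_depth(circuit, num_qubits=4)

# Listing disjoint edges first lets their blocks run in parallel.
reordered = [(0, 1), (2, 3), (1, 2), (3, 0)]
depth_reordered = circuit_depth(expand_block(qaoa_like_block, reordered), 4)
```

Because the block is tied to the edge list, the gate count grows with the number of interacting pairs, while depth also depends on the order in which the pairs are visited. An unconstrained, Global-style construction is free to place fewer gates, which is consistent with the shallower circuits reported for that variant.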