🤖 AI Summary
Current large language models (LLMs) face two critical bottlenecks in quantum circuit generation: imprecise parameter configuration and insufficient domain-specific quantum knowledge. To address these, this work proposes a tool-augmented agent framework that integrates an external quantum simulator for real-time validation, employs a hierarchical reward mechanism to semantically align LLM outputs with quantum physical principles, and applies reinforcement learning to jointly optimize gate sequences and continuous parameters. The method substantially improves the syntactic correctness and functional fidelity of generated circuits. Applied to a 4B-parameter model, it achieves a circuit validity of 99.31% at Pass@1 and 100% at Pass@10, outperforming strong baselines including GPT-4o, GPT-5, and DeepSeek-V3. Its core contribution is the first end-to-end verifiable and optimizable, task-customized quantum assembly generation paradigm.
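To make the idea of a hierarchical reward concrete, here is a minimal sketch of a gated, tiered reward in Python. It is not the framework's actual reward: the tier weights, the function name `hierarchical_reward`, and the `toy_parse` / `toy_simulate` stand-ins are all assumptions made for illustration. In a real setup, the parse step would presumably be a quantum assembly parser and the simulate step a call to an external quantum simulator.

```python
from typing import Callable, Optional


def hierarchical_reward(
    circuit_text: str,
    parse: Callable[[str], Optional[object]],
    simulate: Callable[[object], Optional[float]],
    syntax_weight: float = 0.25,  # assumed weight, not from the source
    run_weight: float = 0.25,     # assumed weight, not from the source
) -> float:
    """Gated reward: a later tier only counts if every earlier tier passed."""
    # Tier 1: does the generated text parse as a circuit at all?
    circuit = parse(circuit_text)
    if circuit is None:
        return 0.0
    reward = syntax_weight

    # Tier 2: does the circuit run in the (external) simulator?
    score = simulate(circuit)
    if score is None:
        return reward
    reward += run_weight

    # Tier 3: task-quality score in [0, 1], scaled by the remaining weight.
    score = min(max(score, 0.0), 1.0)
    reward += (1.0 - syntax_weight - run_weight) * score
    return reward


# Hypothetical stand-ins so the sketch runs without a quantum library.
def toy_parse(text: str) -> Optional[list]:
    """Accept text only if every non-empty line ends with a semicolon."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or any(not ln.endswith(";") for ln in lines):
        return None
    return lines


def toy_simulate(circuit: list) -> Optional[float]:
    """Pretend the simulator ran and returned a mid-range quality score."""
    return 0.5


print(hierarchical_reward("h q[0];\ncx q[0], q[1];", toy_parse, toy_simulate))  # 0.75
print(hierarchical_reward("h q[0]", toy_parse, toy_simulate))                   # 0.0
```

Gating the tiers this way means a circuit that fails to parse can never collect credit for later stages, which is one plausible reading of "hierarchical" here.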
📝 Abstract
Designing and optimizing task-specific quantum circuits is crucial to leveraging the advantages of quantum computing. Large language model (LLM)-based quantum circuit generation has recently emerged as a promising automated solution. However, two fundamental challenges remain unaddressed: (i) parameterized quantum gates require precise numerical values for optimal performance, and that performance also depends on multiple aspects, including the number of quantum gates, their parameters, and the layout and depth of the circuit; (ii) LLMs often generate low-quality or incorrect quantum circuits because they lack quantum domain-specific knowledge. We propose QUASAR, an agentic reinforcement learning (RL) framework for quantum circuit generation and optimization based on tool-augmented LLMs. To align the LLM with quantum-specific knowledge and improve the generated quantum circuits, QUASAR designs (i) a quantum circuit verification approach that uses external quantum simulators and (ii) a hierarchical reward mechanism for RL training. Extensive evaluation shows improvements in both the syntactic and semantic performance of the generated quantum circuits. When augmenting a 4B-parameter LLM, QUASAR achieves a validity of 99.31% in Pass@1 and 100% in Pass@10, outperforming the industrial LLMs GPT-4o, GPT-5, and DeepSeek-V3 as well as several supervised fine-tuning (SFT)-only and RL-only baselines.
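For readers unfamiliar with Pass@k, the sketch below shows the unbiased estimator commonly used in code-generation benchmarks: draw n samples per task, count the c that pass, and estimate the probability that at least one of k randomly chosen samples passes. Whether the paper computes Pass@1 and Pass@10 exactly this way is an assumption; the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate for one task.

    n: number of generated samples
    c: number of samples that passed the check (e.g. valid circuits)
    k: sample budget being evaluated, with 1 <= k <= n
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # 1 minus the probability that all k drawn samples are failures.
    # math.comb returns 0 when k > n - c, so the result is then 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


print(pass_at_k(n=10, c=5, k=1))   # 0.5
print(pass_at_k(n=10, c=1, k=10))  # 1.0
```

The benchmark-level figure is then the mean of this value over all tasks. With k = 1 the estimator reduces to the plain pass rate c / n.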