Efficient Generation of Parameterised Quantum Circuits from Large Texts

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of efficiently compiling large-scale natural language text into parameterised quantum circuits (PQCs). We propose a quantum-aware modeling framework grounded in pregroup grammar and symmetric monoidal category theory, enabling end-to-end compilation of texts of up to 6,410 words into interpretable, tree-structured DisCoCirc quantum circuits, presented as the first such scalable and semantically transparent quantum NLP representation for long texts. Our method exploits the structural correspondence between linguistic composition and the composition of quantum operations, and uses the lambeq Gen II toolkit to generate PQCs comprising thousands of quantum gates; the implementation is open-sourced as part of that toolkit. Empirical evaluation demonstrates strong performance on downstream tasks including text classification and natural language inference, validating both expressivity and practical utility. This work establishes a new paradigm for quantum natural language processing that bridges theoretical rigor, rooted in categorical quantum mechanics, with engineering feasibility and scalability.

📝 Abstract
Quantum approaches to natural language processing (NLP) are redefining how linguistic information is represented and processed. While traditional hybrid quantum-classical models rely heavily on classical neural networks, recent advancements propose a novel framework, DisCoCirc, that can directly encode entire documents as parameterised quantum circuits (PQCs) while also offering additional interpretability and compositionality benefits. Following these ideas, this paper introduces an efficient methodology for converting large-scale texts into quantum circuits using tree-like representations of pregroup diagrams. Exploiting the compositional parallels between language and quantum mechanics, grounded in symmetric monoidal categories, our approach enables faithful and efficient encoding of syntactic and discourse relationships in long and complex texts (up to 6410 words in our experiments) into quantum circuits. The developed system is provided to the community as part of the augmented open-source quantum NLP package lambeq Gen II.
Problem

Research questions and friction points this paper is trying to address.

Efficiently convert large texts to quantum circuits
Encode syntactic and discourse relationships in quantum circuits
Improve interpretability in quantum NLP models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Encodes documents as parameterised quantum circuits
Uses tree-like pregroup diagram representations
Leverages symmetric monoidal categories for encoding
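
Illustration

The pregroup diagrams mentioned above are built from type reductions: each word carries a type, and adjacent adjoint pairs cancel. The following is a minimal sketch of that idea only, not the paper's method and not the lambeq API. The toy lexicon, the names `LEXICON` and `reduce_types`, and the greedy stack strategy are all assumptions made for illustration; a greedy reduction is not a complete pregroup parser.

```python
from typing import List, Tuple

# A simple pregroup type is (base, z): z = 0 is the base type,
# z = -1 its left adjoint, z = +1 its right adjoint.
Simple = Tuple[str, int]

N: Simple = ("n", 0)  # noun
S: Simple = ("s", 0)  # sentence


def left(t: Simple) -> Simple:
    """Left adjoint of a simple type."""
    return (t[0], t[1] - 1)


def right(t: Simple) -> Simple:
    """Right adjoint of a simple type."""
    return (t[0], t[1] + 1)


# Hypothetical toy lexicon: a transitive verb has type n^r . s . n^l.
LEXICON = {
    "Alice": [N],
    "Bob": [N],
    "loves": [right(N), S, left(N)],
}


def reduce_types(words: List[str]) -> Tuple[List[Simple], List[Tuple[int, int]]]:
    """Greedily cancel adjacent pairs (b, z)(b, z + 1).

    Returns the types left uncancelled and the index pairs that were
    contracted (the "cups" of the pregroup diagram).
    """
    flat = [t for w in words for t in LEXICON[w]]
    stack: List[Tuple[int, Simple]] = []
    cups: List[Tuple[int, int]] = []
    for i, t in enumerate(flat):
        # The top of the stack cancels with t if it is t's left adjoint.
        if stack and stack[-1][1] == left(t):
            j, _ = stack.pop()
            cups.append((j, i))
        else:
            stack.append((i, t))
    return [t for _, t in stack], cups


if __name__ == "__main__":
    remaining, cups = reduce_types(["Alice", "loves", "Bob"])
    print(remaining, cups)
```

A word sequence counts as a grammatical sentence in this sketch when the only type left is `("s", 0)`. The recorded cups describe which wires are connected, which is the structural information a circuit construction would start from.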