Architecture Shape Governs QNN Trainability: Jacobian Null Space Growth and Parameter Efficiency

📅 2026-05-07

🤖 AI Summary
This work investigates how the architectural shape of variational quantum circuits, specifically the split of a fixed encoding budget \(E = NL\) between qubit count \(N\) and encoding layers \(L\), influences trainability. The root cause is identified as structural rank deficiency in the coefficient-matching Jacobian \(J\), which motivates the proposed concept of “structural gradient starvation”: in serial single-qubit architectures, \(\mathrm{rank}(J) \leq 2L + 1\) regardless of the parameter count \(P\), so the Jacobian null space grows without bound as parameters are added at fixed \(L\), whereas parallel architectures avoid this issue. Using a Fourier-analytic view of expressivity together with analysis of the Jacobian, the quantum Fisher information matrix (QFIM) spectrum, and the kernel dimension, the study shows that parallel architectures generically keep the minimum singular value of the Jacobian strictly positive when \(P \leq 2E + 1\). Moreover, adding feature map layers reaches \(R^2 \geq 0.95\) with 1.6–2.2× fewer parameters than adding trainable blocks, substantially improving parameter efficiency.
📝 Abstract
Variational quantum circuits with angle encoding implement truncated Fourier series, and architectures arranging $N$ qubits with $L$ encoding layers each (sharing encoding budget $E = NL$) generate identical frequency spectra, identical frequency redundancy, and require the same minimum parameter count for coefficient control. Despite this equivalence, trainability varies substantially with architecture shape $(N,L)$ at fixed $E$. We identify structural rank deficiency of the coefficient-matching Jacobian $J$ as the mechanism responsible. For serial single-qubit architectures, we prove $\mathrm{rank}(J) \leq 2L+1$ regardless of parameter count $P$, with $\dim(\ker J) \geq P-(2L+1)$ growing without bound, a phenomenon we term "structural gradient starvation": a growing fraction of parameters become structurally decoupled from the loss as $P$ increases at fixed $L$. Parallel architectures avoid this via independent phase trajectories, ensuring $\sigma_{\min}(J^{(\mathrm{par})}) > 0$ generically for $P \leq 2E+1$, so no parameter lies in $\ker J$. For practitioners, we further show that the two natural routes to increasing parameter count have fundamentally different effects: adding feature map (FM) layers monotonically strengthens the Jacobian QFIM eigenvalue spectrum and achieves $R^2 \geq 0.95$ with $1.6$--$2.2\times$ fewer parameters than adding trainable blocks across all tested architectures, while trainable blocks improve training only through the classical interpolation mechanism with no quantum-specific benefit.
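The serial rank bound in the abstract can be illustrated numerically. The sketch below is not the paper's code: the circuit layout (RX encoding gates interleaved with blocks of Rz-Ry-Rz rotations, a Z measurement) and the names `rot`, `model`, `sample_jacobian`, and `B` are assumptions chosen for illustration. Instead of the coefficient-matching Jacobian itself, it differentiates the model output at sample points $x_k$; since a degree-$L$ trigonometric polynomial is determined linearly by its values at $2L+1$ or more distinct points, this sampled Jacobian should have the same rank. The bound $\dim(\ker J) \geq P-(2L+1)$ then follows from rank-nullity, $\dim(\ker J) = P - \mathrm{rank}(J)$.

```python
import numpy as np

# Pauli matrices and identity
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def rot(pauli, angle):
    """Single-qubit rotation exp(-i * angle / 2 * pauli)."""
    return np.cos(angle / 2) * I2 - 1j * np.sin(angle / 2) * pauli


def model(x, theta, L):
    """<Z> for a serial single-qubit circuit with L encoding layers.

    theta has shape (L + 1, B, 3): L + 1 trainable blocks, each made of
    B Rz-Ry-Rz rotations (a hypothetical layout, assumed for illustration).
    """
    psi = np.array([1, 0], dtype=complex)
    for layer in range(L + 1):
        for a, b, c in theta[layer]:
            psi = rot(Z, c) @ rot(Y, b) @ rot(Z, a) @ psi
        if layer < L:
            psi = rot(X, x) @ psi  # angle-encoding gate
    return float(np.real(np.conj(psi) @ Z @ psi))


def sample_jacobian(theta, L, xs):
    """Jacobian d f(x_k) / d theta_p via the parameter-shift rule.

    The rule is exact here because every parameter enters through a
    single Pauli rotation exp(-i * theta / 2 * P).
    """
    flat = theta.ravel()
    J = np.zeros((len(xs), flat.size))
    for p in range(flat.size):
        for sign in (+1, -1):
            shifted = flat.copy()
            shifted[p] += sign * np.pi / 2
            th = shifted.reshape(theta.shape)
            J[:, p] += sign * 0.5 * np.array([model(x, th, L) for x in xs])
    return J


L, B = 2, 4                                   # encoding layers, rotations per block
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=(L + 1, B, 3))
P = theta.size                                # 36 trainable parameters
xs = np.linspace(0, 2 * np.pi, 16, endpoint=False)  # more than 2L + 1 samples

J = sample_jacobian(theta, L, xs)
rank = np.linalg.matrix_rank(J, tol=1e-9)
nullity = P - rank                            # rank-nullity theorem

print(f"P = {P}, rank(J) = {rank}, bound 2L+1 = {2 * L + 1}, dim ker J = {nullity}")
```

Raising `B` at fixed `L` increases `P` while the rank stays capped at $2L+1$, so the reported kernel dimension grows with every added rotation. Sampling at 16 points rather than exactly $2L+1$ makes the cap a property of the circuit rather than of the matrix shape.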
Problem

Research questions and friction points this paper is trying to address.

quantum neural networks
trainability
architecture shape
Jacobian null space
parameter efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

structural gradient starvation
Jacobian rank deficiency
parameter efficiency
parallel vs serial architectures
quantum neural network trainability