🤖 AI Summary
This work investigates whether instantaneous quantum polynomial (IQP) quantum circuit Born machines (QCBMs) suffer from barren plateaus, i.e., exponentially vanishing gradients, at initialization, and examines how the trainable regimes overlap with those that are classically hard to simulate. By analyzing the gradient variance of the maximum mean discrepancy (MMD) loss through the spectral properties of the kernel function together with the circuit architecture, the authors derive, for the first time, closed-form expressions for this variance along with general upper and lower bounds. The study shows that low-weight-biased kernels in structured topologies, as well as small-variance Gaussian initialization, yield polynomially decaying rather than exponentially vanishing gradient variance, thereby mitigating barren plateaus. Moreover, sparse IQP-QCBMs can generate classically intractable probability distributions within these trainable regimes.
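For context, the training objective referenced above is the kernel MMD. With kernel $k$, model distribution $p_\theta$, and target distribution $q$, the standard squared MMD reads (the paper's precise kernel choice and estimator conventions may differ):

$$
\mathrm{MMD}^2(p_\theta, q) = \mathbb{E}_{x,x'\sim p_\theta}\big[k(x,x')\big] - 2\,\mathbb{E}_{x\sim p_\theta,\, y\sim q}\big[k(x,y)\big] + \mathbb{E}_{y,y'\sim q}\big[k(y,y')\big].
$$

The paper's analysis concerns the variance, at initialization, of the partial derivatives $\partial \mathrm{MMD}^2 / \partial \theta_j$ with respect to the circuit parameters.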
📝 Abstract
Instantaneous quantum polynomial quantum circuit Born machines (IQP-QCBMs) have been proposed as quantum generative models with a classically tractable training objective based on the maximum mean discrepancy (MMD) and a potential quantum advantage motivated by sampling-complexity arguments, making them a promising model worth deeper investigation. While recent works have further proven the universality of a (slightly generalized) model, the immediate next questions concern its trainability, i.e., whether it suffers from exponentially vanishing loss gradients (the barren plateau problem), which would prevent effective use, and how regimes of trainability overlap with regimes of possible quantum advantage. Here, we make significant strides in both directions. To study the trainability at initialization, we analytically derive closed-form expressions for the variances of the partial derivatives of the MMD loss function and provide general upper and lower bounds. With uniform initialization, we show that barren plateaus depend on the generator set and the spectrum of the chosen kernel. We identify regimes in which low-weight-biased kernels avoid exponential gradient suppression in structured topologies. We also prove that a small-variance Gaussian initialization ensures polynomial scaling of the gradient variance under mild conditions. As for the potential quantum advantage, we further argue, based on previous complexity-theoretic arguments, that sparse IQP families can output a family of probability distributions that is classically intractable, and that this distribution family remains trainable at initialization, at least at lower-weight frequencies.
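To make the setup concrete, below is a minimal, self-contained NumPy sketch (not the authors' code) of an IQP-QCBM with a small-variance Gaussian initialization and a simple MMD estimate. The nearest-neighbour generator set, the kernel parameter `nu`, the sample sizes, and the uniform stand-in target distribution are all illustrative assumptions; the Hamming kernel k(x, y) = nu^{d_H(x, y)} is one simple example of a kernel whose Fourier spectrum decays with frequency weight, i.e., a low-weight-biased kernel.

```python
import numpy as np

def walsh_hadamard(v):
    """Unnormalized fast Walsh-Hadamard transform (butterfly scheme)."""
    v = v.copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i + h].copy(), v[i + h:i + 2 * h].copy()
            v[i:i + h], v[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return v

def iqp_born_probs(n, generators, thetas):
    """Exact Born probabilities of H^{(x)n} D(theta) H^{(x)n} |0...0>, where a
    generator S with angle theta contributes the diagonal gate exp(i*theta*Z_S)."""
    dim = 1 << n
    amp = np.full(dim, 1 / np.sqrt(dim), dtype=complex)  # first Hadamard layer
    z = np.arange(dim)
    for S, th in zip(generators, thetas):
        mask = sum(1 << q for q in S)
        parity = np.array([bin(zi & mask).count("1") & 1 for zi in z])
        amp *= np.exp(1j * th * (1 - 2 * parity))        # Z_S eigenvalue (-1)^parity
    amp = walsh_hadamard(amp) / np.sqrt(dim)             # second Hadamard layer
    return np.abs(amp) ** 2

def hamming_kernel(xs, ys, nu=0.9):
    """k(x, y) = nu^{Hamming(x, y)}: a simple low-weight-biased kernel, since
    its Fourier coefficients decay with the weight of the frequency."""
    ham = np.vectorize(lambda v: bin(v).count("1"))(xs[:, None] ^ ys[None, :])
    return nu ** ham

def mmd2(xs, ys, kernel):
    """Biased (V-statistic) estimate of the squared MMD between sample sets."""
    return (kernel(xs, xs).mean() + kernel(ys, ys).mean()
            - 2 * kernel(xs, ys).mean())

rng = np.random.default_rng(0)
n = 6
# Toy generator set: all single-qubit terms plus nearest-neighbour pairs.
gens = [(i,) for i in range(n)] + [(i, i + 1) for i in range(n - 1)]
thetas = rng.normal(0.0, 0.1, size=len(gens))  # small-variance Gaussian init
probs = iqp_born_probs(n, gens, thetas)
probs /= probs.sum()                           # guard against float drift
model = rng.choice(1 << n, size=500, p=probs)  # samples from the Born machine
target = rng.integers(0, 1 << n, size=500)     # stand-in "data" distribution
print("MMD^2 estimate:", mmd2(model, target, hamming_kernel))
```

Replacing the Gaussian initialization with a uniform one (e.g., theta ~ U[0, 2*pi)) and varying the generator set or `nu` gives a small numerical testbed for the dependence of the gradient variance on the kernel spectrum and circuit topology that the paper analyzes.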