🤖 AI Summary
Quantum neural networks (QNNs) do not exhibit Gaussian process behavior under random initialization, impeding a unified theoretical characterization of their training dynamics and generalization. Method: We establish a unified analytical framework for QNN loss landscapes, leveraging random matrix theory and quantum circuit algebra to derive exact analytical distributions of gradients and local minima. Contribution/Results: First, we prove that randomly initialized QNNs follow a Wishart process—not a Gaussian process—and rigorously characterize the necessary and sufficient conditions for convergence to the Gaussian limit. Second, we introduce a physically measurable trainability criterion centered on "degrees of freedom," quantitatively linking architectural algebraic properties to optimization hardness. Third, we unify and extend barren plateau results, yielding experimentally verifiable theoretical guidance and practical design metrics for QNNs.
📝 Abstract
Classical neural networks with random initialization famously behave as Gaussian processes in the limit of many neurons, which allows one to completely characterize their training and generalization behavior. No such general understanding exists for quantum neural networks (QNNs), which -- outside of certain special cases -- are known to not behave as Gaussian processes when randomly initialized. We here prove that QNNs and their first two derivatives instead generally form what we call "Wishart processes," where certain algebraic properties of the network determine the hyperparameters of the process. This Wishart process description allows us to, for the first time: give necessary and sufficient conditions for a QNN architecture to have a Gaussian process limit; calculate the full gradient distribution, generalizing previously known barren plateau results; and calculate the local minima distribution of algebraically constrained QNNs. Our unified framework suggests a simple operational definition for the "trainability" of a given QNN model using a newly introduced, experimentally accessible quantity we call the "degrees of freedom" of the network architecture.
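The contrast between a Wishart and a Gaussian limit can be illustrated with a toy numerical sketch (a hypothetical illustration, not the paper's construction): a 1×1 Wishart variate with k degrees of freedom is a sum of k squared standard Gaussians, and as k grows its standardized distribution loses its skew and approaches a Gaussian. This mirrors, in miniature, how a degrees-of-freedom-like quantity can govern convergence to a Gaussian process limit.

```python
import numpy as np

rng = np.random.default_rng(0)

def wishart_samples(dof, n_samples):
    # A 1x1 Wishart variate with `dof` degrees of freedom is the sum of
    # squares of `dof` independent standard normals (a chi-squared variate).
    g = rng.standard_normal((n_samples, dof))
    return (g**2).sum(axis=1)

for dof in (2, 8, 64):
    w = wishart_samples(dof, 100_000)
    # Standardize: chi-squared with k dof has mean k and variance 2k.
    z = (w - dof) / np.sqrt(2 * dof)
    skew = ((z - z.mean()) ** 3).mean()
    # Skewness shrinks (theoretically like sqrt(8/dof)) as dof grows,
    # so the standardized distribution approaches a Gaussian.
    print(f"dof={dof:3d}  empirical skewness={skew:+.3f}")
```

The printed skewness decreases toward zero as the degrees of freedom increase, a small-scale analogue of the Gaussian-process limit conditions discussed above.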