A Unified Theory of Quantum Neural Network Loss Landscapes

📅 2024-08-21
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Quantum neural networks (QNNs) do not exhibit Gaussian process behavior under random initialization, impeding a unified theoretical characterization of their training dynamics and generalization. Method: We establish a unified analytical framework for QNN loss landscapes, leveraging random matrix theory and quantum circuit algebra to derive exact analytical distributions of gradients and local minima. Contribution/Results: First, we prove that QNN initialization follows a Wishart process—not a Gaussian process—and rigorously characterize the necessary and sufficient conditions for convergence to the Gaussian limit. Second, we introduce a physically measurable trainability criterion centered on “degrees of freedom,” quantitatively linking architectural algebraic properties to optimization hardness. Third, we unify and extend the barren plateau phenomenon, yielding experimentally verifiable theoretical guidance and practical design metrics for QNNs.

📝 Abstract
Classical neural networks with random initialization famously behave as Gaussian processes in the limit of many neurons, which allows one to completely characterize their training and generalization behavior. No such general understanding exists for quantum neural networks (QNNs), which -- outside of certain special cases -- are known to not behave as Gaussian processes when randomly initialized. We here prove that QNNs and their first two derivatives instead generally form what we call "Wishart processes," where certain algebraic properties of the network determine the hyperparameters of the process. This Wishart process description allows us to, for the first time: give necessary and sufficient conditions for a QNN architecture to have a Gaussian process limit; calculate the full gradient distribution, generalizing previously known barren plateau results; and calculate the local minima distribution of algebraically constrained QNNs. Our unified framework suggests a certain simple operational definition for the "trainability" of a given QNN model using a newly introduced, experimentally accessible quantity we call the "degrees of freedom" of the network architecture.
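The abstract's central claim is that randomly initialized QNNs follow Wishart, not Gaussian, statistics, converging to the Gaussian limit only under certain conditions. A minimal numerical sketch of the underlying intuition, using the one-dimensional analogue of a Wishart matrix entry (a chi-square variable): as the number of summed squared Gaussians grows, the standardized distribution loses its skew and approaches a Gaussian. The helper name `wishart_like_samples` is hypothetical and this is not the paper's construction, only an illustration of the limit behavior.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def wishart_like_samples(dof, n_samples=50_000):
    """Sample the 1-D analogue of a Wishart entry: a sum of `dof`
    squared standard Gaussians (chi-square), standardized to
    mean 0 and variance 1 so distributions are comparable."""
    x = rng.standard_normal((n_samples, dof))
    s = (x ** 2).sum(axis=1)
    return (s - dof) / np.sqrt(2 * dof)

# Skewness shrinks toward 0 (the Gaussian value) as dof grows,
# mirroring the paper's convergence-to-Gaussian-limit picture.
for dof in (2, 8, 128):
    skew = stats.skew(wishart_like_samples(dof))
    print(f"dof={dof:4d}  skewness={skew:+.3f}")
```

For a standardized chi-square the skewness is sqrt(8/dof), so at small degrees of freedom the distribution is strongly non-Gaussian; this is the regime in which a Gaussian process description of the network would fail.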
Problem

Research questions and friction points this paper is trying to address.

How to characterize randomly initialized QNNs, which generally lack a Gaussian process limit
When a QNN architecture does converge to a Gaussian process
How to define an operational, measurable notion of QNN trainability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Wishart processes for QNNs
Defines QNN trainability via degrees of freedom
Derives full gradient and local-minima distributions, generalizing barren plateau results