Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture

📅 2024-10-15
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the direction sensitivity of deep neural network generalization by proposing the Geometric Invariance Hypothesis (GIH): during training, the input-space curvature remains invariant along architecture-dependent directions. To formalize this, we define the “average geometry” of the network and derive its evolution dynamics, revealing dominance by the data covariance projected onto the architecture-specific subspace. Using high-dimensional planar embeddings for nonlinear binary classification, input-output geometric quantification, and covariance projection analysis, we empirically validate GIH across ResNet and MLP architectures. We find that ResNets exhibit poor directional robustness due to low-rank average geometry, establishing a quantitative link between average geometry rank and generalization stability. Our results provide a geometric framework for understanding architecture–data co-adaptation and inform interpretable model selection.

📝 Abstract
In this paper, we propose the *geometric invariance hypothesis (GIH)*, which argues that the input-space curvature of a neural network remains invariant under transformation in certain architecture-dependent directions during training. We investigate a simple, non-linear binary classification problem residing on a plane in a high-dimensional space and observe that, unlike MLPs, ResNets fail to generalize depending on the orientation of the plane. Motivated by this example, we define a neural network's **average geometry** and **average geometry evolution** as compact *architecture-dependent* summaries of the model's input-output geometry and its evolution during training. By investigating the average geometry evolution at initialization, we discover that the geometry of a neural network evolves according to the data covariance projected onto its average geometry. This means that the geometry only changes in a subset of the input space when the average geometry is low-rank, such as in ResNets. This causes an architecture-dependent invariance property in the input-space curvature, which we dub GIH. Finally, we present extensive experimental results to observe the consequences of GIH and how it relates to generalization in neural networks.
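The projection mechanism described above can be illustrated with a small numerical sketch (this is a hypothetical toy, not the paper's implementation): if the "average geometry" is represented by a low-rank projector `P`, then an update driven by the data covariance projected onto it, `P @ Sigma @ P`, vanishes in every direction orthogonal to the projector's column space, which is the invariance the abstract describes.

```python
import numpy as np

# Toy illustration (assumed setup, not the authors' code): model the
# low-rank "average geometry" as a rank-2 orthogonal projector P, and
# the geometry update as the data covariance projected onto P.
rng = np.random.default_rng(0)
d = 6

# Rank-2 projector onto span(U), standing in for a low-rank average geometry
U, _ = np.linalg.qr(rng.standard_normal((d, 2)))
P = U @ U.T

# Full-rank empirical data covariance
X = rng.standard_normal((200, d))
Sigma = X.T @ X / 200

# Update proportional to the covariance projected onto the average geometry
update = P @ Sigma @ P

# A direction orthogonal to span(U) receives no update:
v = rng.standard_normal(d)
v -= U @ (U.T @ v)           # remove components inside span(U)
v /= np.linalg.norm(v)

print(np.allclose(update @ v, 0))  # curvature is invariant along v
```

In this toy, a full-rank projector (an MLP-like geometry in the paper's dichotomy) would let the update act on the whole input space, whereas the rank-2 projector confines all change to a 2-dimensional subspace.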
Problem

Research questions and friction points this paper is trying to address.

Explores invariance in neural network input space curvature.
Investigates generalization differences in ResNets and MLPs.
Analyzes architecture-dependent geometry evolution during training.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes geometric invariance hypothesis (GIH)
Defines average geometry and its evolution
Links geometry evolution to data covariance