Quotient Geometry, Effective Curvature, and Implicit Bias in Simple Shallow Neural Networks

📅 2026-03-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses parameter redundancy in overparameterized shallow neural networks arising from permutation and scaling symmetries of the hidden units, a redundancy that renders Euclidean geometry inadequate for characterizing the intrinsic structure of predictors. The study introduces, for the first time, a systematic differential-geometric framework on the quotient manifold induced by these symmetries. On this symmetry-reduced space, it defines non-degenerate notions of effective curvature and a reduced Hessian, and employs a horizontal/vertical decomposition to separate the gradient-flow dynamics that affect predictions from those corresponding to irrelevant gauge transformations. By treating predictor equivalence classes as the fundamental objects, the approach offers a novel perspective on implicit bias and yields a geometric description better aligned with learning dynamics. Experiments demonstrate that curvature on the quotient space organizes local dynamics more naturally and provides a concise explanation of implicit bias in underdetermined settings.
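To make the symmetries concrete, one standard formulation (notation ours, not quoted from the paper) writes the shallow ReLU network as

$$f_\theta(x) = \sum_{i=1}^{m} v_i\,\sigma(w_i^\top x), \qquad \sigma(t) = \max(t, 0).$$

Positive homogeneity of $\sigma$ and relabeling of hidden units then give, for any $\alpha_1,\dots,\alpha_m > 0$ and any permutation $\pi$ of $\{1,\dots,m\}$,

$$f_\theta = f_{\theta'}, \qquad \theta' = \big(\alpha_1 w_{\pi(1)}, \dots, \alpha_m w_{\pi(m)};\; v_{\pi(1)}/\alpha_1, \dots, v_{\pi(m)}/\alpha_m\big),$$

so the group $G = S_m \ltimes (\mathbb{R}_{>0})^m$ acts on parameters without changing the predictor, and the quotient space in question is (a regular subset of) the parameter space modulo $G$.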

📝 Abstract
Overparameterized shallow neural networks admit substantial parameter redundancy: distinct parameter vectors may represent the same predictor due to hidden-unit permutations, rescalings, and related symmetries. As a result, geometric quantities computed directly in the ambient Euclidean parameter space can reflect artifacts of representation rather than intrinsic properties of the predictor. In this paper, we develop a differential-geometric framework for analyzing simple shallow networks through the quotient space obtained by modding out parameter symmetries on a regular set. We first characterize the symmetry and quotient structure of regular shallow-network parameters and show that the finite-sample realization map induces a natural metric on the quotient manifold. This leads to an effective notion of curvature that removes degeneracy along symmetry orbits and yields a symmetry-reduced Hessian capturing intrinsic local geometry. We then study gradient flows on the quotient and show that only the horizontal component of parameter motion contributes to first-order predictor evolution, while the vertical component corresponds purely to gauge variation. Finally, we formulate an implicit-bias viewpoint at the quotient level, arguing that meaningful complexity should be assigned to predictor classes rather than to individual parameter representatives. Our experiments confirm that ambient flatness is representation-dependent, that local dynamics are better organized by quotient-level curvature summaries, and that in underdetermined regimes, implicit bias is most naturally described in quotient coordinates.
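The horizontal/vertical decomposition described in the abstract can be illustrated with a short sketch. The following NumPy snippet is ours, not the authors' code: the paper's construction uses the metric induced by the finite-sample realization map, while this sketch assumes the ambient Euclidean inner product for simplicity. It removes the rescaling gauge direction of each hidden unit from the loss gradient; permutations are discrete and so contribute no tangent directions.

```python
import numpy as np

# Minimal sketch (ours, not the paper's code): projecting a Euclidean gradient
# onto the horizontal space of a shallow ReLU network
#   f(x) = sum_i v_i * relu(w_i . x).
# The per-unit rescaling (w_i, v_i) -> (a * w_i, v_i / a), a > 0, leaves f
# unchanged; differentiating the orbit at a = 1 gives the vertical (gauge)
# direction u_i = (w_i, -v_i).

rng = np.random.default_rng(0)
m, d, n = 4, 3, 16                       # hidden units, input dim, samples
W = rng.normal(size=(m, d))              # hidden-layer weights, one row per unit
v = rng.normal(size=m)                   # output weights
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

def grads(W, v, X, y):
    """Euclidean gradient of the squared loss 0.5 * ||f(X) - y||^2."""
    H = np.maximum(X @ W.T, 0.0)         # (n, m) hidden activations
    r = H @ v - y                        # residuals
    gv = H.T @ r                         # dL/dv
    gW = ((X @ W.T > 0.0) * r[:, None] * v[None, :]).T @ X   # dL/dW
    return gW, gv

gW, gv = grads(W, v, X, y)

# Remove the vertical component unit by unit. Each u_i lives in its own
# coordinate block (w_i, v_i), so the u_i are mutually orthogonal and the
# projection decouples across units (assuming the Euclidean inner product).
for i in range(m):
    u_w, u_v = W[i], -v[i]
    coef = (gW[i] @ u_w + gv[i] * u_v) / (u_w @ u_w + u_v * u_v)
    gW[i] -= coef * u_w
    gv[i] -= coef * u_v
```

Flowing along the removed direction changes only the parameter representative, not the predictor, so the projected gradient carries exactly the first-order predictor evolution that the abstract attributes to the horizontal component.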
Problem

Research questions and friction points this paper is trying to address.

overparameterization
parameter redundancy
symmetry
quotient geometry
implicit bias
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quotient Geometry
Effective Curvature
Implicit Bias
Parameter Symmetry
Horizontal Gradient Flow
Hang-Cheng Dong
PhD Student, Harbin Institute of Technology
Pengcheng Cheng
School of Mathematics, Jilin University