🤖 AI Summary
This work investigates the geometric nature of emergent subnetworks and their bias in deep neural networks. Method: adopting an algebraic-geometric perspective, the authors analyze the singular structure of the neuromanifold, the function space parameterized by fully connected networks with polynomial activations. Contribution/Results: they compute the dimension of the subspace of the neuromanifold parameterized by subnetworks, prove that this subspace is singular, and argue that such singularities often correspond to critical points of the training dynamics. They further show that while subnetworks and singularities are similarly related in convolutional networks, the associated bias does not arise there, revealing a geometric distinction between the two architectures. These results provide a theoretical framework for understanding how intrinsic geometric constraints shape neural network optimization.
📝 Abstract
Deep neural networks often infer sparse representations, converging to a subnetwork during the learning process. In this work, we theoretically analyze subnetworks and their bias through the lens of algebraic geometry. We consider fully connected networks with polynomial activation functions and focus on the geometry of the function space they parametrize, often referred to as the neuromanifold. First, we compute the dimension of the subspace of the neuromanifold parametrized by subnetworks. Second, we show that this subspace is singular. Third, we argue that such singularities often correspond to critical points of the training dynamics. Lastly, we discuss convolutional networks, for which subnetworks and singularities are similarly related, but the bias does not arise.
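To make the setting concrete, here is a minimal sketch (not the paper's construction; dimensions, names, and the choice of activation are our own illustrative assumptions). A one-hidden-layer network with the polynomial activation σ(t) = t² computes f(x) = Σⱼ cⱼ (wⱼ·x)², i.e. the quadratic form xᵀMx with M = Σⱼ cⱼ wⱼwⱼᵀ. The neuromanifold is then a set of symmetric matrices, and pruning a neuron (passing to a subnetwork) drops the rank of M, landing in a lower-dimensional locus.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 4, 3  # input dimension and hidden width (illustrative values)

W = rng.standard_normal((h, d))  # hidden-layer weights w_j
c = rng.standard_normal(h)       # output-layer weights c_j

def network_matrix(W, c):
    """Symmetric matrix M such that the network computes f(x) = x^T M x."""
    return sum(cj * np.outer(wj, wj) for cj, wj in zip(c, W))

M_full = network_matrix(W, c)
M_sub = network_matrix(W[:-1], c[:-1])  # subnetwork: last hidden neuron pruned

# A generic width-h network reaches rank h, while the subnetwork lies in
# the lower-rank locus of the space of quadratic forms.
print(np.linalg.matrix_rank(M_full))  # generically h = 3
print(np.linalg.matrix_rank(M_sub))   # generically h - 1 = 2
```

For quadratic activations this lower-rank locus is a classical determinantal set, which is singular along matrices of strictly smaller rank; this is the kind of geometric picture the paper generalizes to deeper polynomial networks.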