🤖 AI Summary
This study investigates the connectedness and singularities of the parameter space of feed-forward ReLU networks. Under gradient flow, the parameters are confined to an algebraic variety induced by the homogeneity of the ReLU activation. For networks built on general directed acyclic graph (DAG) architectures, the work extends previous results with a thorough characterization of when this parameter space is connected, highlighting the roles of bottleneck nodes and of balance conditions on specific subsets of the network. It shows that singularities are closely tied to the topology of the underlying DAG and its induced sub-networks, discusses whether these singularities can be reached, and establishes a principled connection with differentiable pruning. Simple numerical experiments support the theory.
📝 Abstract
Understanding the parameter space of feed-forward ReLU networks is critical for analyzing and guiding training dynamics. After initialization, training under gradient flow restricts the parameters to an algebraic variety that arises from the homogeneity of the ReLU activation function. In this study, we examine two key questions for feed-forward ReLU networks built on general directed acyclic graph (DAG) architectures: the (dis)connectedness of this parameter space and the existence of singularities within it. We extend previous results by providing a thorough characterization of connectedness, highlighting the roles of bottleneck nodes and of balance conditions associated with specific subsets of the network. We show that singularities are closely tied to the topology of the underlying DAG and its induced sub-networks. We discuss the reachability of these singularities and establish a principled connection with differentiable pruning. We validate our theory with simple numerical experiments.
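To make the role of ReLU homogeneity concrete, the following is a minimal sketch under stated assumptions, not the paper's construction. It assumes a two-layer network with a squared loss, whereas the paper treats general DAGs, and every name in it (`loss_and_grads`, `balance_gap`, the layer sizes) is hypothetical. It checks two standard facts numerically: rescaling a hidden neuron's incoming weights by c > 0 and its outgoing weights by 1/c leaves the loss unchanged, and for each hidden neuron the inner product of incoming parameters with their gradient equals the corresponding inner product for outgoing weights. The second identity implies that, under gradient flow, the difference between the squared norms of incoming and outgoing weights stays constant, which is one simple instance of a balance condition.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def loss_and_grads(W1, b1, W2, X, Y):
    """Mean squared loss of f(x) = W2 @ relu(W1 @ x + b1), with manual backprop."""
    n = X.shape[0]
    Z = X @ W1.T + b1            # pre-activations, shape (n, hidden)
    H = relu(Z)                  # hidden activations
    E = H @ W2.T - Y             # residuals, shape (n, out)
    loss = 0.5 * np.sum(E ** 2) / n
    gW2 = E.T @ H / n            # gradient w.r.t. W2, shape (out, hidden)
    dZ = (E @ W2) * (Z > 0)      # backprop through ReLU
    gW1 = dZ.T @ X / n           # gradient w.r.t. W1, shape (hidden, in)
    gb1 = dZ.sum(axis=0) / n     # gradient w.r.t. b1, shape (hidden,)
    return loss, gW1, gb1, gW2

# Illustrative sizes and random data (assumptions, not from the paper).
rng = np.random.default_rng(0)
n, d_in, d_hidden, d_out = 32, 3, 5, 2
X = rng.normal(size=(n, d_in))
Y = rng.normal(size=(n, d_out))
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_out, d_hidden))

loss, gW1, gb1, gW2 = loss_and_grads(W1, b1, W2, X, Y)

# Per-neuron identity: <incoming params, their gradient> == <outgoing weights, their gradient>.
incoming = np.sum(W1 * gW1, axis=1) + b1 * gb1
outgoing = np.sum(W2 * gW2, axis=0)
balance_gap = np.max(np.abs(incoming - outgoing))

# Rescaling symmetry: scale neuron j's incoming parameters by c[j] and outgoing weights by 1 / c[j].
c = rng.uniform(0.5, 2.0, size=d_hidden)
loss_rescaled, *_ = loss_and_grads(W1 * c[:, None], b1 * c, W2 / c, X, Y)

print("max balance gap:", balance_gap)
print("loss change under rescaling:", abs(loss - loss_rescaled))
```

Both printed quantities should be zero up to floating-point error. The check only covers exact gradients at a single point; discrete gradient descent with a finite step size preserves the balance quantity only approximately.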