🤖 AI Summary
This work investigates the dynamics of homogeneous neural networks trained by gradient flow from small initialization, focusing on the geometry of the trajectory immediately after it first escapes the origin and on the mechanism by which weight sparsity is preserved.
Method: Combining homogeneous function theory with nonlinear dynamical-systems analysis, we study the post-escape trajectory of gradient flow for homogeneous networks with locally Lipschitz gradients.
Contribution/Results: We establish the existence of a saddle point that gradient flow necessarily encounters along its escape path and rigorously characterize the saddle's local stable and unstable manifolds, giving the first precise geometric description of the saddle points arising in the post-escape regime of homogeneous networks. Under broad conditions, we further prove that the sparsity pattern formed among the weights before the escape persists throughout the escape phase and until this next saddle point is reached. Together, these results provide a theoretical foundation for understanding structural evolution and critical-point traversal along the optimization trajectories of deep networks.
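To fix the terminology used above, the sketch below records the standard setting and definitions in our own notation; the symbols ($\mathcal{L}$, $\theta$, $L$, $\phi_t$) are our choices, not the paper's.

```latex
% Our notation, not the paper's. Training follows the gradient flow
\[
  \dot{\theta}(t) = -\nabla \mathcal{L}\bigl(\theta(t)\bigr),
  \qquad \theta(0) = \epsilon\,\theta_0, \quad 0 < \epsilon \ll 1,
\]
% for a network that is L-homogeneous in its weights:
\[
  f(c\,\theta; x) = c^{L} f(\theta; x), \qquad c > 0.
\]
% A saddle point \theta_s is a critical point, \nabla\mathcal{L}(\theta_s) = 0,
% that is not a local extremum. Writing \phi_t for the flow map, its local
% stable and unstable manifolds (restricted to a neighborhood of \theta_s) are
\[
  W^{s}_{\mathrm{loc}}(\theta_s)
    = \{\theta : \phi_t(\theta) \to \theta_s \text{ as } t \to +\infty\},
  \qquad
  W^{u}_{\mathrm{loc}}(\theta_s)
    = \{\theta : \phi_t(\theta) \to \theta_s \text{ as } t \to -\infty\}.
\]
```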
📝 Abstract
Recent works exploring the training dynamics of homogeneous neural network weights under gradient flow with small initialization have established that, in the early stages of training, the weights remain small and near the origin while converging in direction. Building on this, the present paper studies the gradient flow dynamics of homogeneous neural networks with locally Lipschitz gradients after they escape the origin. Insights gained from this analysis are used to characterize the first saddle point encountered by gradient flow after the escape. It is also shown that, for homogeneous feed-forward neural networks, under certain conditions, the sparsity structure that emerges among the weights before the escape is preserved after escaping the origin and until the next saddle point is reached.
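As a purely illustrative sketch of this sparsity-preservation phenomenon (our own toy construction, not the paper's setup), the snippet below runs a forward-Euler approximation of gradient flow on a 2-homogeneous diagonal network $f(x) = \langle u \odot u - v \odot v,\, x\rangle$ with a small balanced initialization; the data, model, and step size are assumptions chosen for the demonstration.

```python
# Toy illustration (our assumptions, not the paper's construction):
# forward-Euler approximation of gradient flow on the 2-homogeneous
# diagonal network f(x) = <u*u - v*v, x>, squared loss, small init.
import numpy as np

rng = np.random.default_rng(0)
d, n = 6, 50
X = rng.standard_normal((n, d))
beta_true = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0])  # sparse teacher
y = X @ beta_true

eps = 1e-3                 # small initialization scale
u = eps * np.ones(d)
v = eps * np.ones(d)
lr = 1e-3                  # Euler step standing in for continuous time

for step in range(10_001):
    beta = u * u - v * v                    # effective linear predictor
    g = X.T @ (X @ beta - y) / n            # gradient of the loss wrt beta
    # chain rule through the squared parametrization:
    # dL/du = 2*g*u, dL/dv = -2*g*v
    u, v = u - lr * 2 * g * u, v + lr * 2 * g * v
    if step % 2_000 == 0:
        print(step, np.round(beta, 4))

# Coordinates on the support of beta_true escape the origin and grow,
# while the off-support coordinates remain at the O(eps^2) scale: the
# sparsity pattern formed before the escape persists after it.
```

The squared parametrization keeps the toy model 2-homogeneous with locally Lipschitz gradients, loosely mirroring the paper's assumptions; in the printed output, only the two coordinates on the teacher's support grow away from the origin, while the others stay near the initialization scale.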