Embedding principle of homogeneous neural network for classification problem

📅 2025-05-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the KKT solution structure of maximum-margin problems for homogeneous neural networks in classification tasks, and how that structure is inherited under width expansion. The guiding question: how does the KKT solution set, and specifically the maximum-margin direction, evolve as network width increases? Method: We introduce and rigorously prove that KKT points of a narrower network's max-margin problem can be isometrically and linearly embedded into the KKT point set of a wider network constructed via neuron splitting, preserving the maximum-margin direction. Our analysis integrates KKT condition theory, gradient flow differential equations, and optimization landscape modeling for homogeneous networks. Contribution/Results: We establish, for the first time, a consistent mapping between static KKT solutions and dynamic training trajectories, including the directional structure of ω-limit sets, valid for both two-layer and deep homogeneous networks. Crucially, neuron splitting preserves the directional alignment of margin-maximizing solutions under width expansion, revealing an intrinsic inheritance mechanism in the solution space during architectural scaling.

📝 Abstract
Understanding the convergence points and optimization landscape of neural networks is crucial, particularly for homogeneous networks where Karush-Kuhn-Tucker (KKT) points of the associated maximum-margin problem often characterize solutions. This paper investigates the relationship between such KKT points across networks of different widths generated via neuron splitting. We introduce and formalize the KKT point embedding principle, establishing that KKT points of a homogeneous network's max-margin problem ($P_{\Phi}$) can be embedded into the KKT points of a larger network's problem ($P_{\tilde{\Phi}}$) via specific linear isometric transformations corresponding to neuron splitting. We rigorously prove this principle holds for neuron splitting in both two-layer and deep homogeneous networks. Furthermore, we connect this static embedding to the dynamics of gradient flow training with smooth losses. We demonstrate that trajectories initiated from appropriately mapped points remain mapped throughout training and that the resulting $\omega$-limit sets of directions are correspondingly mapped ($T(L(\boldsymbol{\theta}(0))) = L(\boldsymbol{\beta}(0))$), thereby preserving the alignment with KKT directions dynamically when directional convergence occurs. Our findings offer insights into the effects of network width, parameter redundancy, and the structural connections between solutions found via optimization in homogeneous networks of varying sizes.
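The linear isometric transformation described in the abstract can be illustrated with a minimal sketch. Assuming a two-layer ReLU network $f(x) = \sum_k a_k\,\mathrm{relu}(w_k \cdot x)$, one splitting map consistent with the description duplicates a neuron and scales both its incoming and outgoing weights by $1/\sqrt{2}$; by positive homogeneity of ReLU this preserves the network function, and it also preserves the Euclidean parameter norm. This is an illustrative construction, not necessarily the paper's exact transformation:

```python
# Sketch: neuron splitting as a linear isometric embedding for a
# two-layer ReLU network f(x) = sum_k a_k * relu(w_k . x).
# Splitting neuron k into two copies, each with weights scaled by
# 1/sqrt(2), preserves both the network function and the parameter norm.
import math

def relu(z):
    return max(z, 0.0)

def forward(params, x):
    """params: list of (a_k, w_k) pairs; x and each w_k are lists of floats."""
    return sum(a * relu(sum(wi * xi for wi, xi in zip(w, x)))
               for a, w in params)

def split_neuron(params, k):
    """Replace neuron k by two identical copies scaled by 1/sqrt(2)."""
    s = 1.0 / math.sqrt(2.0)
    a, w = params[k]
    copy = (s * a, [s * wi for wi in w])
    return params[:k] + [copy, copy] + params[k + 1:]

def param_norm(params):
    return math.sqrt(sum(a * a + sum(wi * wi for wi in w)
                         for a, w in params))

narrow = [(1.5, [0.3, -0.7]), (-0.8, [1.1, 0.4])]
wide = split_neuron(narrow, 0)   # three neurons, same function
x = [0.9, -0.2]
# forward(narrow, x) == forward(wide, x): function preserved
# param_norm(narrow) == param_norm(wide): the map is isometric
```

The map is linear in the original parameters, so it is exactly the kind of linear isometric transformation the abstract refers to.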
Problem

Research questions and friction points this paper is trying to address.

Understand KKT points in homogeneous neural networks
Prove KKT point embedding via neuron splitting
Link static embedding to gradient flow dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

KKT point embedding principle for homogeneous networks
Linear isometric transformations via neuron splitting
Dynamic alignment with KKT directions during training
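The dynamic claim, that trajectories started from mapped points remain mapped throughout training, can be checked numerically in a toy setting. The sketch below uses assumed simplifications (scalar inputs, exponential loss, plain gradient descent as a proxy for gradient flow): with the $1/\sqrt{2}$ neuron-splitting map $T$, the gradient of the wide network at $T(\theta)$ equals $T$ applied to the narrow gradient, so gradient descent commutes with $T$ and $\beta_k = T(\theta_k)$ holds at every step:

```python
# Toy check: gradient descent commutes with the 1/sqrt(2) splitting map,
# so the wide trajectory stays the image of the narrow one.
import math

def relu(z):
    return max(z, 0.0)

def drelu(z):
    return 1.0 if z > 0 else 0.0

def forward(params, x):
    # Scalar-input two-layer ReLU net: f(x) = sum_k a_k * relu(w_k * x).
    return sum(a * relu(w * x) for a, w in params)

def gd_step(params, data, lr):
    # One gradient-descent step on the exponential loss sum_i exp(-y_i f(x_i)).
    grads = []
    for a, w in params:
        ga = gw = 0.0
        for x, y in data:
            m = -y * math.exp(-y * forward(params, x))  # dL/df at (x, y)
            ga += m * relu(w * x)
            gw += m * a * drelu(w * x) * x
        grads.append((ga, gw))
    return [(a - lr * ga, w - lr * gw)
            for (a, w), (ga, gw) in zip(params, grads)]

def split(params, k):
    # Linear isometric splitting of neuron k (scale both weights by 1/sqrt(2)).
    s = 1.0 / math.sqrt(2.0)
    a, w = params[k]
    return params[:k] + [(s * a, s * w), (s * a, s * w)] + params[k + 1:]

data = [(1.0, 1), (-0.5, -1)]
theta = [(0.6, 0.9), (-0.4, 1.2)]   # narrow network, 2 neurons
beta = split(theta, 0)              # wide network, 3 neurons
for _ in range(50):
    theta = gd_step(theta, data, lr=0.1)
    beta = gd_step(beta, data, lr=0.1)
# After training, beta still equals split(theta, 0) up to float error,
# and both networks compute the same function.
```

This is only a discrete-time toy analogue of the paper's gradient-flow result, but it makes the "trajectories remain mapped" statement concrete.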
Jiahan Zhang
School of Mathematical Sciences, Shanghai Jiao Tong University
Tao Luo
School of Mathematical Sciences, Shanghai Jiao Tong University; Institute of Natural Sciences, MOE-LSC, Shanghai Jiao Tong University, Shanghai, 200240, China; CMA-Shanghai, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
Yaoyu Zhang
Shanghai Jiao Tong University
Deep Learning Theory