🤖 AI Summary
Gradient-based optimization in deep neural networks (DNNs) often disrupts weight structure, degrading feature clarity and impairing learning dynamics. To address this, we propose Eigen Neural Networks (ENN), the first architecture to introduce a learnable, layer-shared orthogonal feature basis for weight reparameterization—enabling weight decorrelation and dynamic alignment at the representation level, thereby mitigating intrinsic limitations of backpropagation (BP). Building upon ENN, we further design ENN-ℓ, a parallelized local learning algorithm that eliminates the need for global BP. Experiments demonstrate that ENN achieves state-of-the-art performance on ImageNet classification, sets new benchmarks in cross-modal image–text retrieval, and accelerates training by over 2× while attaining higher accuracy than end-to-end BP.
📝 Abstract
The remarkable success of Deep Neural Networks (DNNs) is driven by gradient-based optimization, yet this process is often undermined by its tendency to produce disordered weight structures, which harms feature clarity and degrades learning dynamics. To address this fundamental representational flaw, we introduce the Eigen Neural Network (ENN), a novel architecture that reparameterizes each layer's weights in a layer-shared, learned orthonormal eigenbasis. This design enforces decorrelated, well-aligned weight dynamics axiomatically, rather than through regularization, leading to more structured and discriminative feature representations. When integrated with standard backpropagation (BP), ENN consistently outperforms state-of-the-art methods on large-scale image classification benchmarks, including ImageNet, and its superior representations generalize to set a new benchmark in cross-modal image-text retrieval. Furthermore, ENN's principled structure enables a highly efficient, backpropagation-free (BP-free) local learning variant, ENN-$\ell$. This variant not only resolves BP's procedural bottlenecks to achieve over 2$\times$ training speedup via parallelism, but also, remarkably, surpasses the accuracy of end-to-end backpropagation. ENN thus presents a new architectural paradigm that directly remedies the representational deficiencies of BP, leading to enhanced performance and enabling a more efficient, parallelizable training regime.
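To make the core idea concrete, here is a minimal NumPy sketch of the eigenbasis reparameterization the abstract describes. All names, shapes, and the diagonal-coefficient parameterization are my assumptions for illustration, not the authors' implementation: each layer's weight is expressed in a single orthonormal basis `B` shared across layers, `W_l = B @ diag(s_l) @ B.T`, so the weight "modes" are decorrelated by construction rather than via a regularizer.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # feature dimension (hypothetical)

# A learned shared basis would be optimized during training; here we just
# orthonormalize a random matrix via QR to obtain a stand-in orthonormal B.
B, _ = np.linalg.qr(rng.standard_normal((d, d)))

def make_layer_weight(coeffs):
    """Build one layer's weight from per-mode coefficients in the shared basis B."""
    return B @ np.diag(coeffs) @ B.T

# Two layers sharing the same basis, each with its own coefficients.
W1 = make_layer_weight(rng.standard_normal(d))
W2 = make_layer_weight(rng.standard_normal(d))

# The basis is orthonormal, so B.T @ B is the identity ...
assert np.allclose(B.T @ B, np.eye(d), atol=1e-8)
# ... and each layer's weight is exactly diagonal when viewed in that basis,
# i.e. its modes are decorrelated by construction.
for W in (W1, W2):
    S = B.T @ W @ B
    assert np.allclose(S, np.diag(np.diag(S)), atol=1e-8)
```

In a real ENN layer the basis and coefficients would both be trainable parameters; this sketch only demonstrates why the reparameterization yields decorrelated weights axiomatically, independent of any penalty term.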