🤖 AI Summary
To address the bottlenecks in deep learning—namely, excessive training iterations and heavy reliance on loss-function design and learning-rate hyperparameter tuning—this paper proposes the Expectation Reflection (ER) method. ER departs from conventional gradient-descent paradigms by introducing a multiplicative weight update mechanism grounded solely in output ratios, eliminating the need for explicit loss functions or learning-rate hyperparameters. Crucially, ER converges to the theoretically optimal weights within a single forward–reflection iteration. Its key contributions are threefold: (1) it achieves, for the first time, provably optimal convergence in one iteration; (2) it reformulates target propagation’s inverse mapping as a differentiable reflection operation, seamlessly integrating it into the optimization framework; and (3) it presents the first multiplicative optimization algorithm applicable to multilayer neural networks. Empirical evaluation on image classification tasks demonstrates that ER substantially accelerates training, entirely obviates hyperparameter tuning, and surpasses standard backpropagation in convergence efficiency.
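The mechanism described above can be sketched for the simplest possible case: a single linear layer trained on noiseless data. The function and variable names below are illustrative, not the paper's, and the reflection is reduced to its identity-activation special case, where multiplying each pre-activation by the observed/predicted ratio and re-fitting by least squares recovers the teacher weights in one step. This is a minimal sketch under those assumptions, not the authors' full algorithm.

```python
import numpy as np

# Hedged sketch of an Expectation Reflection (ER)-style multiplicative
# update for a single linear layer y_hat = X @ w. Names are illustrative;
# the paper's exact formulation may differ.
def er_step(X, y, w, eps=1e-12):
    """One forward-reflection step: multiply each pre-activation by the
    observed/predicted output ratio, then re-fit w in closed form.
    Note that no loss function or learning rate appears anywhere."""
    h = X @ w                                   # forward pass (predictions)
    safe_h = np.where(np.abs(h) < eps, eps, h)  # guard against divide-by-zero
    ratio = y / safe_h                          # observed / predicted ratio
    h_target = h * ratio                        # multiplicative "reflection"
    # Fit w to the reflected targets by least squares (closed form).
    w_new, *_ = np.linalg.lstsq(X, h_target, rcond=None)
    return w_new

# For a noiseless linear teacher, a single reflection recovers the teacher
# weights exactly, mirroring the one-iteration convergence claim.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true
w = er_step(X, y, rng.normal(size=5))
print(np.allclose(w, w_true))  # True in this noiseless linear setting
```

In this special case the reflected targets collapse to the observed outputs themselves, so the step reduces to one least-squares solve; the paper's contribution is extending this ratio-based update to nonlinear, multilayer networks via a differentiable reflection.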
📝 Abstract
Efficient training of artificial neural networks remains a key challenge in deep learning. Backpropagation (BP), the standard learning algorithm, relies on gradient descent and typically requires numerous iterations for convergence. In this study, we introduce Expectation Reflection (ER), a novel learning approach that updates weights multiplicatively based on the ratio of observed to predicted outputs. Unlike traditional methods, ER maintains consistency without requiring ad hoc loss functions or learning-rate hyperparameters. We extend ER to multilayer networks and demonstrate its effectiveness on image classification tasks. Notably, ER achieves optimal weight updates in a single iteration. Additionally, we reinterpret ER as a modified form of gradient descent that incorporates the inverse mapping of target propagation. These findings suggest that ER provides an efficient and scalable alternative for training neural networks.