Efficient Deep Learning with Decorrelated Backpropagation

📅 2024-05-03
🏛️ arXiv.org
📈 Citations: 4
Influential: 1
🤖 AI Summary
To address the slow convergence and high computational cost that input correlations impose on large-scale deep neural network (DNN) training, this paper proposes decorrelated backpropagation: a lightweight method that induces network-wide decorrelation of each layer's inputs during end-to-end training. Its core components are a learned per-layer decorrelation transform that is updated alongside the network weights, a computationally efficient update rule that keeps the overhead minimal, and seamless integration into standard training pipelines. Experiments on deep convolutional networks up to a 50-layer ResNet demonstrate a more than two-fold training speedup and higher test accuracy than standard backpropagation, with corresponding reductions in GPU-hours and carbon footprint. The authors present this as the first demonstration that network-wide input decorrelation can make training of large-scale DNNs substantially more efficient.
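The summary above turns on a single mechanism: a learned transform that decorrelates each layer's inputs while the network trains. As a rough illustration, the following minimal NumPy sketch shows one common form of such a decorrelation rule. The function name, learning rate, and exact update are assumptions for illustration; the paper's variant may differ.

```python
import numpy as np

def decorrelation_step(R, x_batch, lr=0.05):
    """One hypothetical update of a decorrelation matrix R.

    x_batch has shape (batch, features). Inputs are transformed as
    x_dec = x @ R.T, and R is nudged so that the off-diagonal
    covariance of x_dec shrinks toward zero.
    """
    x_dec = x_batch @ R.T                    # decorrelated inputs
    cov = (x_dec.T @ x_dec) / len(x_batch)   # empirical covariance
    off_diag = cov - np.diag(np.diag(cov))   # correlations only
    R = R - lr * off_diag @ R                # suppress correlations
    return R, x_dec

# Toy usage: strongly correlated 2-D inputs become decorrelated.
rng = np.random.default_rng(0)
C = np.array([[1.0, 0.9], [0.9, 1.0]])
x = rng.standard_normal((512, 2)) @ np.linalg.cholesky(C).T
R = np.eye(2)
for _ in range(200):
    R, x_dec = decorrelation_step(R, x)
print(np.round((x_dec.T @ x_dec) / len(x_dec), 3))  # near-diagonal covariance
```

In the full method, one such transform would sit in front of every layer and be trained concurrently with the ordinary backpropagation weight updates.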

📝 Abstract
The backpropagation algorithm remains the dominant and most successful method for training deep neural networks (DNNs). At the same time, training DNNs at scale comes at a significant computational cost and therefore a high carbon footprint. Converging evidence suggests that input decorrelation may speed up deep learning. However, to date, this has not yet translated into substantial improvements in training efficiency in large-scale DNNs. This is mainly caused by the challenge of enforcing fast and stable network-wide decorrelation. Here, we show for the first time that much more efficient training of deep convolutional neural networks is feasible by embracing decorrelated backpropagation as a mechanism for learning. To achieve this goal we made use of a novel algorithm which induces network-wide input decorrelation using minimal computational overhead. By combining this algorithm with careful optimizations, we achieve a more than two-fold speed-up and higher test accuracy compared to backpropagation when training several deep networks up to a 50-layer ResNet model. This demonstrates that decorrelation provides exciting prospects for efficient deep learning at scale.
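The abstract emphasizes that decorrelation is induced network-wide with minimal overhead during ordinary training. Below is a hedged sketch of how such an update could interleave with a standard SGD step, here for a single linear layer fed deliberately correlated inputs; all names, constants, and the specific rule are illustrative assumptions, not the authors' released code.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, batch = 8, 64
w_true = rng.standard_normal((1, n_in))   # target mapping to recover
W = np.zeros((1, n_in))                   # trainable weights
R = np.eye(n_in)                          # trainable decorrelator

for step in range(500):
    x = rng.standard_normal((batch, n_in))
    x[:, 1] = 0.8 * x[:, 0] + 0.2 * x[:, 1]   # inject input correlation
    y = x @ w_true.T                          # regression targets

    x_dec = x @ R.T                           # forward: decorrelate inputs
    err = x_dec @ W.T - y                     # prediction error

    # Ordinary gradient step on the weights, using decorrelated inputs.
    W -= 0.05 * (err.T @ x_dec) / batch

    # Decorrelation step on R (same rule as the earlier sketch).
    cov = (x_dec.T @ x_dec) / batch
    R -= 0.01 * (cov - np.diag(np.diag(cov))) @ R

print(float((err ** 2).mean()))  # mean squared error after joint training
```

In the full method, each layer carries its own decorrelation transform, so the two updates run side by side across the whole network rather than in a single layer as sketched here.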
Problem

Research questions and friction points this paper is trying to address.

Speeding up deep learning via input decorrelation
Reducing computational cost in large-scale DNN training
Enhancing training efficiency and accuracy in deep networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decorrelated backpropagation enhances deep learning efficiency
Novel algorithm enforces network-wide input decorrelation
Combined optimizations yield faster training and higher accuracy
Sander Dalm
Department of Machine Learning and Neural Computing, Donders Institute for Brain, Cognition and Behaviour, Thomas van Aquinostraat 4, Nijmegen, 6525GD, the Netherlands
Joshua Offergeld
Department of Machine Learning and Neural Computing, Donders Institute for Brain, Cognition and Behaviour, Thomas van Aquinostraat 4, Nijmegen, 6525GD, the Netherlands
Nasir Ahmad
Donders Institute for Brain, Cognition, and Behaviour
Machine Learning · Computational Neuroscience · Theoretical Neuroscience · Synaptic Plasticity · Neural Networks
M. van Gerven
Department of Machine Learning and Neural Computing, Donders Institute for Brain, Cognition and Behaviour, Thomas van Aquinostraat 4, Nijmegen, 6525GD, the Netherlands