Unbiased Approximate Vector-Jacobian Products for Efficient Backpropagation

📅 2026-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a randomized, unbiased vector-Jacobian product (VJP) approximation method with minimal variance to replace exact computations in backpropagation, aiming to reduce the computational and memory costs of training deep neural networks. The approach achieves theoretically optimal estimation under sparsity constraints and establishes a principled trade-off between approximation accuracy and per-iteration training cost. Empirical evaluations on multilayer perceptrons, BagNets, and Vision Transformers demonstrate that the method substantially lowers training overhead while leaving model accuracy almost unchanged.

📝 Abstract
In this work we introduce methods to reduce the computational and memory costs of training deep neural networks. Our approach consists of replacing exact vector-Jacobian products with randomized, unbiased approximations thereof during backpropagation. We provide a theoretical analysis of the trade-off between the number of epochs needed to achieve a target precision and the cost reduction for each epoch. We then identify specific unbiased estimates of vector-Jacobian products for which we establish desirable optimality properties of minimal variance under sparsity constraints. Finally, we provide in-depth experiments on multi-layer perceptron, BagNet, and Vision Transformer architectures. These validate our theoretical results and confirm the potential of our proposed unbiased randomized backpropagation approach for reducing the cost of deep learning.
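The core idea, a randomized, unbiased, sparse estimate of a vector, can be illustrated with a minimal NumPy sketch. This is not the paper's estimator: the function name, the magnitude-proportional keep-probabilities, and the expected-sparsity budget are illustrative assumptions; they merely show how inverse-probability weighting keeps a sparsified vector unbiased in expectation.

```python
import numpy as np

def sparse_unbiased_estimate(g, budget, rng):
    """Illustrative sketch: randomly sparsify g while keeping E[estimate] = g.

    Each coordinate i is kept independently with probability p_i proportional
    to |g_i| (capped at 1), so large entries survive more often; `budget` is
    the expected number of nonzeros. Kept entries are rescaled by 1/p_i
    (inverse-probability weighting), which makes the estimate unbiased.
    """
    p = np.minimum(1.0, budget * np.abs(g) / np.sum(np.abs(g)))
    keep = rng.random(g.shape) < p
    est = np.zeros_like(g)
    est[keep] = g[keep] / p[keep]  # rescaling restores the expectation
    return est

rng = np.random.default_rng(0)
g = np.array([4.0, -2.0, 1.0, 0.5, 0.25])  # stand-in for a dense VJP

# Averaging many independent sparse estimates recovers g (unbiasedness).
mean_est = np.mean(
    [sparse_unbiased_estimate(g, budget=2, rng=rng) for _ in range(20000)],
    axis=0,
)
```

Each individual estimate has at most about `budget` nonzero entries, which is where the per-iteration savings would come from; the price is added variance, matching the epochs-versus-cost trade-off the abstract analyzes.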
Problem

Research questions and friction points this paper is trying to address.

backpropagation
vector-Jacobian products
computational cost
memory efficiency
deep neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

unbiased approximation
vector-Jacobian products
randomized backpropagation
variance minimization
sparse gradients