🤖 AI Summary
Deep predictive coding (PC) networks suffer from exponential gradient decay with depth, which causes training to fail in deep architectures. To address this, we propose the Error Optimization (EO) framework, which treats prediction errors as the fundamental optimization variables and introduces an error-space reparameterization that propagates error signals to all layers without loss. This preserves the theoretical completeness of PC while eliminating signal attenuation entirely. EO achieves, for the first time, stable convergence in deep PC networks, accelerating training by over two orders of magnitude and supporting scaling to arbitrary depth. Across multiple architectures and standard benchmarks, EO matches backpropagation in accuracy while substantially outperforming conventional PC methods, breaking the long-standing depth barrier that has hindered the practical deployment of PC models.
📝 Abstract
Predictive Coding (PC) offers a biologically plausible alternative to backpropagation for neural network training, yet struggles with deeper architectures. This paper identifies the root cause: an inherent signal decay problem where gradients attenuate exponentially with depth, becoming computationally negligible due to numerical precision constraints. To address this fundamental limitation, we introduce Error Optimization (EO), a novel reparameterization that preserves PC's theoretical properties while eliminating signal decay. By optimizing over prediction errors rather than states, EO enables signals to reach all layers simultaneously and without attenuation, converging orders of magnitude faster than standard PC. Experiments across multiple architectures and datasets demonstrate that EO matches backpropagation's performance even for deeper models where conventional PC fails to train. Beyond these practical improvements, our work provides theoretical insight into PC dynamics and establishes a foundation for scaling biologically inspired learning to deeper architectures on digital hardware and beyond.
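The reparameterization idea can be illustrated with a minimal sketch. This is not the paper's exact formulation: it assumes linear layer predictions `x_l = W_l @ x_{l-1}` and invents all names (`states_from_errors`, `grads`, etc.) for illustration. The key point it demonstrates is that once errors `e_l = x_l - W_l x_{l-1}` are the optimization variables, the states are a deterministic function of the errors, so a single backward pass delivers the exact output-error signal to every layer's variable, rather than relying on PC's iterative layer-by-layer relaxation.

```python
import numpy as np

# Illustrative error-space reparameterization with linear predictions
# x_l = W_l x_{l-1}; error variables e_l = x_l - W_l x_{l-1}.
rng = np.random.default_rng(0)
L, d = 8, 4                                     # depth and width (toy sizes)
Ws = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(L)]
x0 = rng.standard_normal(d)                     # clamped input
y = rng.standard_normal(d)                      # clamped target

def states_from_errors(e):
    # Errors determine states exactly: x_l = W_l x_{l-1} + e_l.
    xs = [x0]
    for W, el in zip(Ws, e):
        xs.append(W @ xs[-1] + el)
    return xs

def loss(e):
    # Squared mismatch at the clamped output layer.
    return 0.5 * np.sum((states_from_errors(e)[-1] - y) ** 2)

def grads(e):
    # One backward pass: every e_l receives the output signal directly,
    # instead of waiting many local relaxation steps as in standard PC.
    g = states_from_errors(e)[-1] - y
    out = []
    for W in reversed(Ws):
        out.append(g)       # dL/de_l is the signal arriving at layer l
        g = W.T @ g         # propagate the signal to the layer below
    return out[::-1]        # gradients ordered e_1 .. e_L

e = [np.zeros(d) for _ in range(L)]
g = grads(e)                # exact gradients for all L layers at once
```

The gradients can be checked against finite differences of `loss`, which confirms that each error variable sees the true output signal in a single pass.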