🤖 AI Summary
This work establishes the first theoretical connection between predictive coding (PC), a biologically plausible local learning rule, and the minimum description length (MDL) principle for joint optimization of empirical risk and model complexity. Methodologically, the authors show that deep PC training corresponds to block-coordinate descent on a two-part code objective function. They derive the first generalization bound for PC, $R(\theta) \leq \hat{R}(\theta) + L(\theta)/N$, where the compression term $L(\theta)$ arises from a prefix-code prior; prove that each PC update strictly reduces the two-part code length; and establish convergence guarantees to block-coordinate stationary points. Compared to unconstrained gradient descent, PC yields a tighter generalization bound and provably balances fidelity and parsimony. These results position PC as a theoretically grounded, biologically plausible alternative to backpropagation.
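The block-coordinate-descent claim can be illustrated numerically. The toy sketch below is not the paper's construction: the two-layer linear network, the quadratic penalty standing in for the codelength term, and the step sizes are all illustrative assumptions. It alternates a local inference step on the latent activity with local weight updates, and tracks a two-part "fit + complexity" objective across sweeps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer linear predictive-coding network (illustrative, not the paper's setup).
# Two-part objective: prediction energy (fit) + quadratic penalty (complexity proxy).
x0 = rng.normal(size=(5,))           # input
y = rng.normal(size=(3,))            # target
W1 = rng.normal(size=(4, 5)) * 0.1   # layer-1 weights
W2 = rng.normal(size=(3, 4)) * 0.1   # layer-2 weights
x1 = W1 @ x0                         # latent activity, initialized feedforward
lam, eta = 1e-2, 0.02                # complexity weight, step size (assumed values)

def objective(W1, W2, x1):
    fit = np.sum((x1 - W1 @ x0) ** 2) + np.sum((y - W2 @ x1) ** 2)
    complexity = lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return fit + complexity

vals = [objective(W1, W2, x1)]
for _ in range(300):
    # Block 1: inference -- gradient step on the latent x1 with weights fixed.
    e1, e2 = x1 - W1 @ x0, y - W2 @ x1
    x1 -= eta * (2 * e1 - 2 * W2.T @ e2)
    # Block 2: learning -- local gradient steps on each weight block,
    # using only that layer's prediction error (the "local" rule).
    e1, e2 = x1 - W1 @ x0, y - W2 @ x1
    W1 -= eta * (-2 * np.outer(e1, x0) + 2 * lam * W1)
    W2 -= eta * (-2 * np.outer(e2, x1) + 2 * lam * W2)
    vals.append(objective(W1, W2, x1))

print(f"two-part objective: {vals[0]:.4f} -> {vals[-1]:.4f}")
```

With a sufficiently small step size, each block update is a descent step on the shared objective, so the recorded values decrease monotonically, mirroring the paper's monotone codelength-reduction claim in miniature.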
📝 Abstract
We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-code prior, we derive a novel generalization bound of the form $R(\theta) \le \hat{R}(\theta) + \frac{L(\theta)}{N}$, capturing the tradeoff between fit and compression. We further prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds than unconstrained gradient descent. Finally, we show that repeated PC updates converge to a block-coordinate stationary point, providing an approximate MDL-optimal solution. To our knowledge, this is the first result offering formal generalization and convergence guarantees for PC-trained deep models, positioning PC as a theoretically grounded and biologically plausible alternative to backpropagation.
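For context, the classical route to bounds of this flavor (a sketch under standard assumptions, losses bounded in $[0,1]$ over a countable hypothesis class; not necessarily the paper's exact argument) combines Hoeffding's inequality with a union bound weighted by the prefix-code prior $p(\theta) = 2^{-L(\theta)}$:

```latex
% Hoeffding for a fixed hypothesis \theta (losses in [0,1]):
%   \Pr[ R(\theta) > \hat{R}(\theta) + \varepsilon_\theta ] \le e^{-2N\varepsilon_\theta^2}.
% Spend confidence 2^{-L(\theta)}\delta on each \theta, i.e. choose
\varepsilon_\theta = \sqrt{\frac{L(\theta)\ln 2 + \ln(1/\delta)}{2N}},
% so that, by the Kraft inequality \sum_\theta 2^{-L(\theta)} \le 1, the union bound gives
\Pr\Big[\exists\,\theta:\; R(\theta) > \hat{R}(\theta) + \varepsilon_\theta\Big]
  \le \sum_\theta 2^{-L(\theta)}\,\delta \le \delta.
% Hence, with probability at least 1-\delta, simultaneously for all \theta:
R(\theta) \le \hat{R}(\theta) + \sqrt{\frac{L(\theta)\ln 2 + \ln(1/\delta)}{2N}}.
```

This standard sketch only shows where the prefix-code compression term $L(\theta)$ enters a high-probability risk bound; the sharper linear $L(\theta)/N$ form stated in the abstract is the paper's own contribution.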