Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work establishes the first theoretical connection between predictive coding (PC), a biologically plausible local learning rule, and the minimum description length (MDL) principle for joint optimization of empirical risk and model complexity. Methodologically, the authors show that deep PC training corresponds to block-coordinate descent on a two-part code objective. They derive the first generalization bound for PC, $R(\theta) \leq \hat{R}(\theta) + L(\theta)/N$, where the compression term $L(\theta)$ arises from a prefix-code prior; prove that each PC update strictly reduces the two-part codelength; and establish convergence guarantees to block-coordinate stationary points. Compared to unconstrained gradient descent, PC yields a tighter generalization bound and provably balances fidelity and parsimony. These results position PC as a theoretically grounded, biologically plausible alternative to backpropagation.
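
To make the objective concrete, here is a sketch of the standard MDL two-part code the summary refers to, written out in LaTeX. It assumes the empirical risk $\hat{R}(\theta)$ is an average log-loss over $N$ samples; the authors' exact decomposition and notation may differ.

```latex
% Two-part codelength of data D under parameters \theta (standard MDL form;
% illustrative reconstruction, not necessarily the paper's exact notation).
% L(\theta): codelength of the parameters under a prefix-code prior.
% L(D | \theta): codelength of the data given the model (log-loss).
\begin{align*}
  \mathrm{MDL}(\theta)
    &= L(\theta) + L(D \mid \theta)
     = L(\theta) + \sum_{i=1}^{N} -\log p\!\left(y_i \mid x_i, \theta\right) \\
    &= L(\theta) + N\,\hat{R}(\theta),
  \qquad
  \hat{R}(\theta) := \frac{1}{N} \sum_{i=1}^{N} -\log p\!\left(y_i \mid x_i, \theta\right).
\end{align*}
```

Dividing through by $N$ shows why the complexity penalty in the bound appears as $L(\theta)/N$: the per-sample codelength is exactly $\hat{R}(\theta) + L(\theta)/N$.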

📝 Abstract
We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-code prior, we derive a novel generalization bound of the form $R(\theta) \le \hat{R}(\theta) + \frac{L(\theta)}{N}$, capturing the tradeoff between fit and compression. We further prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds than unconstrained gradient descent. Finally, we show that repeated PC updates converge to a block-coordinate stationary point, providing an approximate MDL-optimal solution. To our knowledge, this is the first result offering formal generalization and convergence guarantees for PC-trained deep models, positioning PC as a theoretically grounded and biologically plausible alternative to backpropagation.
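
As a rough illustration of what "layerwise PC as block-coordinate descent" means operationally, here is a minimal NumPy sketch of one PC sweep: hidden activities are relaxed in one block (inference), then the weights are updated in a second block (learning), and both steps descend the same prediction-error energy, which stands in for the data-fit term of the two-part codelength. All names (`pc_sweep`, `energy`) and the quadratic/Gaussian error model are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.tanh                                  # activation
df = lambda a: 1.0 - np.tanh(a) ** 2         # its derivative

# Hypothetical layer sizes; x[0] is clamped to the input, x[-1] to the target.
dims = [8, 16, 4]
W = [rng.normal(0.0, 0.1, (dims[k + 1], dims[k])) for k in range(len(dims) - 1)]

def errors(x, W):
    """Layerwise prediction errors e_k = x_{k+1} - W_k f(x_k)."""
    return [x[k + 1] - W[k] @ f(x[k]) for k in range(len(W))]

def energy(x, W):
    """Sum of squared prediction errors: the data-fit term of the two-part
    codelength under a Gaussian error model (up to additive constants)."""
    return sum(0.5 * float(e @ e) for e in errors(x, W))

def pc_sweep(inp, target, W, n_inf=50, lr_x=0.05, lr_w=0.01):
    # Block 1 (inference): relax hidden activities at fixed weights.
    x = [inp] + [rng.normal(0.0, 0.1, d) for d in dims[1:-1]] + [target]
    for _ in range(n_inf):
        e = errors(x, W)
        for l in range(1, len(dims) - 1):    # hidden layers only
            x[l] -= lr_x * (e[l - 1] - df(x[l]) * (W[l].T @ e[l]))
    # Block 2 (learning): one gradient step on the weights at fixed activities.
    e = errors(x, W)
    for k in range(len(W)):
        W[k] += lr_w * np.outer(e[k], f(x[k]))
    return energy(x, W)

inp, target = rng.normal(size=dims[0]), rng.normal(size=dims[-1])
# Energy typically decreases across sweeps as the weights improve.
print([round(pc_sweep(inp, target, W), 4) for _ in range(6)])
```

Because each block takes descent steps on the shared energy while the other block is held fixed, a sweep cannot increase that energy for the given clamped sample, which is the block-coordinate structure the abstract's monotonicity result formalizes.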
Problem

Research questions and friction points this paper is trying to address.

How can predictive coding be connected to the MDL principle in deep networks?
Does PC jointly minimize empirical risk and model complexity?
Can formal generalization guarantees be given for PC-trained models?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Connects predictive coding with MDL principle
Proves PC minimizes empirical risk and complexity
Derives generalization bounds using Hoeffding's inequality (see the sketch below)
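
The Hoeffding bullet refers to the classical "Occam" recipe: apply Hoeffding's inequality for each fixed $\theta$, then take a union bound weighted by the prefix code $2^{-L(\theta)}$, which is valid by Kraft's inequality. A textbook sketch, assuming a loss bounded in $[0,1]$ and a countable parameter set, is below; note the paper's stated bound $R(\theta) \leq \hat{R}(\theta) + L(\theta)/N$ has a simpler additive form, so the authors' constants and assumptions evidently differ from this version.

```latex
% Occam-style generalization bound (textbook sketch, not the paper's proof).
% Assumes loss in [0,1] and codelengths with \sum_\theta 2^{-L(\theta)} \le 1.
\begin{align*}
  &\Pr\!\left[ R(\theta) - \hat{R}(\theta) \ge \varepsilon \right]
     \le e^{-2N\varepsilon^{2}}
  && \text{(Hoeffding, fixed } \theta\text{)} \\
  &\Pr\!\left[ \exists\,\theta :\;
     R(\theta) > \hat{R}(\theta)
     + \sqrt{\frac{L(\theta)\ln 2 + \ln(1/\delta)}{2N}} \right] \le \delta
  && \text{(weighted union bound)}
\end{align*}
```

Setting $\varepsilon(\theta) = \sqrt{(L(\theta)\ln 2 + \ln(1/\delta))/(2N)}$ makes each failure probability $2^{-L(\theta)}\delta$, and summing over the code gives at most $\delta$.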
Benjamin Prada
Bellini College of AI, Cybersecurity and Computing, University of South Florida, Tampa, FL 33617
Shion Matsumoto
Bellini College of AI, Cybersecurity and Computing, University of South Florida, Tampa, FL 33617
Abdul Malik Zekri
Bellini College of AI, Cybersecurity and Computing, University of South Florida, Tampa, FL 33617
Ankur Mali
Assistant Professor, University of South Florida
Formal language · Memory Networks · Predictive Coding · Natural Language Processing · Lifelong Machine Learning