🤖 AI Summary
This paper challenges the classical information-theoretic view that redundancy is inherently inefficient, arguing instead that redundancy constitutes a fundamental structural principle in finite, structured learning systems.
Method: Starting from the family of f-divergences, we develop a geometric framework that unifies mutual information, chi-squared dependence, and spectral redundancy, and formally derive tight upper and lower bounds on redundancy for the first time.
Contribution/Results: We prove that these bounds define an optimal trade-off point for learning: generalization is maximized by balancing over-compression (loss of structural fidelity) against over-coupling (breakdown of stability). The theoretical analysis is validated empirically with masked autoencoder experiments, which show that generalization performance peaks at a specific, measurable redundancy level. This confirms that redundancy is quantifiable and tunable, and that it serves as a bridge between information theory and practical machine learning.
📝 Abstract
We present a theoretical framework that extends classical information theory to finite and structured systems by redefining redundancy as a fundamental property of information organization rather than as an inefficiency. In this framework, redundancy is expressed through a general family of informational divergences that unifies multiple classical measures, such as mutual information, chi-squared dependence, and spectral redundancy, under a single geometric principle. This reveals that these traditional quantities are not isolated heuristics but projections of a shared redundancy geometry. The theory further predicts that redundancy is bounded both above and below, giving rise to an optimal equilibrium that balances over-compression (loss of structure) against over-coupling (collapse). While classical communication theory favors minimal redundancy for transmission efficiency, finite and structured systems, such as those underlying real-world learning, achieve maximal stability and generalization near this equilibrium. Experiments with masked autoencoders illustrate and verify this principle: the model exhibits a stable redundancy level at which generalization peaks. Together, these results establish redundancy as a measurable and tunable quantity that bridges the asymptotic world of communication and the finite world of learning.
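To make the unification claim concrete: mutual information is the KL divergence between a joint distribution and the product of its marginals, and chi-squared dependence (the mean-square contingency) is the chi-squared divergence between the same two distributions; both are instances of an f-divergence D_f = E[f(r)] of the likelihood ratio r = p(x,y)/p(x)p(y), differing only in the choice of f. The sketch below is a minimal numerical illustration of this standard fact, not the paper's formalism; the 2×2 joint distribution is a made-up example.

```python
import numpy as np

# Hypothetical joint distribution of two correlated binary variables.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

# Product of marginals: the "no redundancy" reference distribution.
px = joint.sum(axis=1, keepdims=True)
py = joint.sum(axis=0, keepdims=True)
indep = px * py

# Likelihood ratio r = p(x,y) / p(x)p(y); every f-divergence
# D_f = sum indep * f(r) probes the same departure from independence
# through a different convex lens f.
r = joint / indep

# f(t) = t * log(t)  ->  KL divergence = mutual information (nats)
mi = np.sum(indep * r * np.log(r))

# f(t) = (t - 1)^2   ->  chi-squared dependence (mean-square contingency)
chi2 = np.sum(indep * (r - 1.0) ** 2)

print(mi, chi2)  # both are 0 iff the variables are independent
```

Swapping in other convex functions f yields further members of the same family, which is the sense in which these classical measures are projections of one underlying redundancy geometry.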