🤖 AI Summary
Existing accounts of neural network generalization lack a broadly applicable, quantitative, distribution-aware measure of "simplicity." This work approximates a model's predictions along data-dependent interpolation paths using orthogonal polynomial bases, which yields a compact, low-dimensional representation of the network function. The effective degree of that representation serves as a simplicity metric: it is predictive of generalization across tasks and architectures, and it consistently outperforms existing proxies such as sharpness. The same representation also yields a differentiable simplicity regularizer that consistently improves generalization in image and text classification, fine-tuning of contrastive vision-language models, and reinforcement learning.
📝 Abstract
Deep networks often exhibit a preference for "simple" solutions, and such a simplicity bias is widely believed to play a key role in generalization. Yet a broadly applicable, quantitative measure of simplicity remains elusive. We introduce polynomial representations as a distribution-aware, low-dimensional surrogate for neural functions: we approximate a network's predictive behavior along data-dependent interpolation paths using orthogonal polynomial bases, yielding a compact functional representation. We show that the effective degree of this representation serves as a practical simplicity metric that is predictive of generalization across tasks and architectures, and consistently outperforms existing generalization proxies such as sharpness. Finally, polynomial representations naturally yield a differentiable simplicity regularizer, which consistently improves generalization in image and text classification, fine-tuning contrastive vision-language models, and reinforcement learning.
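The abstract does not spell out how the effective degree is computed, so the following is only a minimal sketch of the general idea under stated assumptions: the interpolation path is a straight line between two inputs, the orthogonal basis is Legendre polynomials on [-1, 1], and the "effective degree" is taken to be the smallest degree whose non-constant coefficients capture a chosen fraction of the fitted energy. The function names, the energy threshold, and the degree cap are all hypothetical choices for illustration, not the authors' definitions.

```python
import numpy as np
from numpy.polynomial import legendre


def path_outputs(f, x0, x1, num_points=64):
    """Evaluate scalar function f along the line from x0 to x1, with t in [-1, 1]."""
    t = np.linspace(-1.0, 1.0, num_points)
    alpha = (t + 1.0) / 2.0  # map t in [-1, 1] to mixing weight in [0, 1]
    points = (1.0 - alpha)[:, None] * x0 + alpha[:, None] * x1
    return t, np.array([f(p) for p in points])


def effective_degree(f, x0, x1, max_degree=10, energy=0.99, num_points=64):
    """Illustrative effective degree of f along the path x0 -> x1.

    Assumption (not from the paper): the effective degree is the smallest
    degree d such that Legendre terms 1..d hold at least `energy` of the
    total non-constant L2 energy of the least-squares fit.
    """
    t, y = path_outputs(f, x0, x1, num_points)
    coeffs = legendre.legfit(t, y, max_degree)  # least-squares Legendre fit
    k = np.arange(max_degree + 1)
    # Squared L2 norm of P_k on [-1, 1] is 2 / (2k + 1).
    power = coeffs**2 * 2.0 / (2.0 * k + 1.0)
    power = power[1:]  # drop the constant term, which carries no variation
    total = power.sum()
    if total < 1e-12:  # function is (numerically) constant along the path
        return 0
    cumulative = np.cumsum(power) / total
    return int(np.searchsorted(cumulative, energy) + 1)
```

Under these assumptions, a function that is linear along the path gets an effective degree of 1, and a function that is constant along the path gets 0. In practice `f` would wrap a trained network's scalar output (for example, a logit), and the value would be averaged over many data-dependent paths; how the paper selects paths and aggregates across them is not stated in the abstract.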