On the Sample Complexity of One Hidden Layer Networks with Equivariance, Locality and Weight Sharing

📅 2024-11-21
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work investigates how equivariance, locality, and weight sharing, the core design choices behind convolutional neural networks, affect the sample complexity of one-hidden-layer networks. Through the lens of statistical learning theory, the authors derive upper and lower generalization bounds that separately quantify the impact of each property; for a large class of activation functions, the bounds depend only on filter norms and are dimension-independent. First, they show that, depending on the sharing mechanism, non-equivariant weight sharing can achieve a generalization bound similar to that of equivariant architectures. Second, they quantify the generalization benefit of locality and show, via the uncertainty principle, that it trades off against expressivity. Third, they extend the analysis to networks with max-pooling and to multi-layer networks, obtaining bounds with only mild dimension dependence. The results are established via Rademacher complexity under filter norm constraints, and extensive experiments confirm that locality improves generalization, subject to the expressivity trade-off.
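To make the model class concrete, here is a minimal illustrative sketch (not the paper's construction) of a one-hidden-layer network that combines weight sharing and locality: a small shared filter bank is applied at every circular shift of the input, followed by a nonlinearity and a linear readout. The dimensions `d`, filter length `k`, and channel count `m` below are assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 64  # input dimension (a 1-D signal, for simplicity)
k = 5   # local filter length (locality)
m = 8   # number of shared filters / hidden channels

def one_hidden_layer_conv(x, filters, v, activation=np.tanh):
    """One hidden layer with weight sharing and locality:
    the same local filters are applied at every circular shift
    of the input, followed by a linear readout."""
    d = x.shape[0]
    k = filters.shape[1]
    # (d, k): each row is a length-k local window of x
    windows = np.stack([np.roll(x, -t)[:k] for t in range(d)])
    h = activation(windows @ filters.T)  # (d, m) hidden activations
    return float(np.sum(h * v))          # linear readout

filters = rng.normal(size=(m, k)) / np.sqrt(k)   # shared local filters
v = rng.normal(size=(d, m)) / np.sqrt(d * m)     # readout weights

x = rng.normal(size=d)
y = one_hidden_layer_conv(x, filters, v)

# Weight sharing plus locality: only m*k filter parameters,
# versus m*d*d for an unconstrained, unshared first layer.
print(m * k, m * d * d)
```

The parameter-count gap (here 40 versus 32768) is the intuition behind the sample-efficiency question the paper studies: how much of the gap actually translates into a smaller generalization error.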

๐Ÿ“ Abstract
Weight sharing, equivariance, and local filters, as in convolutional neural networks, are believed to contribute to the sample efficiency of neural networks. However, it is not clear how each of these design choices contributes to the generalization error. Through the lens of statistical learning theory, we aim to provide insight into this question by characterizing the relative impact of each choice on the sample complexity. We obtain lower and upper sample complexity bounds for a class of single hidden layer networks. For a large class of activation functions, the bounds depend merely on the norm of filters and are dimension-independent. We also provide bounds for max-pooling and an extension to multi-layer networks, both with mild dimension dependence. We provide a few takeaways from the theoretical results. It can be shown that, depending on the weight-sharing mechanism, non-equivariant weight sharing can yield a generalization bound similar to the equivariant one. We show that locality has generalization benefits; however, the uncertainty principle implies a trade-off between locality and expressivity. We conduct extensive experiments and highlight some consistent trends for these models.
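The norm-based bounds mentioned in the abstract rest on Rademacher complexity. As a hedged illustration of the general tool (not the paper's derivation), the sketch below Monte Carlo estimates the empirical Rademacher complexity of a norm-bounded linear class, for which the supremum has the closed form (B/n)·‖Σᵢ σᵢxᵢ‖, and compares it against the standard B·√(E‖x‖²/n) upper bound; the data and constants are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def empirical_rademacher_linear(X, B, n_trials=200):
    """Monte Carlo estimate of the empirical Rademacher complexity of
    {x -> <w, x> : ||w||_2 <= B} on the sample X. For this class the
    supremum over w has the closed form (B/n) * ||sum_i sigma_i x_i||,
    which we average over random sign vectors sigma."""
    n = X.shape[0]
    vals = []
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)
        vals.append(B * np.linalg.norm(sigma @ X) / n)
    return float(np.mean(vals))

n, d, B = 200, 30, 1.0
X = rng.normal(size=(n, d))

est = empirical_rademacher_linear(X, B)
# Standard norm-based bound: B * sqrt(mean ||x_i||^2 / n), an O(1/sqrt(n))
# rate that does not depend on the dimension d.
bound = B * np.sqrt(np.mean(np.sum(X**2, axis=1)) / n)
print(est, bound)
```

The estimate sits just below the bound, matching the dimension-independent, norm-controlled behavior the abstract describes; the paper's contribution is deriving such norm-based bounds for the structured (shared, local, equivariant) filter classes rather than this plain linear class.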
Problem

Research questions and friction points this paper is trying to address.

Sample Complexity
Invariant Properties
Neural Network Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Statistical Learning Theory
Weight Sharing
Locality in Neural Networks