On Rademacher Complexity-based Generalization Bounds for Deep Learning

📅 2022-08-08
🏛️ arXiv.org
📈 Citations: 11
Influential: 0
🤖 AI Summary
Classical Rademacher complexity-based generalization bounds for CNNs are often vacuous even for small-category image classification tasks, limiting their theoretical and practical utility. Method: This work extends the Rademacher complexity framework to deep convolutional neural networks equipped with general Lipschitz activation functions—not restricted to ReLU—by introducing novel vector-space contraction lemmas that refine the classical Talagrand contraction inequality. Contribution/Results: The proposed lemmas enable the derivation of non-vacuous generalization upper bounds. The theoretical analysis integrates Rademacher complexity, Lipschitz continuity, and intrinsic CNN architectural properties, yielding bounds that improve on those of Golowich et al. for ReLU DNNs. Empirical validation on small-category benchmarks—including CIFAR-10—confirms the non-vacuity and practical relevance of the derived bounds, broadening the applicability of Rademacher-based generalization analysis to modern CNN architectures with diverse activation functions.
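For context, the classical scalar Talagrand contraction lemma that the paper's vector-space lemmas refine can be stated as follows (a standard textbook formulation, not quoted from the paper itself):

```latex
% Talagrand's contraction lemma (scalar form): if each
% \phi_i : \mathbb{R} \to \mathbb{R} is L-Lipschitz, then composing
% the class \mathcal{F} with the \phi_i inflates its empirical
% Rademacher complexity on a sample S = (x_1, \dots, x_n) by at most L:
\mathbb{E}_{\sigma}\Big[\sup_{f \in \mathcal{F}}
    \frac{1}{n}\sum_{i=1}^{n} \sigma_i\,\phi_i\big(f(x_i)\big)\Big]
\;\le\; L \cdot
\mathbb{E}_{\sigma}\Big[\sup_{f \in \mathcal{F}}
    \frac{1}{n}\sum_{i=1}^{n} \sigma_i\, f(x_i)\Big],
```

where the $\sigma_i$ are i.i.d. Rademacher signs ($\pm 1$ with equal probability). The paper's contribution is extending this type of inequality to Lipschitz mappings between high-dimensional vector spaces, which is what the layer-wise peeling argument for CNNs requires.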
📝 Abstract
We show that the Rademacher complexity-based approach can generate non-vacuous generalisation bounds on Convolutional Neural Networks (CNNs) for classifying a small number of classes of images. The development of new contraction lemmas for high-dimensional mappings between vector spaces, valid for general Lipschitz activation functions, is a key technical contribution. These lemmas extend and improve the Talagrand contraction lemma in a variety of cases. Our generalisation bound improves on that of Golowich et al. for ReLU DNNs. Furthermore, while prior works using the Rademacher complexity-based approach primarily focus on ReLU DNNs, our results extend to a broader class of activation functions.
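To make the central quantity concrete, the empirical Rademacher complexity of a finite hypothesis class can be estimated by Monte Carlo. The sketch below is illustrative only (the function name and the toy data are my own, not from the paper): it averages, over random sign vectors, the supremum of the sign-weighted empirical correlation across hypotheses.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_rademacher(preds, n_trials=2000, rng=rng):
    """Monte Carlo estimate of the empirical Rademacher complexity
    of a finite function class.

    preds: (k, n) array; row j holds the outputs of hypothesis f_j
    on the n sample points.  Estimates
        E_sigma[ max_j (1/n) * sum_i sigma_i * preds[j, i] ].
    """
    k, n = preds.shape
    total = 0.0
    for _ in range(n_trials):
        sigma = rng.choice([-1.0, 1.0], size=n)  # Rademacher signs
        total += np.max(preds @ sigma) / n       # sup over the class
    return total / n_trials

# Toy example: 5 random "hypotheses" evaluated on 100 sample points.
preds = rng.standard_normal((5, 100))
print(empirical_rademacher(preds))
```

For richer (infinite) classes such as CNNs, this quantity cannot be enumerated directly, which is why structural results like the contraction lemmas in this paper are needed to bound it analytically.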
Problem

Research questions and friction points this paper is trying to address.

Non-vacuous generalization bounds for CNNs
New contraction lemmas for high-dimensional mappings
Extension to broader class of activation functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rademacher complexity-based generalization bounds
New contraction lemmas for high-dimensional mappings
Extends to broader class of activation functions
Lan V. Truong
School of Mathematics, Statistics and Actuarial Science, University of Essex