🤖 AI Summary
Classical Rademacher complexity-based generalization bounds for CNNs are often vacuous on image classification tasks with a small number of classes, limiting their theoretical and practical utility.
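For reference, the bounds in question take the standard form below (one common statement, as in Mohri et al.'s textbook; not this paper's specific result), and "vacuous" means the right-hand side exceeds the trivial loss bound:

```latex
% Standard Rademacher generalization bound: for a loss class \mathcal{G}
% with values in [0,1] and an i.i.d. sample z_1,\dots,z_n, with probability
% at least 1 - \delta,
\[
  \mathbb{E}\,[g(z)] \;\le\; \frac{1}{n}\sum_{i=1}^{n} g(z_i)
  \;+\; 2\,\mathfrak{R}_n(\mathcal{G})
  \;+\; \sqrt{\frac{\log(1/\delta)}{2n}}
  \qquad \text{for all } g \in \mathcal{G},
\]
% where \mathfrak{R}_n(\mathcal{G}) is the Rademacher complexity of \mathcal{G}.
% The bound is vacuous whenever the right-hand side is at least 1.
```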
Method: This work is the first to extend the Rademacher complexity framework to deep convolutional neural networks equipped with general Lipschitz activation functions (not restricted to ReLU) by introducing a novel contraction lemma for mappings between vector spaces that refines the classical Talagrand contraction inequality.
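Concretely, the classical scalar contraction step and its known vector-valued analogue (Maurer, 2016) read as follows; the paper's lemmas refine inequalities of this kind for vector-to-vector Lipschitz maps, and its exact statements may differ:

```latex
% Scalar Talagrand contraction: if each \phi_i : \mathbb{R} \to \mathbb{R}
% is \mu-Lipschitz and \sigma_i are i.i.d. Rademacher variables, then
\[
  \mathbb{E}_{\sigma} \sup_{f \in \mathcal{F}} \sum_{i=1}^{n} \sigma_i\, \phi_i\big(f(x_i)\big)
  \;\le\; \mu\, \mathbb{E}_{\sigma} \sup_{f \in \mathcal{F}} \sum_{i=1}^{n} \sigma_i\, f(x_i).
\]
% Vector-valued analogue (Maurer, 2016): for \mu-Lipschitz h_i : \mathbb{R}^{d} \to \mathbb{R}
% and i.i.d. Rademacher variables \sigma_{ik},
\[
  \mathbb{E}_{\sigma} \sup_{f \in \mathcal{F}} \sum_{i=1}^{n} \sigma_i\, h_i\big(f(x_i)\big)
  \;\le\; \sqrt{2}\,\mu\, \mathbb{E}_{\sigma} \sup_{f \in \mathcal{F}} \sum_{i=1}^{n}\sum_{k=1}^{d} \sigma_{ik}\, f_k(x_i).
\]
```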
Contribution/Results: The proposed lemma enables the derivation of non-vacuous, tight generalization upper bounds. The theoretical analysis integrates Rademacher complexity, Lipschitz continuity, and intrinsic architectural properties of CNNs, yielding bounds that can improve on those of Golowich et al. for ReLU DNNs. Empirical validation on small-category benchmarks, including CIFAR-10, confirms both the non-vacuity and the practical relevance of the derived bounds, substantially broadening the applicability of Rademacher-based generalization analysis to modern CNN architectures with diverse activation functions.
📝 Abstract
We show that the Rademacher complexity-based approach can yield non-vacuous generalisation bounds for Convolutional Neural Networks (CNNs) classifying images from a small number of classes. A key technical contribution is the development of new contraction lemmas for high-dimensional mappings between vector spaces, valid for general Lipschitz activation functions. These lemmas extend, and in a variety of cases improve, the Talagrand contraction lemma. Our generalisation bound can improve on that of Golowich et al. for ReLU DNNs. Furthermore, while prior works using the Rademacher complexity-based approach focus primarily on ReLU DNNs, our results extend to a broader class of activation functions.
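To make the comparison with Golowich et al. concrete, here is a minimal numerical sketch of the well-known norm-based bound from that line of work; the function name and the toy network are illustrative, and this is the baseline form rather than the CNN bound derived in this paper:

```python
import numpy as np

def golowich_style_bound(weights, data_norm_bound, n):
    """Illustrative Golowich-et-al.-style Rademacher bound for a ReLU net.

    weights: list of 2-D weight matrices (conv kernels flattened);
    data_norm_bound: bound B on the Euclidean norm of the inputs;
    n: number of training samples.

    Evaluates the well-known form
        B * (sqrt(2 * log(2) * L) + 1) * prod_j ||W_j||_F / sqrt(n),
    shown here as a point of comparison only -- NOT this paper's bound.
    """
    L = len(weights)
    # Product of per-layer Frobenius norms (np.linalg.norm defaults to Frobenius).
    frob_product = np.prod([np.linalg.norm(W) for W in weights])
    return data_norm_bound * (np.sqrt(2 * np.log(2) * L) + 1) * frob_product / np.sqrt(n)

# Toy usage: a 3-layer net with random weights, unit-norm inputs,
# and n = 50_000 (the CIFAR-10 training-set size).
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.05, size=(256, 256)) for _ in range(3)]
print(golowich_style_bound(weights, data_norm_bound=1.0, n=50_000))
```

Per the summary above, the bounds derived in the paper instead exploit intrinsic CNN architectural properties and general Lipschitz activations, which is what yields non-vacuity on small-category benchmarks such as CIFAR-10.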