A Two-Scale Complexity Measure for Deep Learning Models

📅 2024-01-17
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the problem of characterizing the complexity of deep learning models and relating it to generalization. It proposes the two-scale effective dimension (2sED), a capacity measure for statistical models built on the effective dimension. Under mild assumptions on the model, 2sED provably bounds the generalization error. For Markovian models, a layer-wise iterative procedure approximates 2sED from below, which makes the measure tractable for deep networks with many parameters. Simulations on standard data sets and popular architectures show that 2sED correlates well with the training error and that the layer-wise lower bound is a good approximation. The main contributions are: (1) a generalization bound stated in terms of 2sED; (2) a scalable, layer-wise lower approximation of 2sED; and (3) empirical evidence that the measure and its approximation behave well in practice.

📝 Abstract
We introduce a novel capacity measure 2sED for statistical models based on the effective dimension. The new quantity provably bounds the generalization error under mild assumptions on the model. Furthermore, simulations on standard data sets and popular model architectures show that 2sED correlates well with the training error. For Markovian models, we show how to efficiently approximate 2sED from below through a layerwise iterative approach, which allows us to tackle deep learning models with a large number of parameters. Simulation results suggest that the approximation is good for different prominent models and data sets.
Problem

Research questions and friction points this paper is trying to address.

Deep Learning Model Complexity
Performance Evaluation
New Data
Innovation

Methods, ideas, or system contributions that make the work stand out.

2sED
Deep Learning Performance Evaluation
Layer-wise Approximation
Massimiliano Datres
Department of Mathematics, University of Trento, Trento; DSH, Bruno Kessler Foundation, Trento
G. P. Leonardi
Department of Mathematics, University of Trento, Trento
Alessio Figalli
ETH Zurich
Calculus of Variations, Partial Differential Equations
David Sutter
IBM Research
Quantum Information Theory, Quantum Computation