🤖 AI Summary
To characterize the complexity of deep learning models and assess their generalization, this paper proposes the two-scale effective dimension (2sED), a capacity measure for statistical models built on the effective dimension. Under mild assumptions on the model, 2sED provably bounds the generalization error. For Markovian models, a layer-wise iterative procedure efficiently approximates 2sED from below, which makes the measure tractable for deep learning models with a large number of parameters. Simulations on standard data sets and popular model architectures show that 2sED correlates well with the training error, and they suggest that the approximation from below is good across the models and data sets tested. The key contributions are: (1) a provable bound on the generalization error in terms of 2sED; (2) a scalable, layer-wise iterative algorithm for approximating 2sED from below; and (3) empirical evidence that the measure and its approximation are practical for deep learning models.
📝 Abstract
We introduce a novel capacity measure, 2sED, for statistical models based on the effective dimension. The new quantity provably bounds the generalization error under mild assumptions on the model. Furthermore, simulations on standard data sets and popular model architectures show that 2sED correlates well with the training error. For Markovian models, we show how to efficiently approximate 2sED from below through a layerwise iterative approach, which allows us to tackle deep learning models with a large number of parameters. Simulation results suggest that the approximation is good for different prominent models and data sets.