A Two-Scale Complexity Measure for Deep Learning Models

📅 2024-01-17

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the problem of characterizing the complexity of deep learning models and relating it to generalization. It proposes the two-scale effective dimension (2sED), a capacity measure for statistical models based on the effective dimension. Under mild assumptions on the model, 2sED provably bounds the generalization error. For Markovian models, a layerwise iterative approach efficiently approximates 2sED from below, which makes the measure tractable for deep learning models with a large number of parameters. Simulations on standard data sets and popular model architectures show that 2sED correlates well with the training error, and suggest that the approximation from below is good across different prominent models and data sets. The key contributions are: (1) a provable bound on the generalization error in terms of 2sED; (2) a scalable layerwise iterative procedure that approximates 2sED from below for Markovian models; and (3) simulation evidence that the measure and its approximation behave well in practice.

📝 Abstract

We introduce 2sED, a novel capacity measure for statistical models based on the effective dimension. The new quantity provably bounds the generalization error under mild assumptions on the model. Furthermore, simulations on standard data sets and popular model architectures show that 2sED correlates well with the training error. For Markovian models, we show how to efficiently approximate 2sED from below through a layerwise iterative approach, which allows us to tackle deep learning models with a large number of parameters. Simulation results suggest that the approximation is good for different prominent models and data sets.

Problem

Research questions and friction points this paper is trying to address.

Deep Learning Model Complexity

Performance Evaluation

New Data

Innovation

Methods, ideas, or system contributions that make the work stand out.

2sED

Deep Learning Performance Evaluation

Layer-wise Approximation
