AI Summary
Existing deep neural network training relies heavily on validation sets to detect overfitting or underfitting and to tune regularization hyperparameters such as weight decay (WD), introducing computational overhead and a dependency on validation data. To address this, we propose the Overfitting-Underfitting Indicator (OUI), defined as the ratio of the gradient norm to the parameter norm during training. Grounded in gradient analysis and the intrinsic relationship between gradient dynamics and L2 regularization, OUI enables early, stable, and validation-free assessment of generalization behavior, often before loss or accuracy converge, and supports dynamic monitoring and adaptive WD selection. Experiments on CIFAR-100, TinyImageNet, and ImageNet-1K demonstrate that OUI stabilizes rapidly in the early stages of training and consistently improves final generalization performance. Across multiple architectures, including ResNet, DenseNet, and EfficientNet, OUI-guided WD optimization reduces hyperparameter search time by over 60% while achieving higher validation accuracy than conventional validation-set-based tuning.
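The quantity described above, a ratio of the gradient norm to the parameter norm, can be sketched in a few lines. This is a minimal illustration assuming global L2 norms over all parameter tensors; the function name, toy model, and analytic gradient are illustrative and not taken from the authors' implementation.

```python
import numpy as np

def oui(grads, params):
    """Illustrative Overfitting-Underfitting Indicator: ratio of the
    global L2 norm of the gradients to the global L2 norm of the
    parameters (an assumption based on the summary above)."""
    g = np.sqrt(sum(np.sum(np.square(t)) for t in grads))
    p = np.sqrt(sum(np.sum(np.square(t)) for t in params))
    return g / p

# Toy example: least-squares loss L(w) = ||Xw - y||^2 / n
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))
w = rng.normal(size=8)
y = X @ rng.normal(size=8)

grad = 2.0 * X.T @ (X @ w - y) / len(y)  # analytic gradient of L(w)
print(f"OUI = {oui([grad], [w]):.4f}")
```

In a real training loop the same ratio would be computed once per step from the optimizer's gradients, so it adds negligible cost and needs no validation pass.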
Abstract
We introduce the Overfitting-Underfitting Indicator (OUI), a novel tool for monitoring the training dynamics of Deep Neural Networks (DNNs) and identifying optimal regularization hyperparameters. Specifically, we validate that OUI can effectively guide the selection of the Weight Decay (WD) hyperparameter by indicating whether a model is overfitting or underfitting during training, without requiring validation data. Through experiments on DenseNet-BC-100 with CIFAR-100, EfficientNet-B0 with TinyImageNet, and ResNet-34 with ImageNet-1K, we show that maintaining OUI within a prescribed interval correlates strongly with improved generalization and validation scores. Notably, OUI converges significantly faster than traditional metrics such as loss or accuracy, enabling practitioners to identify optimal WD values within the early stages of training. By leveraging OUI as a reliable indicator, we can determine early in training whether the chosen WD value leads the model to underfit the training data, to overfit, or to strike a well-balanced trade-off that maximizes validation scores. This enables more precise WD tuning for optimal performance on the tested datasets and DNNs. All code for reproducing these experiments is available at https://github.com/AlbertoFdezHdez/OUI.
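The "prescribed interval" idea above suggests a simple controller: nudge WD whenever the indicator leaves the target band. The sketch below is a hypothetical rule; the interval bounds, the update factor, and the direction of adjustment (low OUI treated as overfitting, high OUI as underfitting) are assumptions for illustration, not the rule from the paper.

```python
def select_wd(oui_value, wd, low=0.2, high=0.6, factor=1.1):
    """Keep the indicator inside [low, high] by adjusting weight decay.
    NOTE: bounds, factor, and the direction of the update are
    illustrative assumptions; the paper's actual rule may differ."""
    if oui_value < low:
        return wd * factor   # assumed overfitting signal: strengthen regularization
    if oui_value > high:
        return wd / factor   # assumed underfitting signal: weaken regularization
    return wd                # indicator in target interval: keep WD

# Hypothetical usage inside a training loop:
wd = 1e-4
for oui_value in (0.1, 0.4, 0.9):
    wd = select_wd(oui_value, wd)
```

Because OUI stabilizes early in training, such a rule would let practitioners discard poor WD candidates after a few epochs instead of running each configuration to convergence.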