Takeuchi's Information Criteria as Generalization Measures for DNNs Close to NTK Regime

📅 2026-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of characterizing the generalization capability of deep neural networks (DNNs), which, as singular statistical models, defy accurate description by conventional metrics. The study establishes, for the first time, a theoretical foundation for the validity of the Takeuchi Information Criterion (TIC) in estimating the generalization gap of DNNs within the neural tangent kernel (NTK) regime, while also delineating the boundary conditions under which TIC fails outside this regime. Furthermore, the authors propose a computationally efficient approximation of TIC to enable trial-and-error pruning in hyperparameter optimization. Extensive experiments across 12 architectures and over 5,000 DNN models demonstrate that TIC exhibits strong correlation with the true generalization gap within the NTK regime and significantly outperforms existing pruning methods in hyperparameter tuning.

📝 Abstract
Generalization measures have been studied extensively in the machine learning community to better characterize generalization gaps. However, establishing a reliable generalization measure for statistically singular models such as deep neural networks (DNNs) is difficult due to their complex nature. This study focuses on Takeuchi's information criterion (TIC) to investigate the conditions under which this classical measure can effectively explain the generalization gaps of DNNs. Importantly, the developed theory indicates the applicability of TIC near the neural tangent kernel (NTK) regime. In a series of experiments, we trained more than 5,000 DNN models with 12 architectures, including large models (e.g., VGG-16), on four datasets, and estimated the corresponding TIC values to examine the relationship between the generalization gap and the TIC estimates. We applied several TIC approximation methods with feasible computational costs and assessed the accuracy trade-off. Our experimental results indicate that the estimated TIC values correlate well with the generalization gap under conditions close to the NTK regime. However, we show both theoretically and empirically that outside the NTK regime such correlation disappears. Finally, we demonstrate that TIC provides better trial pruning ability than existing methods for hyperparameter optimization.
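To make the criterion concrete: TIC penalizes the training log-likelihood by tr(Î⁻¹Ĵ), where Î is the negative mean Hessian of the log-likelihood and Ĵ is the empirical second moment of the per-sample score; for a well-specified model the trace reduces to the parameter count, recovering AIC. The following minimal numpy sketch (not from the paper; the Gaussian model and all names are illustrative) computes this penalty for a two-parameter Gaussian fit by maximum likelihood:

```python
import numpy as np

# Illustrative sketch of the TIC penalty tr(I_hat^{-1} J_hat) for a
# Gaussian model N(mu, v) fit by maximum likelihood. This is a toy
# stand-in for the DNN setting discussed in the paper.

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=200_000)

# MLE: sample mean and (biased) sample variance.
mu = x.mean()
v = ((x - mu) ** 2).mean()

# Per-sample score vectors g_i = grad_theta log p(x_i | mu, v).
d = x - mu
g = np.stack([d / v, -0.5 / v + d**2 / (2 * v**2)], axis=1)

# J_hat: empirical second moment of the score (outer-product average).
J = g.T @ g / len(x)

# I_hat: negative mean Hessian of the log-likelihood (analytic here;
# the off-diagonal term averages to zero exactly at the MLE).
I = np.array([[1.0 / v, 0.0],
              [0.0, 0.5 / v**2]])

penalty = np.trace(np.linalg.solve(I, J))
print(penalty)  # ~2 for well-specified Gaussian data, i.e. TIC ~ AIC
```

For DNNs both Î and Ĵ are huge, which is why the paper resorts to cheaper approximations of this trace; the sketch above only shows the exact quantity being approximated.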
Problem

Research questions and friction points this paper is trying to address.

generalization measure
deep neural networks
Takeuchi's information criterion
NTK regime
generalization gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Takeuchi's Information Criterion
Neural Tangent Kernel
Generalization Gap
Deep Neural Networks
Hyperparameter Optimization