🤖 AI Summary
Understanding the generalization mechanisms of deep neural networks (DNNs) remains a fundamental challenge in deep learning theory.
Method: This work investigates DNN generalization from an interaction perspective, proposing a “symbolic interaction” modeling framework. It theoretically derives a rigorous decoupling of generalizable versus non-generalizable interactions and establishes an interpretable generalization attribution framework grounded in interaction distribution morphology—revealing that generalizable interactions follow a decaying distribution, whereas non-generalizable ones exhibit a spindle-shaped distribution.
Contribution/Results: Empirical measurements across diverse models and datasets confirm strong alignment between theoretical predictions and actual internal interactions in trained DNNs. The proposed distribution-based criterion successfully isolates generalization-relevant interaction components, validating the universality of the observed distributional规律. This work provides both a novel conceptual lens for understanding DNN generalization and a practical, interpretable analytical tool for diagnosing and improving model generalizability.
📝 Abstract
This paper aims to analyze the generalization power of deep neural networks (DNNs) from the perspective of interactions. Unlike previous analysis of a DNN's generalization power in a highdimensional feature space, we find that the generalization power of a DNN can be explained as the generalization power of the interactions. We found that the generalizable interactions follow a decay-shaped distribution, while non-generalizable interactions follow a spindle-shaped distribution. Furthermore, our theory can effectively disentangle these two types of interactions from a DNN. We have verified that our theory can well match real interactions in a DNN in experiments.