🤖 AI Summary
This work addresses the poorly understood internal mechanisms behind the failure of deep neural networks to generalize to unseen samples, in particular the lack of an effective way to characterize shifts in a model's internal decision logic. The authors propose a novel perspective termed “Decision Pattern Shift” (DPS), which for the first time links generalization performance to the stability of internal decision processes. Specifically, they construct channel-contribution vectors via GradCAM to represent the decision logic of individual samples and quantify generalization failure by measuring how far these vectors deviate from class-wise average patterns. The proposed framework offers a unified explanation for diverse generalization degradation scenarios and opens up possibilities for early risk detection and channel-level defect localization. Experiments show that DPS correlates linearly with the generalization gap (Pearson r > 0.8 in nearly all settings) and organizes the various degradation cases into a continuous spectrum, indicating a systematic drift mechanism behind generalization failure.
📝 Abstract
Understanding why deep neural networks (DNNs) fail to generalize to unseen samples remains a long-standing challenge. Existing studies mainly examine changes in externally observable factors such as data, representations, or outputs, yet offer limited insight into how a model's internal decision mechanism changes between training and testing. To address this gap, we introduce Decision Pattern Shift (DPS), a new perspective that defines generalization through the stability of internal decision patterns and quantifies failure as their deviation from those learned during training. Specifically, we represent each sample's decision pattern as a GradCAM-based channel-contribution vector, which captures how feature channels collectively support a prediction, and we propose the DPS metric to measure its discrepancy from the class-average pattern. Empirical analyses across multiple datasets and architectures show that (i) decision patterns form a highly structured, class-consistent space with strong intra-class cohesion and low inter-class confusion, enabling direct analysis of a model's decision logic; (ii) the DPS magnitude correlates linearly with the generalization gap (nearly all Pearson r > 0.8), revealing generalization failure as a systematic drift in the model's internal decision mechanism; and (iii) the DPS spectrum organizes diverse generalization degradation scenarios (ideal generalization, in-distribution degradation, domain shift, out-of-distribution, and shortcut learning) into a continuous trajectory, providing a unified explanation of their failure modes. These findings open up new possibilities for early generalization-risk detection, failure-mode diagnosis, and channel-level defect localization.
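The abstract does not give the exact formulas, so the following is only a minimal sketch of how a channel-contribution vector and a DPS-style score might be computed. Several details are assumptions rather than the authors' definitions: each channel's contribution is taken as the GradCAM weight (the spatially averaged gradient of the class score) times the channel's mean activation, the class-average pattern is a plain mean over training samples of that class, and the discrepancy is measured with cosine distance. The function names are hypothetical, and activations and gradients are passed in as plain lists instead of being read from a real network.

```python
import math


def channel_contributions(activations, gradients):
    """Build a channel-contribution vector for one sample.

    activations, gradients: K channels, each a flat list of H*W values taken
    from the same convolutional layer (gradients are of the class score with
    respect to the activations).
    ASSUMPTION: contribution_k = GradCAM weight_k * mean activation_k.
    """
    contribs = []
    for act, grad in zip(activations, gradients):
        alpha = sum(grad) / len(grad)        # GradCAM weight: spatial mean of gradient
        mean_act = sum(act) / len(act)       # spatial mean of the channel activation
        contribs.append(alpha * mean_act)
    return contribs


def class_mean_pattern(patterns):
    """Average the contribution vectors of training samples from one class."""
    n = len(patterns)
    return [sum(column) / n for column in zip(*patterns)]


def dps_score(pattern, class_mean):
    """ASSUMPTION: discrepancy is cosine distance to the class-average pattern."""
    dot = sum(a * b for a, b in zip(pattern, class_mean))
    norm_p = math.sqrt(sum(a * a for a in pattern))
    norm_m = math.sqrt(sum(b * b for b in class_mean))
    if norm_p == 0.0 or norm_m == 0.0:
        return 1.0                           # treat an all-zero pattern as maximally shifted
    return 1.0 - dot / (norm_p * norm_m)


# Toy inputs: 2 channels, 2 spatial positions each (illustrative values only).
sample_pattern = channel_contributions(
    activations=[[1.0, 3.0], [2.0, 2.0]],
    gradients=[[0.5, 0.5], [-1.0, 1.0]],
)
train_mean = class_mean_pattern([[1.0, 0.0], [3.0, 0.0]])
aligned_shift = dps_score(sample_pattern, train_mean)   # same direction as class mean
drifted_shift = dps_score([0.0, 1.0], train_mean)       # relies on a different channel
```

In this toy setup the first sample relies on the same channel as the class average and gets a score near zero, while the second relies on a different channel and gets a larger score. In practice the activations and gradients would come from hooks on a chosen layer of the trained model, and a different distance could be substituted without changing the overall idea.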