The Interaction Bottleneck of Deep Neural Networks: Discovery, Proof, and Modulation

📅 2025-12-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates how deep neural networks (DNNs) encode variable interactions under varying contextual complexity, revealing a pervasive “interaction-order bottleneck”: DNNs readily learn low- and high-order interactions but systematically suppress mid-order ones. Method: We propose a multi-order interaction quantification framework, supported by gradient-variance-based theoretical analysis and rigorous proof, to mechanistically explain the bottleneck’s origin; further, we design an interaction-order-aware loss function enabling controllable modulation of the interaction-order distribution. Contribution/Results: Experiments confirm the bottleneck’s ubiquity across CNNs, Transformers, and diverse image and NLP tasks. By tuning interaction-order distributions, models gain customizable capabilities—enhancing generalization robustness via strengthened low-order interactions or improving structural modeling via amplified high-order interactions. Our work establishes interaction order as a novel paradigm for interpreting and steering deep representations.

📝 Abstract
Understanding what kinds of cooperative structures deep neural networks (DNNs) can represent remains a fundamental yet insufficiently understood problem. In this work, we treat interactions as the fundamental units of such structure and investigate a largely unexplored question: how DNNs encode interactions under different levels of contextual complexity, and how these microscopic interaction patterns shape macroscopic representation capacity. To quantify this complexity, we use multi-order interactions [57], where each order reflects the amount of contextual information required to evaluate the joint interaction utility of a variable pair. This formulation enables a stratified analysis of cooperative patterns learned by DNNs. Building on this formulation, we develop a comprehensive study of interaction structure in DNNs. (i) We empirically discover a universal interaction bottleneck: across architectures and tasks, DNNs easily learn low-order and high-order interactions but consistently under-represent mid-order ones. (ii) We theoretically explain this bottleneck by proving that mid-order interactions incur the highest contextual variability, yielding large gradient variance and making them intrinsically difficult to learn. (iii) We further modulate the bottleneck by introducing losses that steer models toward emphasizing interactions of selected orders. Finally, we connect microscopic interaction structures with macroscopic representational behavior: low-order-emphasized models exhibit stronger generalization and robustness, whereas high-order-emphasized models demonstrate greater structural modeling and fitting capability. Together, these results uncover an inherent representational bias in modern DNNs and establish interaction order as a powerful lens for interpreting and guiding deep representations.
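The multi-order interaction measure cited above ([57]) can be estimated by Monte Carlo sampling over size-m contexts. The sketch below is a minimal illustration, not the paper's code: the function name, interface, and the reward function `v` (e.g. the model's output on an input where only the given variables are unmasked) are all assumptions.

```python
import random

def multi_order_interaction(v, i, j, others, m, n_samples=200, rng=None):
    """Monte Carlo estimate of the order-m interaction I^(m)(i, j).

    I^(m)(i,j) = E_{S ⊆ N\\{i,j}, |S|=m}[ v(S∪{i,j}) − v(S∪{i}) − v(S∪{j}) + v(S) ]

    v      -- reward function mapping a frozenset of variables to a scalar
              (hypothetical stand-in for a masked model evaluation)
    others -- the variable set N \\ {i, j}
    m      -- context size, i.e. the interaction order
    """
    rng = rng or random.Random(0)
    others = list(others)
    total = 0.0
    for _ in range(n_samples):
        # Sample one size-m context S, then accumulate the marginal utility Δv.
        S = frozenset(rng.sample(others, m))
        total += v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S)
    return total / n_samples
```

As a sanity check, a reward that pays 1 exactly when both `i` and `j` are present has Δv = 1 for every context, so the estimate is 1 at every order.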
Problem

Research questions and friction points this paper is trying to address.

Discovering universal interaction bottleneck in DNNs across tasks
Explaining mid-order interactions' high contextual variability theoretically
Modulating bottleneck via losses to steer interaction order emphasis
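The paper's gradient-variance proof is only summarized here, but one combinatorial intuition consistent with the stated result can be written down directly from the definition of the order-m interaction (notation following [57]; the argument below is a sketch, not the paper's proof):

```latex
I^{(m)}(i,j) \;=\; \mathbb{E}_{S \subseteq N\setminus\{i,j\},\, |S|=m}
\big[\Delta v(i,j,S)\big],
\qquad
\Delta v(i,j,S) \;=\; v(S\cup\{i,j\}) - v(S\cup\{i\}) - v(S\cup\{j\}) + v(S).
```

With $|N| = n$, the number of admissible contexts at order $m$ is $\binom{n-2}{m}$, which peaks near $m \approx (n-2)/2$: mid-order utilities are averaged over the largest and most heterogeneous family of contexts, so their per-sample marginal utilities vary the most, which is consistent with the paper's claim that mid-order interactions incur the highest contextual variability and hence the largest gradient variance.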
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-order interactions quantify contextual complexity in DNNs
Loss modulation steers models to emphasize selected interaction orders
Interaction bottleneck explains under-representation of mid-order patterns
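The interaction-order-aware loss itself is not spelled out in this summary, so the following is a hypothetical sketch of the general idea only: augment the task loss with a term that rewards (or, with the sign of `weight` flipped, suppresses) the estimated strength of interactions at selected orders. All names and the exact functional form are assumptions.

```python
import random

def order_aware_loss(task_loss, v, pairs, variables, orders,
                     weight, n_samples=50, rng=None):
    """Hypothetical interaction-order-aware objective:
        L = task_loss - weight * mean |I^(m)(i, j)|  over selected pairs/orders.

    v -- reward function on frozensets of variables (e.g. a masked model output);
         a stand-in assumption, not the paper's interface.
    """
    rng = rng or random.Random(0)
    strengths = []
    for (i, j) in pairs:
        others = [x for x in variables if x not in (i, j)]
        for m in orders:
            # Monte Carlo estimate of the order-m interaction for this pair.
            total = 0.0
            for _ in range(n_samples):
                S = frozenset(rng.sample(others, m))
                total += v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S)
            strengths.append(abs(total / n_samples))
    penalty = sum(strengths) / len(strengths)
    return task_loss - weight * penalty
```

Emphasizing low orders (small `m`) would, per the paper's results, favor generalization and robustness; emphasizing high orders would favor structural modeling and fitting capability.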
Huiqi Deng
Xi’an Jiaotong University, Xi’an, China
Qihan Ren
Shanghai Jiao Tong University
Explainable AI · Machine Learning · Computer Vision · Natural Language Processing
Zhuofan Chen
Xi’an Jiaotong University, Xi’an, China
Zhenyuan Cui
Xi’an Jiaotong University, Xi’an, China
Wen Shen
Tongji University, Shanghai, China
Peng Zhang
Xi’an Jiaotong University, Xi’an, China
Hongbin Pei
Xi'an Jiaotong University
Machine Learning · Data Mining · Graph-Structured Data · Complex Networks
Quanshi Zhang
Shanghai Jiao Tong University
Interpretable Machine Learning