The Interaction Bottleneck of Deep Neural Networks: Discovery, Proof, and Modulation

📅 2025-12-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates how deep neural networks (DNNs) encode variable interactions under varying contextual complexity, revealing a pervasive “interaction-order bottleneck”: DNNs readily learn low- and high-order interactions but systematically suppress mid-order ones. Method: We propose a multi-order interaction quantification framework, supported by gradient-variance-based theoretical analysis and rigorous proof, to mechanistically explain the bottleneck’s origin; further, we design an interaction-order-aware loss function enabling controllable modulation of the interaction-order distribution. Contribution/Results: Experiments confirm the bottleneck’s ubiquity across CNNs, Transformers, and diverse image and NLP tasks. By tuning interaction-order distributions, models gain customizable capabilities—enhancing generalization robustness via strengthened low-order interactions or improving structural modeling via amplified high-order interactions. Our work establishes interaction order as a novel paradigm for interpreting and steering deep representations.

📝 Abstract
Understanding what kinds of cooperative structures deep neural networks (DNNs) can represent remains a fundamental yet insufficiently understood problem. In this work, we treat interactions as the fundamental units of such structure and investigate a largely unexplored question: how DNNs encode interactions under different levels of contextual complexity, and how these microscopic interaction patterns shape macroscopic representation capacity. To quantify this complexity, we use multi-order interactions [57], where each order reflects the amount of contextual information required to evaluate the joint interaction utility of a variable pair. This formulation enables a stratified analysis of cooperative patterns learned by DNNs. Building on this formulation, we develop a comprehensive study of interaction structure in DNNs. (i) We empirically discover a universal interaction bottleneck: across architectures and tasks, DNNs easily learn low-order and high-order interactions but consistently under-represent mid-order ones. (ii) We theoretically explain this bottleneck by proving that mid-order interactions incur the highest contextual variability, yielding large gradient variance and making them intrinsically difficult to learn. (iii) We further modulate the bottleneck by introducing losses that steer models toward emphasizing interactions of selected orders. Finally, we connect microscopic interaction structures with macroscopic representational behavior: low-order-emphasized models exhibit stronger generalization and robustness, whereas high-order-emphasized models demonstrate greater structural modeling and fitting capability. Together, these results uncover an inherent representational bias in modern DNNs and establish interaction order as a powerful lens for interpreting and guiding deep representations.
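The multi-order interaction measure cited above ([57]) can be estimated by Monte Carlo sampling over size-m contexts. The sketch below is a minimal illustration, not the paper's code: the function name, interface, and the reward function `v` (e.g. the model's output on an input where only the given variables are unmasked) are all assumptions.

```python
import random

def multi_order_interaction(v, i, j, others, m, n_samples=200, rng=None):
    """Monte Carlo estimate of the order-m interaction I^(m)(i, j).

    I^(m)(i,j) = E_{S ⊆ N\\{i,j}, |S|=m}[ v(S∪{i,j}) − v(S∪{i}) − v(S∪{j}) + v(S) ]

    v      -- reward function mapping a frozenset of variables to a scalar
              (hypothetical stand-in for a masked model evaluation)
    others -- the variable set N \\ {i, j}
    m      -- context size, i.e. the interaction order
    """
    rng = rng or random.Random(0)
    others = list(others)
    total = 0.0
    for _ in range(n_samples):
        # Sample one size-m context S, then accumulate the marginal utility Δv.
        S = frozenset(rng.sample(others, m))
        total += v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S)
    return total / n_samples
```

As a sanity check, a reward that pays 1 exactly when both `i` and `j` are present has Δv = 1 for every context, so the estimate is 1 at every order.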
Problem

Research questions and friction points this paper is trying to address.

Discovering universal interaction bottleneck in DNNs across tasks
Explaining mid-order interactions' high contextual variability theoretically
Modulating bottleneck via losses to steer interaction order emphasis
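The paper's gradient-variance proof is only summarized here, but one combinatorial intuition consistent with the stated result can be written down directly from the definition of the order-m interaction (notation following [57]; the argument below is a sketch, not the paper's proof):

```latex
I^{(m)}(i,j) \;=\; \mathbb{E}_{S \subseteq N\setminus\{i,j\},\, |S|=m}
\big[\Delta v(i,j,S)\big],
\qquad
\Delta v(i,j,S) \;=\; v(S\cup\{i,j\}) - v(S\cup\{i\}) - v(S\cup\{j\}) + v(S).
```

With $|N| = n$, the number of admissible contexts at order $m$ is $\binom{n-2}{m}$, which peaks near $m \approx (n-2)/2$: mid-order utilities are averaged over the largest and most heterogeneous family of contexts, so their per-sample marginal utilities vary the most, which is consistent with the paper's claim that mid-order interactions incur the highest contextual variability and hence the largest gradient variance.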
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-order interactions quantify contextual complexity in DNNs
Loss modulation steers models to emphasize selected interaction orders
Interaction bottleneck explains under-representation of mid-order patterns
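The interaction-order-aware loss itself is not spelled out in this summary, so the following is a hypothetical sketch of the general idea only: augment the task loss with a term that rewards (or, with the sign of `weight` flipped, suppresses) the estimated strength of interactions at selected orders. All names and the exact functional form are assumptions.

```python
import random

def order_aware_loss(task_loss, v, pairs, variables, orders,
                     weight, n_samples=50, rng=None):
    """Hypothetical interaction-order-aware objective:
        L = task_loss - weight * mean |I^(m)(i, j)|  over selected pairs/orders.

    v -- reward function on frozensets of variables (e.g. a masked model output);
         a stand-in assumption, not the paper's interface.
    """
    rng = rng or random.Random(0)
    strengths = []
    for (i, j) in pairs:
        others = [x for x in variables if x not in (i, j)]
        for m in orders:
            # Monte Carlo estimate of the order-m interaction for this pair.
            total = 0.0
            for _ in range(n_samples):
                S = frozenset(rng.sample(others, m))
                total += v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S)
            strengths.append(abs(total / n_samples))
    penalty = sum(strengths) / len(strengths)
    return task_loss - weight * penalty
```

Emphasizing low orders (small `m`) would, per the paper's results, favor generalization and robustness; emphasizing high orders would favor structural modeling and fitting capability.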
Huiqi Deng
Xi’an Jiaotong University, Xi’an, China
Qihan Ren
Shanghai Jiao Tong University
Explainable AI · Machine Learning · Computer Vision · Natural Language Processing
Zhuofan Chen
Xi’an Jiaotong University, Xi’an, China
Zhenyuan Cui
Xi’an Jiaotong University, Xi’an, China
Wen Shen
Tongji University, Shanghai, China
Peng Zhang
Xi’an Jiaotong University, Xi’an, China
Hongbin Pei
Xi'an Jiaotong University
Machine Learning · Data Mining · Graph-Structured Data · Complex Networks
Quanshi Zhang
Shanghai Jiao Tong University
Interpretable Machine Learning