🤖 AI Summary
Graph Neural Networks (GNNs) suffer from over-squashing—severe distortion or loss of multi-hop neighborhood information during aggregation—largely due to heuristic choices of hidden dimensionality and propagation depth.
Method: This paper introduces the first information-theoretic framework modeling spectral GNNs as bandwidth-limited communication channels, formalizing joint width-and-depth optimization as a nonlinear programming problem constrained by channel capacity. It derives channel capacity theoretically and estimates it differentiably to quantify information compression, enabling adaptive architectural configuration.
Contribution/Results: The framework reveals a fundamental trade-off between information compression rate and representation capability. Evaluated on nine standard graph learning benchmarks, the estimated architecture parameters effectively mitigate over-squashing, yielding average improvements of 2.1–4.7 percentage points in node and graph classification accuracy.
📝 Abstract
Existing graph neural networks typically rely on heuristic choices for hidden dimensions and propagation depths, which often lead to severe information loss during propagation, known as over-squashing. To address this issue, we propose Channel Capacity Constrained Estimation (C3E), a novel framework that formulates the selection of hidden dimensions and depth as a nonlinear programming problem grounded in information theory. By modeling spectral graph neural networks as communication channels, our approach directly connects channel capacity to hidden dimensions, propagation depth, propagation mechanism, and graph structure. Extensive experiments on nine public datasets demonstrate that the hidden dimensions and depths estimated by C3E mitigate over-squashing and consistently improve representation learning. Experimental results show that over-squashing arises from the cumulative compression of information in representation matrices. Furthermore, our findings show that increasing hidden dimensions indeed mitigates information compression, whereas the role of propagation depth is more nuanced, uncovering a fundamental balance between information compression and representation complexity.
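To make the width-versus-depth trade-off concrete, here is a minimal toy sketch (not the paper's actual C3E formulation) of capacity-constrained architecture selection. It assumes a simplified proxy in which each width-`d` layer behaves like a Gaussian channel with capacity `(d/2)·log2(1 + snr)`, and each additional propagation hop compresses the surviving information by a factor `rho < 1`; `snr`, `rho`, and the search ranges are all illustrative assumptions.

```python
import math

def capacity_proxy(width, depth, snr=4.0, rho=0.9):
    """Toy capacity proxy (an assumption, not the paper's derived formula):
    per-layer Gaussian-channel capacity, attenuated per propagation hop."""
    per_layer = 0.5 * width * math.log2(1.0 + snr)
    return per_layer * (rho ** depth)

def min_width_for_depth(depth, info_demand, widths=range(8, 513, 8),
                        snr=4.0, rho=0.9):
    """Smallest hidden width whose capacity proxy covers the information
    demand at a given propagation depth (brute-force over a small grid,
    standing in for the paper's nonlinear program)."""
    for w in widths:
        if capacity_proxy(w, depth, snr, rho) >= info_demand:
            return w
    return None  # infeasible within the search range

# Deeper propagation compresses more, so meeting the same information
# demand requires a wider hidden dimension:
print(min_width_for_depth(1, 50.0))  # shallow: narrower width suffices
print(min_width_for_depth(4, 50.0))  # deeper: wider width required
```

Even this crude proxy reproduces the qualitative finding above: widening the hidden dimension offsets compression, while extra depth tightens the capacity constraint.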