🤖 AI Summary
Deep graph neural networks (GNNs) suffer from *over-squashing* in deep message passing: information from large receptive fields is forced into fixed-dimensional node representations, creating an information bottleneck that impairs modeling of long-range dependencies. This work reframes over-squashing through the lens of *memory capacity*, proposing gLSTM, a novel message-passing architecture inspired by sequence modeling. gLSTM integrates associative memory, fast weight programming, and xLSTM-style gating to increase both the information density and retrievability of node representations. The authors introduce synthetic capacity benchmark tasks to quantitatively evaluate memory capability and validate gLSTM on a range of real-world graph datasets. Results show strong performance relative to existing GNNs at comparable computational cost, particularly on tasks requiring long-range dependency modeling.
📝 Abstract
Graph Neural Networks (GNNs) leverage the graph structure to transmit information between nodes, typically through the message-passing mechanism. While these models have found a wide variety of applications, they are known to suffer from over-squashing, where information from a large receptive field of node representations is collapsed into a single fixed-size vector, resulting in an information bottleneck. In this paper, we re-examine the over-squashing phenomenon through the lens of model storage and retrieval capacity, which we define as the amount of information that can be stored in a node's representation for later use. We study some of the limitations of existing tasks used to measure over-squashing and introduce a new synthetic task to demonstrate that an information bottleneck can saturate this capacity. Furthermore, we adapt ideas from the sequence modeling literature on associative memories, fast weight programmers, and the xLSTM model to develop a novel GNN architecture with improved capacity. We demonstrate strong performance of this architecture both on our capacity synthetic task, as well as a range of real-world graph benchmarks.
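To make the associative-memory idea concrete, the following is a minimal sketch of a fast-weight-style node update: instead of squashing all incoming messages into a fixed-size vector, each node keeps a matrix-valued memory written to with gated outer products (in the spirit of fast weight programmers and xLSTM gating) and read out associatively with a query. All names and the exact parameterization (`W_k`, `W_v`, `W_q`, `w_i`, `w_f`) are illustrative assumptions, not the paper's actual gLSTM equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fast_weight_node_update(M, messages, W_k, W_v, w_i, w_f):
    """Sketch of a gated associative-memory write for one node.

    M        : (d, d) matrix-valued node memory
    messages : (n, d) features arriving from n neighbors
    W_k, W_v : (d, d) key/value projections (hypothetical names)
    w_i, w_f : (d,) input/forget gate weights (hypothetical names)
    """
    for m in messages:
        k = W_k @ m                      # key under which to store the message
        v = W_v @ m                      # value to store
        i = sigmoid(w_i @ m)             # input gate: how much to write
        f = sigmoid(w_f @ m)             # forget gate: how much to retain
        M = f * M + i * np.outer(v, k)   # gated outer-product write
    return M

def associative_readout(M, query, W_q):
    """Retrieve from the matrix memory with a projected query."""
    return M @ (W_q @ query)
```

The key difference from standard message passing is the state size: a `(d, d)` memory can store up to `d` roughly orthogonal key-value pairs, whereas a `d`-dimensional vector saturates much sooner, which is the capacity limitation the paper associates with over-squashing.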