🤖 AI Summary
In graph neural networks (GNNs), conventional message-passing and global attention mechanisms often conflate node-specific features with neighborhood- or graph-level contextual information due to simplistic aggregation (e.g., summation), leading to representational degradation—such as information attenuation and diminished discriminability—under deep stacking. To address this, we propose a differential encoding mechanism that explicitly models the discrepancy between a node’s representation and its aggregated neighborhood (or global graph) representation as an independent signal. This differential signal is then adaptively fused into the node update via learnable weights, effectively decoupling self-information from contextual information and mitigating information loss. Our method seamlessly integrates into both message-passing and global attention frameworks and supports end-to-end training. Evaluated on seven benchmark datasets, it achieves state-of-the-art performance on both node classification and graph classification tasks, demonstrating substantial gains in representation robustness and discriminability.
📝 Abstract
Combining the message-passing paradigm with the global attention mechanism has emerged as an effective framework for learning over graphs. Both fundamentally generate node embeddings from information aggregated over a node's local neighbourhood or over the whole graph. The most basic and commonly used aggregation is to sum that information. However, after summation it is unknown whether the dominant information comes from the node itself or from its neighbours (or the rest of the graph nodes). Information is therefore lost at each layer of embedding generation, and this loss can accumulate and become more serious as more layers are used. In this paper, we present a differential encoding method to address this information loss. The idea is to encode the differential representation between the information from a node's neighbours (or the rest of the graph nodes) and that from the node itself. The resulting differential encoding is then combined with the original aggregated local or global representation to generate the updated node embedding. Integrating differential encodings improves the representational ability of the generated node embeddings. The method is empirically evaluated on different graph tasks across seven benchmark datasets. The results show that it is a general method that improves both the message-passing update and the global attention update, advancing the state-of-the-art performance for graph representation learning on these datasets.
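The idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not give the exact formula, so the sketch assumes the differential is the summed neighbour information minus the node's own features, and that it is added back to the aggregate with a fixed scalar weight. The names `sum_aggregate`, `differential_update`, and `alpha` are hypothetical, and a real model would presumably use learned transformations rather than a fixed scalar.

```python
from typing import Dict, List

Vector = List[float]


def sum_aggregate(features: Dict[int, Vector],
                  neighbors: Dict[int, List[int]],
                  node: int) -> Vector:
    """Plain sum aggregation over the feature vectors of a node's neighbours."""
    dim = len(features[node])
    total = [0.0] * dim
    for j in neighbors[node]:
        for k in range(dim):
            total[k] += features[j][k]
    return total


def differential_update(features: Dict[int, Vector],
                        neighbors: Dict[int, List[int]],
                        alpha: float = 1.0) -> Dict[int, Vector]:
    """One illustrative update step with a differential term.

    Assumption (not taken from the paper): the differential is
    `aggregate - self`, and it is combined with the aggregate by
    adding `alpha * differential`.
    """
    updated: Dict[int, Vector] = {}
    for node, h_self in features.items():
        agg = sum_aggregate(features, neighbors, node)
        # Differential between neighbour information and the node's own information.
        diff = [a - s for a, s in zip(agg, h_self)]
        # Combine the differential with the original aggregated representation.
        updated[node] = [a + alpha * d for a, d in zip(agg, diff)]
    return updated


# Toy graph: node 0 is connected to nodes 1 and 2.
features = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [3.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}

plain = {n: sum_aggregate(features, neighbors, n) for n in features}
updated = differential_update(features, neighbors, alpha=0.5)
```

In this toy graph, nodes 1 and 2 share the same single neighbour, so plain sum aggregation gives them identical vectors (`plain[1] == plain[2]`). Their own features differ, so the differential term separates them again in `updated`. This is the kind of self-versus-context information that the abstract says is lost under summation.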