Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention

📅 2024-10-09

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Graph learning methods that model graphs hierarchically often rely on a fixed graph coarsening routine, which yields overly homogeneous cluster representations and loses node-level information. This paper instead treats the graph as a network of interconnected node sets and uses node clusters as tokens, without compressing each cluster into a single embedding. Its core component is the Node-to-Cluster Attention (N2C-Attn) mechanism, which incorporates Multiple Kernel Learning into the kernelized attention framework to capture information at both the node and cluster levels. An efficient form of N2C-Attn, built on cluster-wise message passing, achieves linear time complexity. The resulting architecture, Cluster-wise Graph Transformer (Cluster-GT), shows superior performance on various graph-level tasks. The implementation is publicly available.
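The linear-complexity claim rests on a general property of kernelized attention: when the attention score factors as a product of feature maps, the key-value summary can be computed once and reused for every query. The following is a minimal NumPy sketch of that generic reordering, not the paper's code; the `elu(x) + 1` feature map and all function names are assumptions chosen for illustration.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a positive feature map (assumed here; the paper may use another).
    return np.where(x > 0, x + 1.0, np.exp(x))

def quadratic_attention(q, k, v):
    # Builds the full (n_q, n_k) score matrix: cost grows with n_q * n_k.
    scores = feature_map(q) @ feature_map(k).T
    return (scores @ v) / scores.sum(axis=1, keepdims=True)

def linear_attention(q, k, v):
    # Same result, but keys and values are summarized once: cost grows with n_q + n_k.
    phi_q, phi_k = feature_map(q), feature_map(k)
    kv = phi_k.T @ v        # (d, d_v) summary of all key-value pairs
    z = phi_k.sum(axis=0)   # (d,) summary used for normalization
    return (phi_q @ kv) / (phi_q @ z)[:, None]

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(5, 4)), rng.normal(size=(6, 4)), rng.normal(size=(6, 3))
same = np.allclose(quadratic_attention(q, k, v), linear_attention(q, k, v))
```

Both functions compute the same normalized output; only the order of the matrix products differs.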

📝 Abstract

In the realm of graph learning, there is a category of methods that conceptualize graphs as hierarchical structures, utilizing node clustering to capture broader structural information. While generally effective, these methods often rely on a fixed graph coarsening routine, leading to overly homogeneous cluster representations and loss of node-level information. In this paper, we envision the graph as a network of interconnected node sets without compressing each cluster into a single embedding. To enable effective information transfer among these node sets, we propose the Node-to-Cluster Attention (N2C-Attn) mechanism. N2C-Attn incorporates techniques from Multiple Kernel Learning into the kernelized attention framework, effectively capturing information at both node and cluster levels. We then devise an efficient form for N2C-Attn using the cluster-wise message-passing framework, achieving linear time complexity. We further analyze how N2C-Attn combines bi-level feature maps of queries and keys, demonstrating its capability to merge dual-granularity information. The resulting architecture, Cluster-wise Graph Transformer (Cluster-GT), which uses node clusters as tokens and employs our proposed N2C-Attn module, shows superior performance on various graph-level tasks. Code is available at https://github.com/LUMIA-Group/Cluster-wise-Graph-Transformer.
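One way to picture how a cluster-level kernel and a node-level kernel might be combined is a product kernel, where cluster i attends to every node t in a neighboring cluster j with weight A[i, j] * κ_C(Q_i, K_j) * κ_N(q_i, k_t). The sketch below is an illustration under that assumption only: the product form, the feature map, the hard cluster assignment, the dense cluster adjacency `A`, and every name are hypothetical and are not taken from the authors' repository. It compares a naive loop against a version that first aggregates node-level key-value summaries inside each cluster and then passes them between clusters.

```python
import numpy as np

def phi(x):
    # elu(x) + 1: assumed positive feature map for both kernels.
    return np.where(x > 0, x + 1.0, np.exp(x))

def n2c_product_naive(A, Qc, Kc, qn, kn, v, cluster_of):
    # Loops over every (query cluster, key node) pair.
    C, dv = A.shape[0], v.shape[1]
    out = np.zeros((C, dv))
    for i in range(C):
        num, den = np.zeros(dv), 0.0
        for t in range(kn.shape[0]):
            j = cluster_of[t]
            w = A[i, j] * (phi(Qc[i]) @ phi(Kc[j])) * (phi(qn[i]) @ phi(kn[t]))
            num += w * v[t]
            den += w
        out[i] = num / den
    return out

def n2c_product_fast(A, Qc, Kc, qn, kn, v, cluster_of):
    C, d, dv = A.shape[0], kn.shape[1], v.shape[1]
    phi_k = phi(kn)
    # Step 1: per-cluster summaries of node-level keys and values.
    S = np.zeros((C, d, dv))
    z = np.zeros((C, d))
    np.add.at(S, cluster_of, phi_k[:, :, None] * v[:, None, :])
    np.add.at(z, cluster_of, phi_k)
    # Step 2: cluster-level message passing weighted by the cluster kernel.
    W = A * (phi(Qc) @ phi(Kc).T)           # dense here; sparse in practice
    M = np.einsum("ij,jde->ide", W, S)
    m = W @ z
    # Step 3: apply each cluster's node-level query to the received summaries.
    phi_q = phi(qn)
    num = np.einsum("id,ide->ie", phi_q, M)
    den = np.einsum("id,id->i", phi_q, m)
    return num / den[:, None]
```

In this sketch the node-level work is a single pass over the nodes, and the remaining cost depends on the number of cluster-to-cluster connections, which is the intuition behind a cluster-wise message-passing form. The adjacency `A` must give each cluster at least one neighbor (for example, a self-loop) so that the normalizer is nonzero.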

Problem

Research questions and friction points this paper is trying to address.

Graph Learning

Node Homogenization

Detail Loss

Innovation

Methods, ideas, or system contributions that make the work stand out.

N2C-Attn

Cluster-GT

Multiple Kernel Learning