Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention

๐Ÿ“… 2024-10-09
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
In graph learning, conventional graph coarsening methods often induce node feature homogenization and loss of fine-grained structural details. To address this, we propose a hierarchical Graph Transformer architecture that treats node clusters as fundamental tokensโ€”bypassing explicit coarsening and thereby preserving information integrity. Our core innovation is the Node-to-Cluster Attention (N2C-Attn) mechanism, which integrates multi-kernel learning with kernelized attention to enable dynamic, bidirectional coupling between node-level and cluster-level representations. Additionally, we introduce cluster-level message passing and a linear-complexity self-attention design to ensure scalability. Extensive experiments on multiple graph-level benchmark tasks demonstrate significant improvements over state-of-the-art methods, achieving superior modeling expressiveness without compromising computational efficiency. The implementation is publicly available.

๐Ÿ“ Abstract
In the realm of graph learning, there is a category of methods that conceptualize graphs as hierarchical structures, utilizing node clustering to capture broader structural information. While generally effective, these methods often rely on a fixed graph coarsening routine, leading to overly homogeneous cluster representations and loss of node-level information. In this paper, we envision the graph as a network of interconnected node sets without compressing each cluster into a single embedding. To enable effective information transfer among these node sets, we propose the Node-to-Cluster Attention (N2C-Attn) mechanism. N2C-Attn incorporates techniques from Multiple Kernel Learning into the kernelized attention framework, effectively capturing information at both node and cluster levels. We then devise an efficient form for N2C-Attn using the cluster-wise message-passing framework, achieving linear time complexity. We further analyze how N2C-Attn combines bi-level feature maps of queries and keys, demonstrating its capability to merge dual-granularity information. The resulting architecture, Cluster-wise Graph Transformer (Cluster-GT), which uses node clusters as tokens and employs our proposed N2C-Attn module, shows superior performance on various graph-level tasks. Code is available at https://github.com/LUMIA-Group/Cluster-wise-Graph-Transformer.
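The abstract describes N2C-Attn as kernelized attention whose kernel combines a node-level and a cluster-level kernel (via techniques from Multiple Kernel Learning), computed in linear time by exploiting the associativity of feature-map products. The sketch below illustrates that general recipe only; the feature map (ELU+1), the convex-combination kernel, and all names are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
import numpy as np

def feature_map(x):
    # Positive feature map phi(x) = elu(x) + 1, a common choice in
    # kernelized/linear attention; the paper's exact kernels may differ.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def n2c_attention_sketch(q_node, k_node, q_clus, k_clus, v, alpha=0.5):
    """Hypothetical dual-granularity linear attention.

    Uses a convex combination of a node-level and a cluster-level kernel:
        kappa(q, k) = alpha * <phi(q_n), phi(k_n)>
                      + (1 - alpha) * <phi(q_c), phi(k_c)>,
    whose feature map is the concatenation of the scaled per-level maps.
    Shapes: q_node/k_node/q_clus/k_clus are (n, d), where each node row
    carries its cluster's query/key; v is (n, d_v).
    """
    phi_q = np.concatenate([np.sqrt(alpha) * feature_map(q_node),
                            np.sqrt(1.0 - alpha) * feature_map(q_clus)], axis=-1)
    phi_k = np.concatenate([np.sqrt(alpha) * feature_map(k_node),
                            np.sqrt(1.0 - alpha) * feature_map(k_clus)], axis=-1)
    # Associativity gives linear cost: phi(Q) @ (phi(K)^T V) instead of
    # materializing the (n, n) attention matrix (Q K^T) V.
    kv = phi_k.T @ v                    # (2d, d_v)
    z = phi_q @ phi_k.sum(axis=0)       # (n,) row-wise normalizer
    return (phi_q @ kv) / z[:, None]
```

With a sum-of-kernels combination the joint feature map is a concatenation; a product of kernels would instead correspond to a tensor product of the per-level feature maps. The paper analyzes how these bi-level maps of queries and keys merge dual-granularity information.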
Problem

Research questions and friction points this paper is trying to address.

Graph Learning
Node Homogenization
Detail Loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

N2C-Attn
Cluster-GT
Multi-kernel Learning
๐Ÿ”Ž Similar Papers
No similar papers found.
Siyuan Huang
LUMIA Lab, Shanghai Jiao Tong University; Paris Elite Institute of Technology, Shanghai Jiao Tong University
Yunchong Song
Ph.D. student, Shanghai Jiao Tong University
Jiayue Zhou
Paris Elite Institute of Technology, Shanghai Jiao Tong University
Zhouhan Lin
LUMIA Lab, Shanghai Jiao Tong University