Graph Triple Attention Network: A Decoupled Perspective

📅 2024-08-14
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing graph transformers (GTs) suffer from two key bottlenecks: (1) entanglement of multi-source information—positional, structural, and attribute features—leading to poor interpretability and inflexible design; and (2) conflation of local message passing with global self-attention, causing overfitting and weakened local feature representation. To address these issues, the authors propose DeGTA, a decoupled graph triple attention network and the first framework to orthogonally decompose self-attention into three dedicated modules—positional, structural, and attribute—while hierarchically modeling local propagation and global interaction. Its modular computation and adaptive fusion mechanism enable multi-view decoupled representation learning and dynamic local–global coordination. DeGTA achieves state-of-the-art performance on node and graph classification across multiple benchmark datasets, and ablation studies confirm that the triple decoupling significantly enhances interpretability, generalization, and architectural flexibility.

📝 Abstract
Graph Transformers (GTs) have recently achieved significant success in the graph domain by effectively capturing both long-range dependencies and graph inductive biases. However, these methods face two primary challenges: (1) multi-view chaos, which results from coupling multi-view information (positional, structural, attribute), thereby impeding flexible usage and the interpretability of the propagation process. (2) local-global chaos, which arises from coupling local message passing with global attention, leading to issues of overfitting and over-globalizing. To address these challenges, we propose a high-level decoupled perspective of GTs, breaking them down into three components and two interaction levels: positional attention, structural attention, and attribute attention, alongside local and global interaction. Based on this decoupled perspective, we design a decoupled graph triple attention network named DeGTA, which separately computes multi-view attentions and adaptively integrates multi-view local and global information. This approach offers three key advantages: enhanced interpretability, flexible design, and adaptive integration of local and global information. Through extensive experiments, DeGTA achieves state-of-the-art performance across various datasets and tasks, including node classification and graph classification. Comprehensive ablation studies demonstrate that decoupling is essential for improving performance and enhancing interpretability. Our code is available at: https://github.com/wangxiaotang0906/DeGTA
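The abstract's core idea—computing positional, structural, and attribute attention separately and then adaptively fusing the per-view attention maps—can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation; the toy encodings and the uniform fusion weights are hypothetical, and real DeGTA learns its fusion adaptively.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_scores(feats):
    """Row-softmaxed dot-product self-attention for one view."""
    return [softmax([sum(a * b for a, b in zip(qr, kr)) for kr in feats])
            for qr in feats]

def fuse(score_views, weights):
    """Adaptive fusion: weighted sum of per-view attention matrices."""
    n, m = len(score_views[0]), len(score_views[0][0])
    return [[sum(w * sv[i][j] for w, sv in zip(weights, score_views))
             for j in range(m)] for i in range(n)]

# Toy 3-node graph: each view carries its own 2-d encoding per node
# (values are illustrative, not from the paper).
pos    = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # positional encodings
struct = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # structural encodings
attr   = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]  # attribute features

# Decoupled step: one attention map per view, never mixed early.
views = [attention_scores(v) for v in (pos, struct, attr)]

# Fusion step: convex combination of the three maps (uniform here).
alpha = softmax([1.0, 1.0, 1.0])
fused = fuse(views, alpha)

# Each row of the fused attention matrix still sums to 1.
print([round(sum(row), 6) for row in fused])  # → [1.0, 1.0, 1.0]
```

Because each per-view map is row-stochastic and the fusion weights form a convex combination, the fused map remains a valid attention distribution—one reason decoupling composes cleanly with later local/global integration.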
Problem

Research questions and friction points this paper is trying to address.

Graph Transformers
Information Entanglement
Local and Global Context Mixing
Innovation

Methods, ideas, or system contributions that make the work stand out.

DeGTA Network
Multi-Attention Mechanism
Local and Global Information Integration
Xiaotang Wang
The Hong Kong University of Science and Technology (Guangzhou)
GNN · Graph Transformer · AI4Science
Yun Zhu
Zhejiang University, Hangzhou, China
Haizhou Shi
Ph.D. at Rutgers University
Bayesian Deep Learning · Continual Learning
Yongchao Liu
Ant Group, Hangzhou, China
Chuntao Hong
Ant Group, Hangzhou, China